Introduce XID age and inactive timeout based replication slot invalidation
Hi,
Replication slots in postgres prevent the removal of required
resources when there is no connection using them (i.e., when they are
inactive). This consumes storage because neither the required WAL nor
the required rows in user tables/system catalogs can be removed by
VACUUM as long as a replication slot still needs them. In extreme
cases this can cause transaction ID wraparound.
Currently postgres has the ability to invalidate inactive replication
slots based on the amount of WAL (set via the max_slot_wal_keep_size
GUC) that would be needed should the slots become active again.
However, the wraparound issue isn't effectively covered by
max_slot_wal_keep_size - one can't tell postgres to invalidate a
replication slot if it is blocking VACUUM. Also, it is often tricky to
choose a default value for max_slot_wal_keep_size, because the amount
of WAL generated and the storage allocated to a database can vary
widely.
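
For reference, with the existing mechanism one can only watch for
WAL-based invalidation, roughly along these lines (max_slot_wal_keep_size,
wal_status and safe_wal_size all exist today):

    -- existing safeguard: slots that have lost, or are at risk of losing,
    -- required WAL once max_slot_wal_keep_size is exceeded
    SHOW max_slot_wal_keep_size;

    SELECT slot_name, active, wal_status, safe_wal_size
    FROM pg_replication_slots
    WHERE wal_status IN ('unreserved', 'lost');
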
Therefore, it is often easier for developers to do one of the following:
a) set an XID age (age of the slot's xmin or catalog_xmin) of, say, 1
or 1.5 billion, after which the slots get invalidated.
b) set a timeout of, say, 1, 2, or 3 days, after which the inactive
slots get invalidated.
To implement (a), postgres needs a new GUC called max_slot_xid_age.
The checkpointer then invalidates all the slots whose xmin (the oldest
transaction that this slot needs the database to retain) or
catalog_xmin (the oldest transaction affecting the system catalogs
that this slot needs the database to retain) has reached the age
specified by this setting.
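
As a rough sketch (not part of the patch) of what that check amounts
to, the slots the checkpointer would invalidate are approximately those
returned by the following query, assuming the proposed max_slot_xid_age
GUC from 0004 is in place:

    -- slots whose xmin or catalog_xmin age exceeds the proposed limit;
    -- max_slot_xid_age is the new GUC proposed in 0004
    SELECT slot_name,
           age(xmin) AS xmin_age,
           age(catalog_xmin) AS catalog_xmin_age
    FROM pg_replication_slots
    WHERE GREATEST(age(xmin), age(catalog_xmin)) >
          current_setting('max_slot_xid_age')::int;
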
To implement (b), postgres first needs to track, in the
ReplicationSlotPersistentData structure, replication slot metrics such
as the time at which the slot became inactive (inactive_at
timestamptz) and the total number of times the slot became inactive in
its lifetime (inactive_count numeric). It then needs a new timeout GUC
called inactive_replication_slot_timeout. Whenever a slot becomes
inactive, the current timestamp and inactive count are stored in the
ReplicationSlotPersistentData structure and persisted to disk. The
checkpointer then invalidates all the slots that have been inactive
for at least inactive_replication_slot_timeout, measured from
inactive_at.
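
As an approximation of that condition (inactive_at is the column added
by 0002; the real check uses the inactive_replication_slot_timeout GUC,
shown here as a literal 1 day), the candidates are roughly:

    -- slots that have been inactive longer than a hypothetical 1-day timeout
    SELECT slot_name, inactive_at, now() - inactive_at AS inactive_for
    FROM pg_replication_slots
    WHERE inactive_at IS NOT NULL
      AND inactive_at < now() - interval '1 day';
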
In addition to enabling (b), these two new metrics let developers
improve their monitoring tools, since the metrics are exposed via the
pg_replication_slots system view. For instance, one can build a
monitoring tool that signals when a replication slot has been inactive
for a day or so using the inactive_at metric, and/or when a
replication slot is becoming inactive too frequently using the
inactive_count metric.
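
For example, a monitoring query along these lines (both columns come
from 0002; the one-day and 100-times thresholds are just placeholders)
could drive such alerts:

    -- alert on slots inactive for over a day, or cycling inactive too often
    SELECT slot_name, inactive_at, inactive_count
    FROM pg_replication_slots
    WHERE (inactive_at IS NOT NULL AND inactive_at < now() - interval '1 day')
       OR inactive_count > 100;
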
I’m attaching the v1 patch set as described below:
0001 - Tracks invalidation_reason in pg_replication_slots. This is
needed because slots now have multiple reasons for slot invalidation.
0002 - Tracks inactive replication slot information: inactive_at and
inactive_count.
0003 - Adds inactive_timeout based replication slot invalidation.
0004 - Adds XID based replication slot invalidation.
Thoughts?
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v1-0001-Track-invalidation_reason-in-pg_replication_slots.patch
From 68f11db0afa7d9b2d2e083fd9ec0d578c66ae06a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 6 Jan 2024 14:18:01 +0000
Subject: [PATCH v1] Track invalidation_reason in pg_replication_slots
Currently the reason for replication slot invalidation is not
tracked in pg_replication_slots. A recent commit 007693f2a added
conflict_reason to show the reasons for slot invalidation, but
only for logical slots. This commit renames conflict_reason to
invalidation_reason, and adds the support to show invalidation
reasons for both physical and logical slots.
---
doc/src/sgml/system-views.sgml | 11 +++---
src/backend/catalog/system_views.sql | 2 +-
src/backend/replication/slotfuncs.c | 37 ++++++++-----------
src/bin/pg_upgrade/info.c | 4 +-
src/include/catalog/pg_proc.dat | 2 +-
.../t/035_standby_logical_decoding.pl | 32 ++++++++--------
src/test/regress/expected/rules.out | 4 +-
7 files changed, 44 insertions(+), 48 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 72d01fc624..104bd2fb1f 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,13 +2525,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>conflict_reason</structfield> <type>text</type>
+ <structfield>invalidation_reason</structfield> <type>text</type>
</para>
<para>
- The reason for the logical slot's conflict with recovery. It is always
- NULL for physical slots, as well as for logical slots which are not
- invalidated. The non-NULL values indicate that the slot is marked
- as invalidated. Possible values are:
+ The reason for the slot's invalidation. <literal>NULL</literal> if the
+ slot is currently actively being used. The non-NULL values indicate that
+ the slot is marked as invalidated. In case of logical slots, it
+ represents the reason for the logical slot's conflict with recovery.
+ Possible values are:
<itemizedlist spacing="compact">
<listitem>
<para>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index e43e36f5ac..7d40e9549b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.conflict_reason
+ L.invalidation_reason
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index cad35dce7f..77f7134872 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -402,28 +402,23 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
- if (slot_contents.data.database == InvalidOid)
- nulls[i++] = true;
- else
+ switch (slot_contents.data.invalidated)
{
- switch (slot_contents.data.invalidated)
- {
- case RS_INVAL_NONE:
- nulls[i++] = true;
- break;
-
- case RS_INVAL_WAL_REMOVED:
- values[i++] = CStringGetTextDatum("wal_removed");
- break;
-
- case RS_INVAL_HORIZON:
- values[i++] = CStringGetTextDatum("rows_removed");
- break;
-
- case RS_INVAL_WAL_LEVEL:
- values[i++] = CStringGetTextDatum("wal_level_insufficient");
- break;
- }
+ case RS_INVAL_NONE:
+ nulls[i++] = true;
+ break;
+
+ case RS_INVAL_WAL_REMOVED:
+ values[i++] = CStringGetTextDatum("wal_removed");
+ break;
+
+ case RS_INVAL_HORIZON:
+ values[i++] = CStringGetTextDatum("rows_removed");
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ values[i++] = CStringGetTextDatum("wal_level_insufficient");
+ break;
}
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 190dd53a42..1aae971692 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -667,13 +667,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, "
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
"temporary IS FALSE;",
live_check ? "FALSE" :
- "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
+ "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE "
"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
"END)");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 7979392776..51e0f8f264 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11117,7 +11117,7 @@
proargtypes => '',
proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text}',
proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,invalidation_reason}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 8bc39a5f03..bbef71767a 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,7 +168,7 @@ sub change_hot_standby_feedback_and_wait_for_xmins
}
}
-# Check conflict_reason in pg_replication_slots.
+# Check invalidation_reason in pg_replication_slots.
sub check_slots_conflict_reason
{
my ($slot_prefix, $reason) = @_;
@@ -178,15 +178,15 @@ sub check_slots_conflict_reason
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$active_slot';));
- is($res, "$reason", "$active_slot conflict_reason is $reason");
+ is($res, "$reason", "$active_slot invalidation_reason is $reason");
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$inactive_slot';));
- is($res, "$reason", "$inactive_slot conflict_reason is $reason");
+ is($res, "$reason", "$inactive_slot invalidation_reason is $reason");
}
# Drop the slots, re-create them, change hot_standby_feedback,
@@ -258,13 +258,13 @@ $node_primary->safe_psql('testdb',
qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
);
-# Check conflict_reason is NULL for physical slot
+# Check invalidation_reason is NULL for physical slot
$res = $node_primary->safe_psql(
'postgres', qq[
- SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+ SELECT invalidation_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
);
-is($res, 't', "Physical slot reports conflict_reason as NULL");
+is($res, 't', "Physical slot reports invalidation_reason as NULL");
my $backup_name = 'b1';
$node_primary->backup($backup_name);
@@ -481,7 +481,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
$handle =
@@ -500,7 +500,7 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
##################################################
$node_standby->restart;
-# Verify conflict_reason is retained across a restart.
+# Verify invalidation_reason is retained across a restart.
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
##################################################
@@ -509,7 +509,7 @@ check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Get the restart_lsn from an invalidated slot
my $restart_lsn = $node_standby->safe_psql('postgres',
- "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
+ "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and invalidation_reason is not null;"
);
chomp($restart_lsn);
@@ -563,7 +563,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('row_removal_', 'rows_removed');
$handle =
@@ -602,7 +602,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
check_for_invalidation('shared_row_removal_', $logstart,
'with vacuum on pg_authid');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
$handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
@@ -658,7 +658,7 @@ ok( $node_standby->poll_query_until(
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(conflicting) from
- (select conflict_reason is not NULL as conflicting
+ (select invalidation_reason is not NULL as conflicting
from pg_replication_slots WHERE slot_type = 'logical')]),
'f',
'Logical slots are reported as non conflicting');
@@ -697,7 +697,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('pruning_', 'rows_removed');
$handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
@@ -741,7 +741,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
-# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+# Verify invalidation_reason is 'wal_level_insufficient' in pg_replication_slots
check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
$handle =
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index d878a971df..7cca0fbc87 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,8 +1473,8 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.conflict_reason
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason)
+ l.invalidation_reason
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, invalidation_reason)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v1-0002-Track-inactive-replication-slot-information.patch
From 9229ef9e28694a55906e92f42e966280c1beffea Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 6 Jan 2024 14:19:05 +0000
Subject: [PATCH v1] Track inactive replication slot information
Currently postgres doesn't track metrics like the time at which
the slot became inactive, and the total number of times the slot
became inactive in its lifetime. This commit adds two new metrics
inactive_at of type timestamptz and inactive_count of type numeric
to ReplicationSlotPersistentData. Whenever a slot becomes
inactive, the current timestamp and inactive count are persisted
to disk.
These metrics are useful in the following ways:
- To improve replication slot monitoring tools. For instance, one
can build a monitoring tool that signals a) when a replication slot
has been inactive for a day or so using the inactive_at metric,
b) when a replication slot is becoming inactive too frequently
using the inactive_count metric.
- To implement timeout-based inactive replication slot management
capability in postgres.
Increases SLOT_VERSION due to the added two new metrics.
---
doc/src/sgml/system-views.sgml | 20 +++++++++++
src/backend/catalog/system_views.sql | 4 ++-
src/backend/replication/slot.c | 50 +++++++++++++++++++++++-----
src/backend/replication/slotfuncs.c | 15 ++++++++-
src/include/catalog/pg_proc.dat | 6 ++--
src/include/replication/slot.h | 6 ++++
src/test/regress/expected/rules.out | 6 ++--
7 files changed, 91 insertions(+), 16 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 104bd2fb1f..b6914a3197 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2556,6 +2556,26 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</itemizedlist>
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_at</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_count</structfield> <type>numeric</type>
+ </para>
+ <para>
+ The total number of times the slot became inactive in its lifetime.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 7d40e9549b..611682a1b5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,9 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.invalidation_reason
+ L.invalidation_reason,
+ L.inactive_at,
+ L.inactive_count
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 52da694c79..f4a884d96e 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -90,7 +90,7 @@ typedef struct ReplicationSlotOnDisk
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 3 /* version for new files */
+#define SLOT_VERSION 4 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -311,6 +311,8 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.persistency = persistency;
slot->data.two_phase = two_phase;
slot->data.two_phase_at = InvalidXLogRecPtr;
+ slot->data.inactive_at = 0;
+ slot->data.inactive_count = 0;
/* and then data only present in shared memory */
slot->just_dirtied = false;
@@ -540,6 +542,17 @@ retry:
if (am_walsender)
{
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->data.inactive_at = 0;
+ SpinLockRelease(&s->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
SlotIsLogical(s)
? errmsg("acquired logical replication slot \"%s\"",
@@ -607,16 +620,27 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
- MyReplicationSlot = NULL;
-
- /* might not have been set when we've been a plain slot */
- LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
- ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
- LWLockRelease(ProcArrayLock);
-
if (am_walsender)
{
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->data.inactive_at = GetCurrentTimestamp();
+
+ /*
+ * XXX: Can inactive_count of type uint64 ever overflow? It takes
+ * about a half-billion years for inactive_count to overflow even
+ * if slot becomes inactive for every 1 millisecond. So, using
+ * pg_add_u64_overflow might be an overkill.
+ */
+ slot->data.inactive_count++;
+ SpinLockRelease(&slot->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
is_logical
? errmsg("released logical replication slot \"%s\"",
@@ -626,6 +650,14 @@ ReplicationSlotRelease(void)
pfree(slotname);
}
+
+ MyReplicationSlot = NULL;
+
+ /* might not have been set when we've been a plain slot */
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
+ ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
+ LWLockRelease(ProcArrayLock);
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 77f7134872..89262da486 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -232,10 +232,11 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 15
+#define PG_GET_REPLICATION_SLOTS_COLS 17
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
+ char buf[256];
/*
* We don't require any special permission to see this function's data
@@ -421,6 +422,18 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
break;
}
+ if (slot_contents.data.inactive_at > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.data.inactive_at);
+ else
+ nulls[i++] = true;
+
+ /* Convert to numeric. */
+ snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count);
+ values[i++] = DirectFunctionCall3(numeric_in,
+ CStringGetDatum(buf),
+ ObjectIdGetDatum(0),
+ Int32GetDatum(-1));
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 51e0f8f264..c6995876ed 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11115,9 +11115,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,invalidation_reason}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,timestamptz,numeric}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,invalidation_reason,inactive_at,inactive_count}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 9e39aaf303..dfd2f82a67 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -111,6 +111,12 @@ typedef struct ReplicationSlotPersistentData
/* plugin name */
NameData plugin;
+
+ /* When did this slot become inactive last time? */
+ TimestampTz inactive_at;
+
+ /* How many times the slot has been inactive? */
+ uint64 inactive_count;
} ReplicationSlotPersistentData;
/*
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 7cca0fbc87..16807eea46 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,8 +1473,10 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.invalidation_reason
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, invalidation_reason)
+ l.invalidation_reason,
+ l.inactive_at,
+ l.inactive_count
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, invalidation_reason, inactive_at, inactive_count)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v1-0003-Add-inactive_timeout-based-replication-slot-inval.patch
From 53ffd09c7a3b339c7dd242a2d57cb94c02c90b43 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 6 Jan 2024 14:44:23 +0000
Subject: [PATCH v1] Add inactive_timeout based replication slot invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set a timeout of, say,
1, 2, or 3 days, after which the inactive slots get dropped.
To achieve the above, postgres uses replication slot metric
inactive_at (the time at which the slot became inactive), and a
new GUC inactive_replication_slot_timeout. The checkpointer then
looks at all replication slots invalidating the inactive slots
based on the timeout set.
---
doc/src/sgml/config.sgml | 18 ++++
src/backend/access/transam/xlog.c | 10 +++
src/backend/replication/slot.c | 24 ++++-
src/backend/replication/slotfuncs.c | 3 +
src/backend/utils/misc/guc_tables.c | 12 +++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 87 +++++++++++++++++++
9 files changed, 156 insertions(+), 3 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f323bba018..4293b3c182 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4404,6 +4404,24 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-inactive-replication-slot-timeout" xreflabel="inactive_replication_slot_timeout">
+ <term><varname>inactive_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>inactive_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time at the next checkpoint. If this value is specified
+ without units, it is taken as seconds. A value of zero (which is
+ default) disables the timeout mechanism. This parameter can only be
+ set in the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 478377c4a2..f7ce2cbbb4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7051,6 +7051,11 @@ CreateCheckPoint(int flags)
if (PriorRedoPtr != InvalidXLogRecPtr)
UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7495,6 +7500,11 @@ CreateRestartPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index f4a884d96e..d921ac051f 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -98,9 +98,9 @@ ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
/* My backend's replication slot in the shared memory array */
ReplicationSlot *MyReplicationSlot = NULL;
-/* GUC variable */
-int max_replication_slots = 10; /* the maximum number of replication
- * slots */
+/* GUC variables */
+int max_replication_slots = 10;
+int inactive_replication_slot_timeout = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropAcquired(void);
@@ -1346,6 +1346,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by inactive_replication_slot_timeout."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1444,6 +1447,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
conflict = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (s->data.inactive_at > 0)
+ {
+ TimestampTz now;
+
+ Assert(s->data.persistency == RS_PERSISTENT);
+ Assert(s->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(s->data.inactive_at, now,
+ inactive_replication_slot_timeout * 1000))
+ conflict = cause;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1589,6 +1606,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 89262da486..e094225764 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -420,6 +420,9 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
case RS_INVAL_WAL_LEVEL:
values[i++] = CStringGetTextDatum("wal_level_insufficient");
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ values[i++] = CStringGetTextDatum("inactive_timeout");
+ break;
}
if (slot_contents.data.inactive_at > 0)
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index e53ebc6dc2..c7fa14ed6b 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2892,6 +2892,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"inactive_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &inactive_replication_slot_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index b2809c711a..7984873533 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -325,6 +325,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#inactive_replication_slot_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index dfd2f82a67..ace946de62 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -50,6 +50,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
/*
@@ -216,6 +218,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
+extern PGDLLIMPORT int inactive_replication_slot_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 88fb0306f5..22e5e2e45c 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -45,6 +45,7 @@ tests += {
't/037_invalid_database.pl',
't/038_save_logical_slots_shutdown.pl',
't/039_end_of_wal.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..bf1cd4bbcc
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,87 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Initialize primary node, setting wal-segsize to 1MB
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 1, extra => ['--wal-segsize=1']);
+$primary->append_conf('postgresql.conf', q{
+checkpoint_timeout = 1h
+});
+$primary->start;
+$primary->safe_psql('postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb1_slot');
+]);
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name,
+ has_streaming => 1);
+$standby1->append_conf('postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+});
+$standby1->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# The inactive replication slot info should be null when the slot is active
+my $result = $primary->safe_psql('postgres', qq[
+ SELECT inactive_at IS NULL, inactive_count = 0 AS OK
+ FROM pg_replication_slots WHERE slot_name = 'sb1_slot';
+]);
+is($result, "t|t", 'check the inactive replication slot info for an active slot');
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql('postgres', qq[
+ ALTER SYSTEM SET inactive_replication_slot_timeout TO '1s';
+]);
+$primary->reload;
+
+my $logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby1->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until('postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_at IS NOT NULL AND
+ inactive_count = 1 AND slot_name = 'sb1_slot';
+]) or die "Timed out while waiting for inactive replication slot info to be updated";
+
+my $invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb1_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb1_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until('postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'inactive_timeout';
+]) or die "Timed out while waiting for inactive replication slot sb1_slot to be invalidated";
+
+done_testing();
--
2.34.1
v1-0004-Add-XID-based-replication-slot-invalidation.patch
From 2d98e0f46e502f530bdf644c23f8fa2c2983ca12 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 6 Jan 2024 14:45:13 +0000
Subject: [PATCH v1] Add XID based replication slot invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set an XID age (age of
the slot's xmin or catalog_xmin) of, say, 1 or 1.5 billion, after
which the slots get invalidated.
To achieve the above, postgres uses replication slot xmin (the
oldest transaction that this slot needs the database to retain) or
catalog_xmin (the oldest transaction affecting the system catalogs
that this slot needs the database to retain), and a new GUC
max_slot_xid_age. The checkpointer then looks at all replication
slots invalidating the slots based on the age set.
---
doc/src/sgml/config.sgml | 21 +++++
src/backend/access/transam/xlog.c | 10 +++
src/backend/replication/slot.c | 41 ++++++++++
src/backend/replication/slotfuncs.c | 3 +
src/backend/utils/misc/guc_tables.c | 10 +++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 81 +++++++++++++++++++
8 files changed, 170 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 4293b3c182..f0b3a3bf2b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4422,6 +4422,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age">
+ <term><varname>max_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (which is default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f7ce2cbbb4..a69099247a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7056,6 +7056,11 @@ CreateCheckPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7505,6 +7510,11 @@ CreateRestartPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d921ac051f..cffd84c23b 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -101,6 +101,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10;
int inactive_replication_slot_timeout = 0;
+int max_slot_xid_age = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropAcquired(void);
@@ -1349,6 +1350,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_INACTIVE_TIMEOUT:
appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by inactive_replication_slot_timeout."));
break;
+ case RS_INVAL_XID_AGE:
+ appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1461,6 +1465,42 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
conflict = cause;
}
break;
+ case RS_INVAL_XID_AGE:
+ {
+ TransactionId xid_cur = ReadNextTransactionId();
+ TransactionId xid_limit;
+ TransactionId xid_slot;
+
+ if (TransactionIdIsNormal(s->data.xmin))
+ {
+ xid_slot = s->data.xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ if (TransactionIdIsNormal(s->data.catalog_xmin))
+ {
+ xid_slot = s->data.catalog_xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1607,6 +1647,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index e094225764..4b56f11b57 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -423,6 +423,9 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
case RS_INVAL_INACTIVE_TIMEOUT:
values[i++] = CStringGetTextDatum("inactive_timeout");
break;
+ case RS_INVAL_XID_AGE:
+ values[i++] = CStringGetTextDatum("xid_aged");
+ break;
}
if (slot_contents.data.inactive_at > 0)
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index c7fa14ed6b..ce79436b4d 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2904,6 +2904,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &max_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 7984873533..2f3b777b5c 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -326,6 +326,7 @@
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
#inactive_replication_slot_timeout = 0 # in seconds; 0 disables
+#max_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index ace946de62..ad7e32678b 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -52,6 +52,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* inactive slot timeout has occurred */
RS_INVAL_INACTIVE_TIMEOUT,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
/*
@@ -219,6 +221,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT int inactive_replication_slot_timeout;
+extern PGDLLIMPORT int max_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index bf1cd4bbcc..e7da98412c 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -84,4 +84,85 @@ $primary->poll_query_until('postgres', qq[
invalidation_reason = 'inactive_timeout';
]) or die "Timed out while waiting for inactive replication slot sb1_slot to be invalidated";
+$primary->safe_psql('postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb2_slot');
+]);
+
+$primary->safe_psql('postgres', qq[
+ ALTER SYSTEM SET inactive_replication_slot_timeout TO 0;
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name,
+ has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby2->append_conf('postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+$standby2->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+$primary->poll_query_until('postgres', qq[
+ SELECT xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb2_slot';
+]) or die "Timed out waiting for slot xmin to advance";
+
+$primary->safe_psql('postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop standby to make the replication slot's xmin on primary to age
+$standby2->stop;
+
+# Do some work to advance xmin
+$primary->safe_psql(
+ 'postgres', q{
+do $$
+begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into tab_int values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+end$$;
+});
+
+$invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb2_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb2_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until('postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb2_slot' AND
+ invalidation_reason = 'xid_aged';
+]) or die "Timed out while waiting for replication slot sb2_slot to be invalidated";
+
done_testing();
--
2.34.1
On Thu, Jan 11, 2024 at 10:48 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Needed a rebase due to c393308b. Please find the attached v2 patch set.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v2-0001-Track-invalidation_reason-in-pg_replication_slots.patch
From 26c5b7762abc8fd92df0376ec68c81d7064891fb Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 26 Jan 2024 18:11:01 +0000
Subject: [PATCH v2] Track invalidation_reason in pg_replication_slots
Currently the reason for replication slot invalidation is not
tracked in pg_replication_slots. A recent commit 007693f2a added
conflict_reason to show the reasons for slot invalidation, but
only for logical slots. This commit renames conflict_reason to
invalidation_reason, and adds the support to show invalidation
reasons for both physical and logical slots.
---
doc/src/sgml/system-views.sgml | 11 +++---
src/backend/catalog/system_views.sql | 2 +-
src/backend/replication/slotfuncs.c | 37 ++++++++-----------
src/bin/pg_upgrade/info.c | 4 +-
src/include/catalog/pg_proc.dat | 2 +-
.../t/035_standby_logical_decoding.pl | 32 ++++++++--------
src/test/regress/expected/rules.out | 4 +-
7 files changed, 44 insertions(+), 48 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index dd468b31ea..c61312793c 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,13 +2525,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>conflict_reason</structfield> <type>text</type>
+ <structfield>invalidation_reason</structfield> <type>text</type>
</para>
<para>
- The reason for the logical slot's conflict with recovery. It is always
- NULL for physical slots, as well as for logical slots which are not
- invalidated. The non-NULL values indicate that the slot is marked
- as invalidated. Possible values are:
+ The reason for the slot's invalidation. <literal>NULL</literal> if the
+ slot is currently actively being used. The non-NULL values indicate that
+ the slot is marked as invalidated. In case of logical slots, it
+ represents the reason for the logical slot's conflict with recovery.
+ Possible values are:
<itemizedlist spacing="compact">
<listitem>
<para>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index c62aa0074a..d78077b936 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.conflict_reason,
+ L.invalidation_reason,
L.failover
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index eb685089b3..e53aeb37c9 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -407,28 +407,23 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
- if (slot_contents.data.database == InvalidOid)
- nulls[i++] = true;
- else
+ switch (slot_contents.data.invalidated)
{
- switch (slot_contents.data.invalidated)
- {
- case RS_INVAL_NONE:
- nulls[i++] = true;
- break;
-
- case RS_INVAL_WAL_REMOVED:
- values[i++] = CStringGetTextDatum("wal_removed");
- break;
-
- case RS_INVAL_HORIZON:
- values[i++] = CStringGetTextDatum("rows_removed");
- break;
-
- case RS_INVAL_WAL_LEVEL:
- values[i++] = CStringGetTextDatum("wal_level_insufficient");
- break;
- }
+ case RS_INVAL_NONE:
+ nulls[i++] = true;
+ break;
+
+ case RS_INVAL_WAL_REMOVED:
+ values[i++] = CStringGetTextDatum("wal_removed");
+ break;
+
+ case RS_INVAL_HORIZON:
+ values[i++] = CStringGetTextDatum("rows_removed");
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ values[i++] = CStringGetTextDatum("wal_level_insufficient");
+ break;
}
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 183c2f84eb..9683c91d4a 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -667,13 +667,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
"temporary IS FALSE;",
live_check ? "FALSE" :
- "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
+ "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE "
"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
"END)");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 29af4ce65d..de1115baa0 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11129,7 +11129,7 @@
proargtypes => '',
proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool}',
proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,invalidation_reason,failover}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index cebfa52d0f..f2c58a8a06 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,7 +168,7 @@ sub change_hot_standby_feedback_and_wait_for_xmins
}
}
-# Check conflict_reason in pg_replication_slots.
+# Check invalidation_reason in pg_replication_slots.
sub check_slots_conflict_reason
{
my ($slot_prefix, $reason) = @_;
@@ -178,15 +178,15 @@ sub check_slots_conflict_reason
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$active_slot';));
- is($res, "$reason", "$active_slot conflict_reason is $reason");
+ is($res, "$reason", "$active_slot invalidation_reason is $reason");
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$inactive_slot';));
- is($res, "$reason", "$inactive_slot conflict_reason is $reason");
+ is($res, "$reason", "$inactive_slot invalidation_reason is $reason");
}
# Drop the slots, re-create them, change hot_standby_feedback,
@@ -293,13 +293,13 @@ $node_primary->safe_psql('testdb',
qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
);
-# Check conflict_reason is NULL for physical slot
+# Check invalidation_reason is NULL for physical slot
$res = $node_primary->safe_psql(
'postgres', qq[
- SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+ SELECT invalidation_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
);
-is($res, 't', "Physical slot reports conflict_reason as NULL");
+is($res, 't', "Physical slot reports invalidation_reason as NULL");
my $backup_name = 'b1';
$node_primary->backup($backup_name);
@@ -512,7 +512,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
$handle =
@@ -531,7 +531,7 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
##################################################
$node_standby->restart;
-# Verify conflict_reason is retained across a restart.
+# Verify invalidation_reason is retained across a restart.
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
##################################################
@@ -540,7 +540,7 @@ check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Get the restart_lsn from an invalidated slot
my $restart_lsn = $node_standby->safe_psql('postgres',
- "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
+ "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and invalidation_reason is not null;"
);
chomp($restart_lsn);
@@ -591,7 +591,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('row_removal_', 'rows_removed');
$handle =
@@ -627,7 +627,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
check_for_invalidation('shared_row_removal_', $logstart,
'with vacuum on pg_authid');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
$handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
@@ -680,7 +680,7 @@ ok( $node_standby->poll_query_until(
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(conflicting) from
- (select conflict_reason is not NULL as conflicting
+ (select invalidation_reason is not NULL as conflicting
from pg_replication_slots WHERE slot_type = 'logical')]),
'f',
'Logical slots are reported as non conflicting');
@@ -719,7 +719,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('pruning_', 'rows_removed');
$handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
@@ -763,7 +763,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
-# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+# Verify invalidation_reason is 'wal_level_insufficient' in pg_replication_slots
check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
$handle =
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index abc944e8b8..022f9bccb0 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,9 +1473,9 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.conflict_reason,
+ l.invalidation_reason,
l.failover
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, invalidation_reason, failover)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
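Not part of the patch, but to make the 0001 change concrete, here is a
minimal query sketch (assuming 0001 is applied) that lists slots which
have been invalidated for any reason using the renamed column:

    -- list invalidated slots together with the new invalidation_reason column
    SELECT slot_name, slot_type, invalidation_reason
    FROM pg_replication_slots
    WHERE invalidation_reason IS NOT NULL;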
v2-0002-Track-inactive-replication-slot-information.patch (application/x-patch)
From b58f5afd7e2d863445b4ecf9fbd750ac0b2606cb Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 26 Jan 2024 18:20:17 +0000
Subject: [PATCH v2] Track inactive replication slot information
Currently postgres doesn't track metrics like the time at which
the slot became inactive, and the total number of times the slot
became inactive in its lifetime. This commit adds two new metrics
inactive_at of type timestamptz and inactive_count of type numeric
to ReplicationSlotPersistentData. Whenever a slot becomes
inactive, the current timestamp and inactive count are persisted
to disk.
These metrics are useful in the following ways:
- To improve replication slot monitoring tools. For instance, one
can build a monitoring tool that signals a) when a replication slot
has been lying inactive for a day or so using the inactive_at metric,
b) when a replication slot is becoming inactive too frequently
using the inactive_count metric.
- To implement timeout-based inactive replication slot management
capability in postgres.
Increases SLOT_VERSION due to the two newly added metrics.
---
doc/src/sgml/system-views.sgml | 20 +++++++++++
src/backend/catalog/system_views.sql | 4 ++-
src/backend/replication/slot.c | 50 +++++++++++++++++++++++-----
src/backend/replication/slotfuncs.c | 15 ++++++++-
src/include/catalog/pg_proc.dat | 6 ++--
src/include/replication/slot.h | 6 ++++
src/test/regress/expected/rules.out | 6 ++--
7 files changed, 91 insertions(+), 16 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index c61312793c..75f99f4ca0 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,6 +2566,26 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
Always false for physical slots.
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_at</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently being
+ used.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_count</structfield> <type>numeric</type>
+ </para>
+ <para>
+ The total number of times the slot became inactive in its lifetime.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index d78077b936..caa6db720c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1024,7 +1024,9 @@ CREATE VIEW pg_replication_slots AS
L.safe_wal_size,
L.two_phase,
L.invalidation_reason,
- L.failover
+ L.failover,
+ L.inactive_at,
+ L.inactive_count
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 02a14ec210..bf7429ba3f 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -90,7 +90,7 @@ typedef struct ReplicationSlotOnDisk
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 4 /* version for new files */
+#define SLOT_VERSION 5 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -315,6 +315,8 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase = two_phase;
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
+ slot->data.inactive_at = 0;
+ slot->data.inactive_count = 0;
/* and then data only present in shared memory */
slot->just_dirtied = false;
@@ -544,6 +546,17 @@ retry:
if (am_walsender)
{
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->data.inactive_at = 0;
+ SpinLockRelease(&s->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
SlotIsLogical(s)
? errmsg("acquired logical replication slot \"%s\"",
@@ -611,16 +624,27 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
- MyReplicationSlot = NULL;
-
- /* might not have been set when we've been a plain slot */
- LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
- ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
- LWLockRelease(ProcArrayLock);
-
if (am_walsender)
{
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->data.inactive_at = GetCurrentTimestamp();
+
+ /*
+ * XXX: Can inactive_count of type uint64 ever overflow? It takes
+ * about a half-billion years for inactive_count to overflow even
+ * if the slot becomes inactive every millisecond. So, using
+ * pg_add_u64_overflow might be overkill.
+ */
+ slot->data.inactive_count++;
+ SpinLockRelease(&slot->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
is_logical
? errmsg("released logical replication slot \"%s\"",
@@ -630,6 +654,14 @@ ReplicationSlotRelease(void)
pfree(slotname);
}
+
+ MyReplicationSlot = NULL;
+
+ /* might not have been set when we've been a plain slot */
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
+ ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
+ LWLockRelease(ProcArrayLock);
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index e53aeb37c9..3c53f4ac48 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -237,10 +237,11 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 16
+#define PG_GET_REPLICATION_SLOTS_COLS 18
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
+ char buf[256];
/*
* We don't require any special permission to see this function's data
@@ -428,6 +429,18 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.failover);
+ if (slot_contents.data.inactive_at > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.data.inactive_at);
+ else
+ nulls[i++] = true;
+
+ /* Convert to numeric. */
+ snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count);
+ values[i++] = DirectFunctionCall3(numeric_in,
+ CStringGetDatum(buf),
+ ObjectIdGetDatum(0),
+ Int32GetDatum(-1));
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index de1115baa0..52e9fc4971 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11127,9 +11127,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,invalidation_reason,failover}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,timestamptz,numeric}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,invalidation_reason,failover,inactive_at,inactive_count}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index db9bb22266..a7372d3bd5 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -117,6 +117,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* When did this slot become inactive last time? */
+ TimestampTz inactive_at;
+
+ /* How many times the slot has been inactive? */
+ uint64 inactive_count;
} ReplicationSlotPersistentData;
/*
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 022f9bccb0..4a3cb182e6 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1474,8 +1474,10 @@ pg_replication_slots| SELECT l.slot_name,
l.safe_wal_size,
l.two_phase,
l.invalidation_reason,
- l.failover
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, invalidation_reason, failover)
+ l.failover,
+ l.inactive_at,
+ l.inactive_count
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, invalidation_reason, failover, inactive_at, inactive_count)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
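As a usage sketch for 0002 (not part of the patch; the one-day and
100-times thresholds are purely illustrative), a monitoring query over
the two new columns could look like this:

    -- slots lying inactive for more than a day, or going inactive too often
    SELECT slot_name, inactive_at, inactive_count
    FROM pg_replication_slots
    WHERE inactive_at < now() - interval '1 day'
       OR inactive_count > 100;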
v2-0003-Add-inactive_timeout-based-replication-slot-inval.patch (application/x-patch)
From 6fe224c7c52d47528e7db444dd1624fed6631ecf Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 26 Jan 2024 18:22:12 +0000
Subject: [PATCH v2] Add inactive_timeout based replication slot invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set a timeout of, say,
1, 2 or 3 days, after which the inactive slots get invalidated.
To achieve the above, postgres uses the replication slot metric
inactive_at (the time at which the slot became inactive) and a
new GUC inactive_replication_slot_timeout. The checkpointer then
looks at all replication slots, invalidating those that have been
inactive for longer than the configured timeout.
---
doc/src/sgml/config.sgml | 18 ++++
src/backend/access/transam/xlog.c | 10 +++
src/backend/replication/slot.c | 24 ++++-
src/backend/replication/slotfuncs.c | 3 +
src/backend/utils/misc/guc_tables.c | 12 +++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 87 +++++++++++++++++++
9 files changed, 156 insertions(+), 3 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 61038472c5..099b3fc5cc 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4405,6 +4405,24 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-inactive-replication-slot-timeout" xreflabel="inactive_replication_slot_timeout">
+ <term><varname>inactive_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>inactive_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time at the next checkpoint. If this value is specified
+ without units, it is taken as seconds. A value of zero (which is the
+ default) disables the timeout mechanism. This parameter can only be
+ set in the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 478377c4a2..f7ce2cbbb4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7051,6 +7051,11 @@ CreateCheckPoint(int flags)
if (PriorRedoPtr != InvalidXLogRecPtr)
UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7495,6 +7500,11 @@ CreateRestartPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index bf7429ba3f..caee3c7790 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -98,9 +98,9 @@ ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
/* My backend's replication slot in the shared memory array */
ReplicationSlot *MyReplicationSlot = NULL;
-/* GUC variable */
-int max_replication_slots = 10; /* the maximum number of replication
- * slots */
+/* GUC variables */
+int max_replication_slots = 10;
+int inactive_replication_slot_timeout = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropAcquired(void);
@@ -1350,6 +1350,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by inactive_replication_slot_timeout."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1448,6 +1451,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
conflict = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (s->data.inactive_at > 0)
+ {
+ TimestampTz now;
+
+ Assert(s->data.persistency == RS_PERSISTENT);
+ Assert(s->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(s->data.inactive_at, now,
+ inactive_replication_slot_timeout * 1000))
+ conflict = cause;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1593,6 +1610,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 3c53f4ac48..972c7b2baf 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -425,6 +425,9 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
case RS_INVAL_WAL_LEVEL:
values[i++] = CStringGetTextDatum("wal_level_insufficient");
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ values[i++] = CStringGetTextDatum("inactive_timeout");
+ break;
}
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 7fe58518d7..f08563479b 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2902,6 +2902,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"inactive_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &inactive_replication_slot_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index da10b43dac..9fc1f2faed 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -325,6 +325,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#inactive_replication_slot_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index a7372d3bd5..3094f36173 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -50,6 +50,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
/*
@@ -222,6 +224,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
+extern PGDLLIMPORT int inactive_replication_slot_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 88fb0306f5..22e5e2e45c 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -45,6 +45,7 @@ tests += {
't/037_invalid_database.pl',
't/038_save_logical_slots_shutdown.pl',
't/039_end_of_wal.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..bf1cd4bbcc
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,87 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Initialize primary node, setting wal-segsize to 1MB
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 1, extra => ['--wal-segsize=1']);
+$primary->append_conf('postgresql.conf', q{
+checkpoint_timeout = 1h
+});
+$primary->start;
+$primary->safe_psql('postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb1_slot');
+]);
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name,
+ has_streaming => 1);
+$standby1->append_conf('postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+});
+$standby1->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# The inactive replication slot info should be null when the slot is active
+my $result = $primary->safe_psql('postgres', qq[
+ SELECT inactive_at IS NULL, inactive_count = 0 AS OK
+ FROM pg_replication_slots WHERE slot_name = 'sb1_slot';
+]);
+is($result, "t|t", 'check the inactive replication slot info for an active slot');
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql('postgres', qq[
+ ALTER SYSTEM SET inactive_replication_slot_timeout TO '1s';
+]);
+$primary->reload;
+
+my $logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby1->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until('postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_at IS NOT NULL AND
+ inactive_count = 1 AND slot_name = 'sb1_slot';
+]) or die "Timed out while waiting for inactive replication slot info to be updated";
+
+my $invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb1_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb1_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until('postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'inactive_timeout';
+]) or die "Timed out while waiting for inactive replication slot sb1_slot to be invalidated";
+
+done_testing();
--
2.34.1
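A configuration sketch for 0003 (not part of the patch; the 2-day value
is purely illustrative) showing how the new GUC is expected to be used:

    -- invalidate slots that stay inactive for more than 2 days
    ALTER SYSTEM SET inactive_replication_slot_timeout = '2d';
    SELECT pg_reload_conf();

    -- after the next checkpoint, such slots are reported as invalidated
    SELECT slot_name, inactive_at, invalidation_reason
    FROM pg_replication_slots
    WHERE invalidation_reason = 'inactive_timeout';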
v2-0004-Add-XID-based-replication-slot-invalidation.patch (application/x-patch)
From af06f663845601fbbb316b8aeaffeae288231c95 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 26 Jan 2024 18:23:27 +0000
Subject: [PATCH v2] Add XID based replication slot invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set an XID age (age of
the slot's xmin or catalog_xmin) of, say, 1 or 1.5 billion, after
which the slots get invalidated.
To achieve the above, postgres uses the replication slot's xmin (the
oldest transaction that this slot needs the database to retain) or
catalog_xmin (the oldest transaction affecting the system catalogs
that this slot needs the database to retain), and a new GUC
max_slot_xid_age. The checkpointer then looks at all replication
slots, invalidating those whose XID age exceeds the configured limit.
---
doc/src/sgml/config.sgml | 21 +++++
src/backend/access/transam/xlog.c | 10 +++
src/backend/replication/slot.c | 41 ++++++++++
src/backend/replication/slotfuncs.c | 3 +
src/backend/utils/misc/guc_tables.c | 10 +++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 81 +++++++++++++++++++
8 files changed, 170 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 099b3fc5cc..0204b1c86a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4423,6 +4423,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age">
+ <term><varname>max_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (which is the default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f7ce2cbbb4..a69099247a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7056,6 +7056,11 @@ CreateCheckPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7505,6 +7510,11 @@ CreateRestartPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index caee3c7790..428c9fa24d 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -101,6 +101,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10;
int inactive_replication_slot_timeout = 0;
+int max_slot_xid_age = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropAcquired(void);
@@ -1353,6 +1354,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_INACTIVE_TIMEOUT:
appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by inactive_replication_slot_timeout."));
break;
+ case RS_INVAL_XID_AGE:
+ appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1465,6 +1469,42 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
conflict = cause;
}
break;
+ case RS_INVAL_XID_AGE:
+ {
+ TransactionId xid_cur = ReadNextTransactionId();
+ TransactionId xid_limit;
+ TransactionId xid_slot;
+
+ if (TransactionIdIsNormal(s->data.xmin))
+ {
+ xid_slot = s->data.xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ if (TransactionIdIsNormal(s->data.catalog_xmin))
+ {
+ xid_slot = s->data.catalog_xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1611,6 +1651,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 972c7b2baf..21cd76d708 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -428,6 +428,9 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
case RS_INVAL_INACTIVE_TIMEOUT:
values[i++] = CStringGetTextDatum("inactive_timeout");
break;
+ case RS_INVAL_XID_AGE:
+ values[i++] = CStringGetTextDatum("xid_aged");
+ break;
}
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index f08563479b..f2bf3d64d9 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2914,6 +2914,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &max_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 9fc1f2faed..8743426b12 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -326,6 +326,7 @@
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
#inactive_replication_slot_timeout = 0 # in seconds; 0 disables
+#max_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 3094f36173..a33632d80b 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -52,6 +52,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* inactive slot timeout has occurred */
RS_INVAL_INACTIVE_TIMEOUT,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
/*
@@ -225,6 +227,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT int inactive_replication_slot_timeout;
+extern PGDLLIMPORT int max_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index bf1cd4bbcc..e7da98412c 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -84,4 +84,85 @@ $primary->poll_query_until('postgres', qq[
invalidation_reason = 'inactive_timeout';
]) or die "Timed out while waiting for inactive replication slot sb1_slot to be invalidated";
+$primary->safe_psql('postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb2_slot');
+]);
+
+$primary->safe_psql('postgres', qq[
+ ALTER SYSTEM SET inactive_replication_slot_timeout TO 0;
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name,
+ has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby2->append_conf('postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+$standby2->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+$primary->poll_query_until('postgres', qq[
+ SELECT xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb2_slot';
+]) or die "Timed out waiting for slot xmin to advance";
+
+$primary->safe_psql('postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop standby to let the replication slot's xmin on the primary age
+$standby2->stop;
+
+# Do some work to advance xmin
+$primary->safe_psql(
+ 'postgres', q{
+do $$
+begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into tab_int values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+end$$;
+});
+
+$invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb2_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb2_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until('postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb2_slot' AND
+ invalidation_reason = 'xid_aged';
+]) or die "Timed out while waiting for replication slot sb2_slot to be invalidated";
+
done_testing();
--
2.34.1
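A configuration sketch for 0004 (not part of the patch; the 1.5 billion
value is purely illustrative) showing how the new GUC is expected to be
used, along with a query to watch how far each slot's horizons lag:

    -- invalidate slots whose xmin or catalog_xmin ages beyond 1.5 billion XIDs
    ALTER SYSTEM SET max_slot_xid_age = 1500000000;
    SELECT pg_reload_conf();

    -- how far each slot's horizons currently lag behind the next XID
    SELECT slot_name, age(xmin) AS xmin_age, age(catalog_xmin) AS catalog_xmin_age
    FROM pg_replication_slots;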
On Sat, Jan 27, 2024 at 1:18 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> On Thu, Jan 11, 2024 at 10:48 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Needed a rebase due to c393308b. Please find the attached v2 patch set.
Needed a rebase due to commit 776621a (conflict in
src/test/recovery/meson.build for new TAP test file added). Please
find the attached v3 patch set.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v3-0004-Add-XID-based-replication-slot-invalidation.patch (application/x-patch)
From ecfb669fa1f4356d75ef9a8ef0560de804cdaf56 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 31 Jan 2024 12:16:33 +0000
Subject: [PATCH v3 4/4] Add XID based replication slot invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set an XID age (age of
the slot's xmin or catalog_xmin) of, say, 1 or 1.5 billion, after
which the slots get invalidated.
To achieve the above, postgres uses the replication slot's xmin (the
oldest transaction that this slot needs the database to retain) or
catalog_xmin (the oldest transaction affecting the system catalogs
that this slot needs the database to retain), and a new GUC
max_slot_xid_age. The checkpointer then looks at all replication
slots, invalidating those whose XID age exceeds the configured limit.
---
doc/src/sgml/config.sgml | 21 +++++
src/backend/access/transam/xlog.c | 10 +++
src/backend/replication/slot.c | 41 ++++++++++
src/backend/replication/slotfuncs.c | 3 +
src/backend/utils/misc/guc_tables.c | 10 +++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 81 +++++++++++++++++++
8 files changed, 170 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 099b3fc5cc..0204b1c86a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4423,6 +4423,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age">
+ <term><varname>max_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (which is the default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f7ce2cbbb4..a69099247a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7056,6 +7056,11 @@ CreateCheckPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7505,6 +7510,11 @@ CreateRestartPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 57bca68547..644ff6f701 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -101,6 +101,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10;
int inactive_replication_slot_timeout = 0;
+int max_slot_xid_age = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropAcquired(void);
@@ -1375,6 +1376,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_INACTIVE_TIMEOUT:
appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by inactive_replication_slot_timeout."));
break;
+ case RS_INVAL_XID_AGE:
+ appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1487,6 +1491,42 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
conflict = cause;
}
break;
+ case RS_INVAL_XID_AGE:
+ {
+ TransactionId xid_cur = ReadNextTransactionId();
+ TransactionId xid_limit;
+ TransactionId xid_slot;
+
+ if (TransactionIdIsNormal(s->data.xmin))
+ {
+ xid_slot = s->data.xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ if (TransactionIdIsNormal(s->data.catalog_xmin))
+ {
+ xid_slot = s->data.catalog_xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1633,6 +1673,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 972c7b2baf..21cd76d708 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -428,6 +428,9 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
case RS_INVAL_INACTIVE_TIMEOUT:
values[i++] = CStringGetTextDatum("inactive_timeout");
break;
+ case RS_INVAL_XID_AGE:
+ values[i++] = CStringGetTextDatum("xid_aged");
+ break;
}
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index f08563479b..f2bf3d64d9 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2914,6 +2914,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &max_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 9fc1f2faed..8743426b12 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -326,6 +326,7 @@
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
#inactive_replication_slot_timeout = 0 # in seconds; 0 disables
+#max_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index b6607aa97b..71bc610ec9 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -52,6 +52,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* inactive slot timeout has occurred */
RS_INVAL_INACTIVE_TIMEOUT,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
/*
@@ -225,6 +227,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT int inactive_replication_slot_timeout;
+extern PGDLLIMPORT int max_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index bf1cd4bbcc..e7da98412c 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -84,4 +84,85 @@ $primary->poll_query_until('postgres', qq[
invalidation_reason = 'inactive_timeout';
]) or die "Timed out while waiting for inactive replication slot sb1_slot to be invalidated";
+$primary->safe_psql('postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb2_slot');
+]);
+
+$primary->safe_psql('postgres', qq[
+ ALTER SYSTEM SET inactive_replication_slot_timeout TO 0;
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name,
+ has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby2->append_conf('postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+$standby2->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+$primary->poll_query_until('postgres', qq[
+ SELECT xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb2_slot';
+]) or die "Timed out waiting for slot xmin to advance";
+
+$primary->safe_psql('postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop standby to let the replication slot's xmin on the primary age
+$standby2->stop;
+
+# Do some work to advance xmin
+$primary->safe_psql(
+ 'postgres', q{
+do $$
+begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into tab_int values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+end$$;
+});
+
+$invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb2_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb2_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until('postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb2_slot' AND
+ invalidation_reason = 'xid_aged';
+]) or die "Timed out while waiting for replication slot sb2_slot to be invalidated";
+
done_testing();
--
2.34.1
v3-0002-Track-inactive-replication-slot-information.patch (application/x-patch)
From cc5ff196a3861a3e4c27b6d5925f2a09530de689 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 31 Jan 2024 12:15:04 +0000
Subject: [PATCH v3 2/4] Track inactive replication slot information
Currently postgres doesn't track metrics like the time at which
the slot became inactive, and the total number of times the slot
became inactive in its lifetime. This commit adds two new metrics
inactive_at of type timestamptz and inactive_count of type numeric
to ReplicationSlotPersistentData. Whenever a slot becomes
inactive, the current timestamp and inactive count are persisted
to disk.
These metrics are useful in the following ways:
- To improve replication slot monitoring tools. For instance, one
can build a monitoring tool that signals a) when a replication slot
has been lying inactive for a day or so, using the inactive_at metric,
and b) when a replication slot is becoming inactive too frequently,
using the inactive_count metric.
- To implement timeout-based inactive replication slot management
capability in postgres.
Increases SLOT_VERSION due to the two newly added metrics.
---
doc/src/sgml/system-views.sgml | 20 +++++++++++
src/backend/catalog/system_views.sql | 4 ++-
src/backend/replication/slot.c | 50 +++++++++++++++++++++++-----
src/backend/replication/slotfuncs.c | 15 ++++++++-
src/include/catalog/pg_proc.dat | 6 ++--
src/include/replication/slot.h | 6 ++++
src/test/regress/expected/rules.out | 6 ++--
7 files changed, 91 insertions(+), 16 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index c61312793c..75f99f4ca0 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,6 +2566,26 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
Always false for physical slots.
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_at</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_count</structfield> <type>numeric</type>
+ </para>
+ <para>
+ The total number of times the slot became inactive in its lifetime.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 9d401003e8..eba97e8494 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1024,7 +1024,9 @@ CREATE VIEW pg_replication_slots AS
L.safe_wal_size,
L.two_phase,
L.invalidation_reason,
- L.failover
+ L.failover,
+ L.inactive_at,
+ L.inactive_count
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 110cb59783..9662b7f70d 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -90,7 +90,7 @@ typedef struct ReplicationSlotOnDisk
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 4 /* version for new files */
+#define SLOT_VERSION 5 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -315,6 +315,8 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase = two_phase;
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
+ slot->data.inactive_at = 0;
+ slot->data.inactive_count = 0;
/* and then data only present in shared memory */
slot->just_dirtied = false;
@@ -541,6 +543,17 @@ retry:
if (am_walsender)
{
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->data.inactive_at = 0;
+ SpinLockRelease(&s->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
SlotIsLogical(s)
? errmsg("acquired logical replication slot \"%s\"",
@@ -608,16 +621,27 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
- MyReplicationSlot = NULL;
-
- /* might not have been set when we've been a plain slot */
- LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
- ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
- LWLockRelease(ProcArrayLock);
-
if (am_walsender)
{
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->data.inactive_at = GetCurrentTimestamp();
+
+ /*
+ * XXX: Can inactive_count of type uint64 ever overflow? It takes
+ * about a half-billion years for inactive_count to overflow even
+ * if slot becomes inactive for every 1 millisecond. So, using
+ * pg_add_u64_overflow might be an overkill.
+ */
+ slot->data.inactive_count++;
+ SpinLockRelease(&slot->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
is_logical
? errmsg("released logical replication slot \"%s\"",
@@ -627,6 +651,14 @@ ReplicationSlotRelease(void)
pfree(slotname);
}
+
+ MyReplicationSlot = NULL;
+
+ /* might not have been set when we've been a plain slot */
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
+ ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
+ LWLockRelease(ProcArrayLock);
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index e53aeb37c9..3c53f4ac48 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -237,10 +237,11 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 16
+#define PG_GET_REPLICATION_SLOTS_COLS 18
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
+ char buf[256];
/*
* We don't require any special permission to see this function's data
@@ -428,6 +429,18 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.failover);
+ if (slot_contents.data.inactive_at > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.data.inactive_at);
+ else
+ nulls[i++] = true;
+
+ /* Convert to numeric. */
+ snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count);
+ values[i++] = DirectFunctionCall3(numeric_in,
+ CStringGetDatum(buf),
+ ObjectIdGetDatum(0),
+ Int32GetDatum(-1));
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index de1115baa0..52e9fc4971 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11127,9 +11127,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,invalidation_reason,failover}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,timestamptz,numeric}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,invalidation_reason,failover,inactive_at,inactive_count}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index da4c776492..380dcc90ca 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -117,6 +117,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* When did this slot become inactive last time? */
+ TimestampTz inactive_at;
+
+ /* How many times the slot has been inactive? */
+ uint64 inactive_count;
} ReplicationSlotPersistentData;
/*
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 022f9bccb0..4a3cb182e6 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1474,8 +1474,10 @@ pg_replication_slots| SELECT l.slot_name,
l.safe_wal_size,
l.two_phase,
l.invalidation_reason,
- l.failover
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, invalidation_reason, failover)
+ l.failover,
+ l.inactive_at,
+ l.inactive_count
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, invalidation_reason, failover, inactive_at, inactive_count)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
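As a usage sketch of the columns added above (column names as in this
version of the patch; a rename of inactive_at to last_inactive_at is
discussed later in the thread), a monitoring query for slots that have
been idle for more than a day could look like:

    SELECT slot_name, inactive_at, inactive_count
    FROM pg_replication_slots
    WHERE inactive_at IS NOT NULL
      AND inactive_at < now() - interval '1 day';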
v3-0001-Track-invalidation_reason-in-pg_replication_slots.patch (application/x-patch)
From 6c4760366ba9867a2baca9cedb3b58ef8924a1fe Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 31 Jan 2024 12:14:47 +0000
Subject: [PATCH v3 1/4] Track invalidation_reason in pg_replication_slots
Currently the reason for replication slot invalidation is not
tracked in pg_replication_slots. A recent commit 007693f2a added
conflict_reason to show the reasons for slot invalidation, but
only for logical slots. This commit renames conflict_reason to
invalidation_reason, and adds the support to show invalidation
reasons for both physical and logical slots.
---
doc/src/sgml/system-views.sgml | 11 +++---
src/backend/catalog/system_views.sql | 2 +-
src/backend/replication/slotfuncs.c | 37 ++++++++-----------
src/bin/pg_upgrade/info.c | 4 +-
src/include/catalog/pg_proc.dat | 2 +-
.../t/035_standby_logical_decoding.pl | 32 ++++++++--------
src/test/regress/expected/rules.out | 4 +-
7 files changed, 44 insertions(+), 48 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index dd468b31ea..c61312793c 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,13 +2525,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>conflict_reason</structfield> <type>text</type>
+ <structfield>invalidation_reason</structfield> <type>text</type>
</para>
<para>
- The reason for the logical slot's conflict with recovery. It is always
- NULL for physical slots, as well as for logical slots which are not
- invalidated. The non-NULL values indicate that the slot is marked
- as invalidated. Possible values are:
+ The reason for the slot's invalidation. <literal>NULL</literal> if the
+ slot is currently actively being used. The non-NULL values indicate that
+ the slot is marked as invalidated. In case of logical slots, it
+ represents the reason for the logical slot's conflict with recovery.
+ Possible values are:
<itemizedlist spacing="compact">
<listitem>
<para>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 6791bff9dd..9d401003e8 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.conflict_reason,
+ L.invalidation_reason,
L.failover
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index eb685089b3..e53aeb37c9 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -407,28 +407,23 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
- if (slot_contents.data.database == InvalidOid)
- nulls[i++] = true;
- else
+ switch (slot_contents.data.invalidated)
{
- switch (slot_contents.data.invalidated)
- {
- case RS_INVAL_NONE:
- nulls[i++] = true;
- break;
-
- case RS_INVAL_WAL_REMOVED:
- values[i++] = CStringGetTextDatum("wal_removed");
- break;
-
- case RS_INVAL_HORIZON:
- values[i++] = CStringGetTextDatum("rows_removed");
- break;
-
- case RS_INVAL_WAL_LEVEL:
- values[i++] = CStringGetTextDatum("wal_level_insufficient");
- break;
- }
+ case RS_INVAL_NONE:
+ nulls[i++] = true;
+ break;
+
+ case RS_INVAL_WAL_REMOVED:
+ values[i++] = CStringGetTextDatum("wal_removed");
+ break;
+
+ case RS_INVAL_HORIZON:
+ values[i++] = CStringGetTextDatum("rows_removed");
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ values[i++] = CStringGetTextDatum("wal_level_insufficient");
+ break;
}
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 183c2f84eb..9683c91d4a 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -667,13 +667,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
"temporary IS FALSE;",
live_check ? "FALSE" :
- "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
+ "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE "
"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
"END)");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 29af4ce65d..de1115baa0 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11129,7 +11129,7 @@
proargtypes => '',
proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool}',
proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,invalidation_reason,failover}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index cebfa52d0f..f2c58a8a06 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,7 +168,7 @@ sub change_hot_standby_feedback_and_wait_for_xmins
}
}
-# Check conflict_reason in pg_replication_slots.
+# Check invalidation_reason in pg_replication_slots.
sub check_slots_conflict_reason
{
my ($slot_prefix, $reason) = @_;
@@ -178,15 +178,15 @@ sub check_slots_conflict_reason
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$active_slot';));
- is($res, "$reason", "$active_slot conflict_reason is $reason");
+ is($res, "$reason", "$active_slot invalidation_reason is $reason");
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$inactive_slot';));
- is($res, "$reason", "$inactive_slot conflict_reason is $reason");
+ is($res, "$reason", "$inactive_slot invalidation_reason is $reason");
}
# Drop the slots, re-create them, change hot_standby_feedback,
@@ -293,13 +293,13 @@ $node_primary->safe_psql('testdb',
qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
);
-# Check conflict_reason is NULL for physical slot
+# Check invalidation_reason is NULL for physical slot
$res = $node_primary->safe_psql(
'postgres', qq[
- SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+ SELECT invalidation_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
);
-is($res, 't', "Physical slot reports conflict_reason as NULL");
+is($res, 't', "Physical slot reports invalidation_reason as NULL");
my $backup_name = 'b1';
$node_primary->backup($backup_name);
@@ -512,7 +512,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
$handle =
@@ -531,7 +531,7 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
##################################################
$node_standby->restart;
-# Verify conflict_reason is retained across a restart.
+# Verify invalidation_reason is retained across a restart.
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
##################################################
@@ -540,7 +540,7 @@ check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Get the restart_lsn from an invalidated slot
my $restart_lsn = $node_standby->safe_psql('postgres',
- "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
+ "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and invalidation_reason is not null;"
);
chomp($restart_lsn);
@@ -591,7 +591,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('row_removal_', 'rows_removed');
$handle =
@@ -627,7 +627,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
check_for_invalidation('shared_row_removal_', $logstart,
'with vacuum on pg_authid');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
$handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
@@ -680,7 +680,7 @@ ok( $node_standby->poll_query_until(
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(conflicting) from
- (select conflict_reason is not NULL as conflicting
+ (select invalidation_reason is not NULL as conflicting
from pg_replication_slots WHERE slot_type = 'logical')]),
'f',
'Logical slots are reported as non conflicting');
@@ -719,7 +719,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('pruning_', 'rows_removed');
$handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
@@ -763,7 +763,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
-# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+# Verify invalidation_reason is 'wal_level_insufficient' in pg_replication_slots
check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
$handle =
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index abc944e8b8..022f9bccb0 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,9 +1473,9 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.conflict_reason,
+ l.invalidation_reason,
l.failover
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, invalidation_reason, failover)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v3-0003-Add-inactive_timeout-based-replication-slot-inval.patch (application/x-patch)
From 0850c7762bed95dee650e35433a8e9d2ab54d50e Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 31 Jan 2024 12:16:14 +0000
Subject: [PATCH v3 3/4] Add inactive_timeout based replication slot
invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and the storage they have allocated vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set a timeout of, say,
1, 2 or 3 days, after which inactive slots get invalidated.
To achieve the above, postgres uses the replication slot metric
inactive_at (the time at which the slot became inactive) and a new
GUC inactive_replication_slot_timeout. The checkpointer then scans
all replication slots and invalidates those that have been inactive
for longer than the configured timeout.
---
doc/src/sgml/config.sgml | 18 ++++
src/backend/access/transam/xlog.c | 10 +++
src/backend/replication/slot.c | 24 ++++-
src/backend/replication/slotfuncs.c | 3 +
src/backend/utils/misc/guc_tables.c | 12 +++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 87 +++++++++++++++++++
9 files changed, 156 insertions(+), 3 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 61038472c5..099b3fc5cc 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4405,6 +4405,24 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-inactive-replication-slot-timeout" xreflabel="inactive_replication_slot_timeout">
+ <term><varname>inactive_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>inactive_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time at the next checkpoint. If this value is specified
+ without units, it is taken as seconds. A value of zero (which is
+ default) disables the timeout mechanism. This parameter can only be
+ set in the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 478377c4a2..f7ce2cbbb4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7051,6 +7051,11 @@ CreateCheckPoint(int flags)
if (PriorRedoPtr != InvalidXLogRecPtr)
UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7495,6 +7500,11 @@ CreateRestartPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 9662b7f70d..57bca68547 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -98,9 +98,9 @@ ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
/* My backend's replication slot in the shared memory array */
ReplicationSlot *MyReplicationSlot = NULL;
-/* GUC variable */
-int max_replication_slots = 10; /* the maximum number of replication
- * slots */
+/* GUC variables */
+int max_replication_slots = 10;
+int inactive_replication_slot_timeout = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropAcquired(void);
@@ -1372,6 +1372,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by inactive_replication_slot_timeout."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1470,6 +1473,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
conflict = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (s->data.inactive_at > 0)
+ {
+ TimestampTz now;
+
+ Assert(s->data.persistency == RS_PERSISTENT);
+ Assert(s->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(s->data.inactive_at, now,
+ inactive_replication_slot_timeout * 1000))
+ conflict = cause;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1615,6 +1632,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 3c53f4ac48..972c7b2baf 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -425,6 +425,9 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
case RS_INVAL_WAL_LEVEL:
values[i++] = CStringGetTextDatum("wal_level_insufficient");
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ values[i++] = CStringGetTextDatum("inactive_timeout");
+ break;
}
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 7fe58518d7..f08563479b 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2902,6 +2902,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"inactive_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &inactive_replication_slot_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index da10b43dac..9fc1f2faed 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -325,6 +325,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#inactive_replication_slot_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 380dcc90ca..b6607aa97b 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -50,6 +50,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
/*
@@ -222,6 +224,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
+extern PGDLLIMPORT int inactive_replication_slot_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index bf087ac2a9..e07b941d73 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -46,6 +46,7 @@ tests += {
't/038_save_logical_slots_shutdown.pl',
't/039_end_of_wal.pl',
't/040_standby_failover_slots_sync.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..bf1cd4bbcc
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,87 @@
+
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Initialize primary node, setting wal-segsize to 1MB
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 1, extra => ['--wal-segsize=1']);
+$primary->append_conf('postgresql.conf', q{
+checkpoint_timeout = 1h
+});
+$primary->start;
+$primary->safe_psql('postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb1_slot');
+]);
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name,
+ has_streaming => 1);
+$standby1->append_conf('postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+});
+$standby1->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# The inactive replication slot info should be null when the slot is active
+my $result = $primary->safe_psql('postgres', qq[
+ SELECT inactive_at IS NULL, inactive_count = 0 AS OK
+ FROM pg_replication_slots WHERE slot_name = 'sb1_slot';
+]);
+is($result, "t|t", 'check the inactive replication slot info for an active slot');
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql('postgres', qq[
+ ALTER SYSTEM SET inactive_replication_slot_timeout TO '1s';
+]);
+$primary->reload;
+
+my $logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby1->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until('postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_at IS NOT NULL AND
+ inactive_count = 1 AND slot_name = 'sb1_slot';
+]) or die "Timed out while waiting for inactive replication slot info to be updated";
+
+my $invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb1_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb1_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until('postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'inactive_timeout';
+]) or die "Timed out while waiting for inactive replication slot sb1_slot to be invalidated";
+
+done_testing();
--
2.34.1
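A configuration sketch for the new GUC (names as introduced by this patch;
the two-day timeout is only an example):

    ALTER SYSTEM SET inactive_replication_slot_timeout = '2d';
    SELECT pg_reload_conf();

    -- after the next checkpoint, timed-out slots are reported like this
    SELECT slot_name, inactive_at, invalidation_reason
    FROM pg_replication_slots
    WHERE invalidation_reason = 'inactive_timeout';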
Hi,
On Thu, Jan 11, 2024 at 10:48:13AM +0530, Bharath Rupireddy wrote:
Hi,
Therefore, it is often easy for developers to do the following:
a) set an XID age (age of slot's xmin or catalog_xmin) of say 1 or 1.5
billion, after which the slots get invalidated.
b) set a timeout of say 1 or 2 or 3 days, after which the inactive
slots get invalidated.To implement (a), postgres needs a new GUC called max_slot_xid_age.
The checkpointer then invalidates all the slots whose xmin (the oldest
transaction that this slot needs the database to retain) or
catalog_xmin (the oldest transaction affecting the system catalogs
that this slot needs the database to retain) has reached the age
specified by this setting.To implement (b), first postgres needs to track the replication slot
metrics like the time at which the slot became inactive (inactive_at
timestamptz) and the total number of times the slot became inactive in
its lifetime (inactive_count numeric) in ReplicationSlotPersistentData
structure. And, then it needs a new timeout GUC called
inactive_replication_slot_timeout. Whenever a slot becomes inactive,
the current timestamp and inactive count are stored in
ReplicationSlotPersistentData structure and persisted to disk. The
checkpointer then invalidates all the slots that are lying inactive
for about inactive_replication_slot_timeout duration starting from
inactive_at.In addition to implementing (b), these two new metrics enable
developers to improve their monitoring tools as the metrics are
exposed via pg_replication_slots system view. For instance, one can
build a monitoring tool that signals when replication slots are lying
inactive for a day or so using inactive_at metric, and/or when a
replication slot is becoming inactive too frequently using inactive_at
metric.
Thanks for the patch and +1 for the idea, I think adding those new
"invalidation reasons" makes sense.
I’m attaching the v1 patch set as described below:
0001 - Tracks invalidation_reason in pg_replication_slots. This is
needed because slots now have multiple reasons for slot invalidation.
0002 - Tracks inactive replication slot information inactive_at and
inactive_timeout.
0003 - Adds inactive_timeout based replication slot invalidation.
0004 - Adds XID based replication slot invalidation.
I think it's better to have the XID one being discussed/implemented before the
inactive_timeout one: what about changing the 0002, 0003 and 0004 ordering?
0004 -> 0002
0004 -> 0002
0002 -> 0003
0003 -> 0004
As far as 0001 goes:
"
This commit renames conflict_reason to
invalidation_reason, and adds the support to show invalidation
reasons for both physical and logical slots.
"
I'm not sure I like the fact that "invalidations" and "conflicts" are merged
into a single field. I'd vote to keep conflict_reason as it is and add a new
invalidation_reason (and put "conflict" as value when it is the case). The reason
is that I think they are 2 different concepts (could be linked though) and that
it would be easier to check for conflicts (meaning conflict_reason is not NULL).
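For example, under that proposal a recovery-conflict check would stay as
simple as the sketch below (not taken from the posted patches), whereas with
the merged column one has to know which invalidation_reason values
correspond to recovery conflicts:

    SELECT slot_name
    FROM pg_replication_slots
    WHERE conflict_reason IS NOT NULL;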
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Jan 11, 2024 at 10:48 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Hi,
Replication slots in postgres will prevent removal of required
resources when there is no connection using them (inactive). This
consumes storage because neither required WAL nor required rows from
the user tables/system catalogs can be removed by VACUUM as long as
they are required by a replication slot. In extreme cases this could
cause the transaction ID wraparound.Currently postgres has the ability to invalidate inactive replication
slots based on the amount of WAL (set via max_slot_wal_keep_size GUC)
that will be needed for the slots in case they become active. However,
the wraparound issue isn't effectively covered by
max_slot_wal_keep_size - one can't tell postgres to invalidate a
replication slot if it is blocking VACUUM. Also, it is often tricky to
choose a default value for max_slot_wal_keep_size, because the amount
of WAL that gets generated and allocated storage for the database can
vary.Therefore, it is often easy for developers to do the following:
a) set an XID age (age of slot's xmin or catalog_xmin) of say 1 or 1.5
billion, after which the slots get invalidated.
b) set a timeout of say 1 or 2 or 3 days, after which the inactive
slots get invalidated.To implement (a), postgres needs a new GUC called max_slot_xid_age.
The checkpointer then invalidates all the slots whose xmin (the oldest
transaction that this slot needs the database to retain) or
catalog_xmin (the oldest transaction affecting the system catalogs
that this slot needs the database to retain) has reached the age
specified by this setting.To implement (b), first postgres needs to track the replication slot
metrics like the time at which the slot became inactive (inactive_at
timestamptz) and the total number of times the slot became inactive in
its lifetime (inactive_count numeric) in ReplicationSlotPersistentData
structure. And, then it needs a new timeout GUC called
inactive_replication_slot_timeout. Whenever a slot becomes inactive,
the current timestamp and inactive count are stored in
ReplicationSlotPersistentData structure and persisted to disk. The
checkpointer then invalidates all the slots that are lying inactive
for about inactive_replication_slot_timeout duration starting from
inactive_at.In addition to implementing (b), these two new metrics enable
developers to improve their monitoring tools as the metrics are
exposed via pg_replication_slots system view. For instance, one can
build a monitoring tool that signals when replication slots are lying
inactive for a day or so using inactive_at metric, and/or when a
replication slot is becoming inactive too frequently using inactive_at
metric.I’m attaching the v1 patch set as described below:
0001 - Tracks invalidation_reason in pg_replication_slots. This is
needed because slots now have multiple reasons for slot invalidation.
0002 - Tracks inactive replication slot information inactive_at and
inactive_timeout.
0003 - Adds inactive_timeout based replication slot invalidation.
0004 - Adds XID based replication slot invalidation.Thoughts?
+1 for the idea, here are some comments on 0002, I will review other
patches soon and respond.
1.
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_at</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
Maybe we can change the field name to 'last_inactive_at'? Or maybe the
comment can explain that this is the timestamp at which the slot was last
inactivated. Since we are already maintaining inactive_count, it is better
to explicitly say this is the last inactivation time.
2.
+ /*
+ * XXX: Can inactive_count of type uint64 ever overflow? It takes
+ * about a half-billion years for inactive_count to overflow even
+ * if slot becomes inactive for every 1 millisecond. So, using
+ * pg_add_u64_overflow might be an overkill.
+ */
Correct we don't need to use pg_add_u64_overflow for this counter.
3.
+
+ /* Convert to numeric. */
+ snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count);
+ values[i++] = DirectFunctionCall3(numeric_in,
+ CStringGetDatum(buf),
+ ObjectIdGetDatum(0),
+ Int32GetDatum(-1));
What is the purpose of doing this? I mean, inactive_count is an 8-byte
integer, so you could define the function's out parameter as 'int8', which
is also an 8-byte integer. Then you wouldn't need to convert the integer
to a string and then to numeric?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Feb 6, 2024 at 2:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Thoughts?
+1 for the idea, here are some comments on 0002, I will review other
patches soon and respond.
Thanks for looking at it.
+ <structfield>inactive_at</structfield> <type>timestamptz</type>
Maybe we can change the field name to 'last_inactive_at'? Or maybe the
comment can explain that this is the timestamp at which the slot was last
inactivated. Since we are already maintaining inactive_count, it is better
to explicitly say this is the last inactivation time.
last_inactive_at looks better, so will use that in the next version of
the patch.
2.
+ /*
+ * XXX: Can inactive_count of type uint64 ever overflow? It takes
+ * about a half-billion years for inactive_count to overflow even
+ * if slot becomes inactive for every 1 millisecond. So, using
+ * pg_add_u64_overflow might be an overkill.
+ */
Correct we don't need to use pg_add_u64_overflow for this counter.
Will remove this comment in the next version of the patch.
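(For the record, the arithmetic behind the "half-billion years" estimate can
be checked with a quick query; purely illustrative:)

    SELECT (2::numeric ^ 64) / 1000 / 86400 / 365 AS years_at_one_increment_per_ms;
    -- roughly 5.8 * 10^8, i.e. about 580 million years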
+ /* Convert to numeric. */
+ snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count);
+ values[i++] = DirectFunctionCall3(numeric_in,
+ CStringGetDatum(buf),
+ ObjectIdGetDatum(0),
+ Int32GetDatum(-1));
What is the purpose of doing this? I mean, inactive_count is an 8-byte
integer, so you could define the function's out parameter as 'int8', which
is also an 8-byte integer. Then you wouldn't need to convert the integer
to a string and then to numeric?
Nope, it's of type uint64, so reporting it as numeric is the approach
typically used elsewhere - see other code around /* Convert to numeric. */.
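To illustrate the point: int8 is signed, so the upper half of the uint64
range doesn't fit, while numeric has no such limit (illustrative only):

    SELECT 18446744073709551615::numeric;  -- works, full uint64 range representable
    SELECT 18446744073709551615::int8;     -- ERROR:  bigint out of range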
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon, Feb 5, 2024 at 3:15 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Thanks for the patch and +1 for the idea, I think adding those new
"invalidation reasons" make sense.
Thanks for looking at it.
I think it's better to have the XID one being discussed/implemented before the
inactive_timeout one: what about changing the 0002, 0003 and 0004 ordering?
0004 -> 0002
0002 -> 0003
0003 -> 0004
Done that way.
As far as 0001 goes:
"
This commit renames conflict_reason to
invalidation_reason, and adds the support to show invalidation
reasons for both physical and logical slots.
"I'm not sure I like the fact that "invalidations" and "conflicts" are merged
into a single field. I'd vote to keep conflict_reason as it is and add a new
invalidation_reason (and put "conflict" as value when it is the case). The reason
is that I think they are 2 different concepts (could be linked though) and that
it would be easier to check for conflicts (meaning conflict_reason is not NULL).
So, do you want conflict_reason for only logical slots, and a separate
column for invalidation_reason for both logical and physical slots? Is
there any strong reason to have two properties "conflict" and
"invalidated" for slots? They both are the same internally, so why
confuse the users?
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v4-0004-Add-inactive_timeout-based-replication-slot.patch (application/x-patch)
From 5c965e485f0abb3e7c55484b136a986597975ef6 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 6 Feb 2024 16:27:12 +0000
Subject: [PATCH v4 4/4] Add inactive_timeout based replication slot
invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and the storage they have allocated vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set a timeout of, say,
1, 2 or 3 days, after which inactive slots get invalidated.
To achieve the above, postgres uses the replication slot metric
last_inactive_at (the time at which the slot became inactive) and a
new GUC inactive_replication_slot_timeout. The checkpointer then
scans all replication slots and invalidates those that have been
inactive for longer than the configured timeout.
---
doc/src/sgml/config.sgml | 18 +++++
src/backend/access/transam/xlog.c | 10 +++
src/backend/replication/slot.c | 19 ++++++
src/backend/replication/slotfuncs.c | 3 +
src/backend/utils/misc/guc_tables.c | 12 ++++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 68 +++++++++++++++++++
9 files changed, 135 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bc8c039b06..0ae3a15400 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4426,6 +4426,24 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-inactive-replication-slot-timeout" xreflabel="inactive_replication_slot_timeout">
+ <term><varname>inactive_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>inactive_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time at the next checkpoint. If this value is specified
+ without units, it is taken as seconds. A value of zero (which is
+ default) disables the timeout mechanism. This parameter can only be
+ set in the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index dbf2fa5911..4f5ee71638 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7056,6 +7056,11 @@ CreateCheckPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7505,6 +7510,11 @@ CreateRestartPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 112b52a6dc..94b232189b 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -102,6 +102,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
int max_slot_xid_age = 0;
+int inactive_replication_slot_timeout = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropAcquired(void);
@@ -1369,6 +1370,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_XID_AGE:
appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by inactive_replication_slot_timeout."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1503,6 +1507,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
}
}
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (s->data.last_inactive_at > 0)
+ {
+ TimestampTz now;
+
+ Assert(s->data.persistency == RS_PERSISTENT);
+ Assert(s->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(s->data.last_inactive_at, now,
+ inactive_replication_slot_timeout * 1000))
+ conflict = cause;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1649,6 +1667,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index c402fa4c82..5cc6752265 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -429,6 +429,9 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
case RS_INVAL_XID_AGE:
values[i++] = CStringGetTextDatum("xid_aged");
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ values[i++] = CStringGetTextDatum("inactive_timeout");
+ break;
}
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2a6ad9abbb..2232e62e4b 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2912,6 +2912,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"inactive_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &inactive_replication_slot_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 6bd8959849..a0b4f309fc 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -252,6 +252,7 @@
#recovery_prefetch = try # prefetch pages referenced in the WAL?
#wal_decode_buffer_size = 512kB # lookahead window used for prefetching
# (change requires restart)
+#inactive_replication_slot_timeout = 0 # in seconds; 0 disables
# - Archiving -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 0de54c3d65..c14ae5f6c0 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -52,6 +52,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* slot's xmin or catalog_xmin has reached the age */
RS_INVAL_XID_AGE,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
/*
@@ -225,6 +227,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT int max_slot_xid_age;
+extern PGDLLIMPORT int inactive_replication_slot_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index bf087ac2a9..e07b941d73 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -46,6 +46,7 @@ tests += {
't/038_save_logical_slots_shutdown.pl',
't/039_end_of_wal.pl',
't/040_standby_failover_slots_sync.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index e2d0cb5993..3c9b3e6a7e 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -98,4 +98,72 @@ $primary->poll_query_until('postgres', qq[
invalidation_reason = 'xid_aged';
]) or die "Timed out while waiting for replication slot sb1_slot to be invalidated";
+$primary->safe_psql('postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb2_slot');
+]);
+
+$primary->safe_psql('postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 0;
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name,
+ has_streaming => 1);
+$standby2->append_conf('postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+});
+$standby2->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+# The inactive replication slot info should be null when the slot is active
+my $result = $primary->safe_psql('postgres', qq[
+ SELECT last_inactive_at IS NULL, inactive_count = 0 AS OK
+ FROM pg_replication_slots WHERE slot_name = 'sb2_slot';
+]);
+is($result, "t|t", 'check the inactive replication slot info for an active slot');
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql('postgres', qq[
+ ALTER SYSTEM SET inactive_replication_slot_timeout TO '1s';
+]);
+$primary->reload;
+
+$logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby2->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until('postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_at IS NOT NULL AND
+ inactive_count = 1 AND slot_name = 'sb2_slot';
+]) or die "Timed out while waiting for inactive replication slot info to be updated";
+
+$invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb2_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb2_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until('postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb2_slot' AND
+ invalidation_reason = 'inactive_timeout';
+]) or die "Timed out while waiting for inactive replication slot sb2_slot to be invalidated";
+
done_testing();
--
2.34.1
v4-0003-Track-inactive-replication-slot-information.patch (application/x-patch)
From b5a4ac46d35e19c310da3f9a36a49e7f91079713 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 6 Feb 2024 16:18:35 +0000
Subject: [PATCH v4 3/4] Track inactive replication slot information
Currently postgres doesn't track metrics like the time at which
the slot became inactive, and the total number of times the slot
became inactive in its lifetime. This commit adds two new metrics
last_inactive_at of type timestamptz and inactive_count of type numeric
to ReplicationSlotPersistentData. Whenever a slot becomes
inactive, the current timestamp and inactive count are persisted
to disk.
These metrics are useful in the following ways:
- To improve replication slot monitoring tools. For instance, one
can build a monitoring tool that signals a) when a replication slot
is lying inactive for a day or so using the last_inactive_at metric,
b) when a replication slot is becoming inactive too frequently
using the inactive_count metric.
- To implement timeout-based inactive replication slot management
capability in postgres.
Increases SLOT_VERSION due to the two newly added metrics.
---
doc/src/sgml/system-views.sgml | 20 +++++++++++++
src/backend/catalog/system_views.sql | 4 ++-
src/backend/replication/slot.c | 43 ++++++++++++++++++++++------
src/backend/replication/slotfuncs.c | 15 +++++++++-
src/include/catalog/pg_proc.dat | 6 ++--
src/include/replication/slot.h | 6 ++++
src/test/regress/expected/rules.out | 6 ++--
7 files changed, 84 insertions(+), 16 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index c61312793c..98de8ca0c2 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,6 +2566,26 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
Always false for physical slots.
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_at</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_count</structfield> <type>numeric</type>
+ </para>
+ <para>
+ The total number of times the slot became inactive in its lifetime.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 9d401003e8..195238f4e1 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1024,7 +1024,9 @@ CREATE VIEW pg_replication_slots AS
L.safe_wal_size,
L.two_phase,
L.invalidation_reason,
- L.failover
+ L.failover,
+ L.last_inactive_at,
+ L.inactive_count
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index b4bfa7bfd3..112b52a6dc 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -90,7 +90,7 @@ typedef struct ReplicationSlotOnDisk
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 4 /* version for new files */
+#define SLOT_VERSION 5 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -316,6 +316,8 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase = two_phase;
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
+ slot->data.last_inactive_at = 0;
+ slot->data.inactive_count = 0;
/* and then data only present in shared memory */
slot->just_dirtied = false;
@@ -542,6 +544,17 @@ retry:
if (am_walsender)
{
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->data.last_inactive_at = 0;
+ SpinLockRelease(&s->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
SlotIsLogical(s)
? errmsg("acquired logical replication slot \"%s\"",
@@ -609,16 +622,20 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
- MyReplicationSlot = NULL;
-
- /* might not have been set when we've been a plain slot */
- LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
- ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
- LWLockRelease(ProcArrayLock);
-
if (am_walsender)
{
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->data.last_inactive_at = GetCurrentTimestamp();
+ slot->data.inactive_count++;
+ SpinLockRelease(&slot->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
is_logical
? errmsg("released logical replication slot \"%s\"",
@@ -628,6 +645,14 @@ ReplicationSlotRelease(void)
pfree(slotname);
}
+
+ MyReplicationSlot = NULL;
+
+ /* might not have been set when we've been a plain slot */
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
+ ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
+ LWLockRelease(ProcArrayLock);
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index ece5e0e3f7..c402fa4c82 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -237,10 +237,11 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 16
+#define PG_GET_REPLICATION_SLOTS_COLS 18
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
+ char buf[256];
/*
* We don't require any special permission to see this function's data
@@ -432,6 +433,18 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.failover);
+ if (slot_contents.data.last_inactive_at > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.data.last_inactive_at);
+ else
+ nulls[i++] = true;
+
+ /* Convert to numeric. */
+ snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count);
+ values[i++] = DirectFunctionCall3(numeric_in,
+ CStringGetDatum(buf),
+ ObjectIdGetDatum(0),
+ Int32GetDatum(-1));
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index de1115baa0..3a4a1ce1e6 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11127,9 +11127,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,invalidation_reason,failover}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,timestamptz,numeric}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,invalidation_reason,failover,last_inactive_at,inactive_count}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 25b922171d..0de54c3d65 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -119,6 +119,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* When did this slot become inactive last time? */
+ TimestampTz last_inactive_at;
+
+ /* How many times the slot has been inactive? */
+ uint64 inactive_count;
} ReplicationSlotPersistentData;
/*
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 022f9bccb0..6d1fcc77d1 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1474,8 +1474,10 @@ pg_replication_slots| SELECT l.slot_name,
l.safe_wal_size,
l.two_phase,
l.invalidation_reason,
- l.failover
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, invalidation_reason, failover)
+ l.failover,
+ l.last_inactive_at,
+ l.inactive_count
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, invalidation_reason, failover, last_inactive_at, inactive_count)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v4-0002-Add-XID-based-replication-slot-invalidation.patch (application/x-patch)
From c74b9c265d6ec8e69ea33971c79a3d472c000ec3 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 6 Feb 2024 16:17:28 +0000
Subject: [PATCH v4 2/4] Add XID based replication slot invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set an XID age (age of the
slot's xmin or catalog_xmin) of, say, 1 or 1.5 billion, after which
the slots get invalidated.
To achieve the above, postgres uses replication slot xmin (the
oldest transaction that this slot needs the database to retain) or
catalog_xmin (the oldest transaction affecting the system catalogs
that this slot needs the database to retain), and a new GUC
max_slot_xid_age. The checkpointer then looks at all replication
slots, invalidating those whose xmin or catalog_xmin has reached
the configured age.
---
doc/src/sgml/config.sgml | 21 ++++
src/backend/access/transam/xlog.c | 10 ++
src/backend/replication/slot.c | 41 +++++++
src/backend/replication/slotfuncs.c | 4 +
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 101 ++++++++++++++++++
8 files changed, 191 insertions(+)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 61038472c5..bc8c039b06 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4405,6 +4405,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age">
+ <term><varname>max_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (which is the default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 478377c4a2..dbf2fa5911 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7051,6 +7051,11 @@ CreateCheckPoint(int flags)
if (PriorRedoPtr != InvalidXLogRecPtr)
UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7495,6 +7500,11 @@ CreateRestartPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 110cb59783..b4bfa7bfd3 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -101,6 +101,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variable */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int max_slot_xid_age = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropAcquired(void);
@@ -1340,6 +1341,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_XID_AGE:
+ appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1438,6 +1442,42 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
conflict = cause;
break;
+ case RS_INVAL_XID_AGE:
+ {
+ TransactionId xid_cur = ReadNextTransactionId();
+ TransactionId xid_limit;
+ TransactionId xid_slot;
+
+ if (TransactionIdIsNormal(s->data.xmin))
+ {
+ xid_slot = s->data.xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ if (TransactionIdIsNormal(s->data.catalog_xmin))
+ {
+ xid_slot = s->data.catalog_xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1583,6 +1623,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index e53aeb37c9..ece5e0e3f7 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -424,6 +424,10 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
case RS_INVAL_WAL_LEVEL:
values[i++] = CStringGetTextDatum("wal_level_insufficient");
break;
+
+ case RS_INVAL_XID_AGE:
+ values[i++] = CStringGetTextDatum("xid_aged");
+ break;
}
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 7fe58518d7..2a6ad9abbb 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2902,6 +2902,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &max_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index da10b43dac..6bd8959849 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -325,6 +325,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#max_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index da4c776492..25b922171d 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -50,6 +50,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
/*
@@ -216,6 +218,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
+extern PGDLLIMPORT int max_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..e2d0cb5993
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,101 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Initialize primary node, setting wal-segsize to 1MB
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 1, extra => ['--wal-segsize=1']);
+$primary->append_conf('postgresql.conf', q{
+checkpoint_timeout = 1h
+});
+$primary->start;
+$primary->safe_psql('postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb1_slot');
+]);
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name,
+ has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby1->append_conf('postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+$standby1->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+$primary->poll_query_until('postgres', qq[
+ SELECT xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb1_slot';
+]) or die "Timed out waiting for slot xmin to advance";
+
+$primary->safe_psql('postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop the standby to make the replication slot's xmin on the primary age
+$standby1->stop;
+
+my $logstart = -s $primary->logfile;
+
+# Do some work to advance xmin
+$primary->safe_psql(
+ 'postgres', q{
+do $$
+begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into tab_int values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+end$$;
+});
+
+my $invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb1_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb1_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until('postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'xid_aged';
+]) or die "Timed out while waiting for replication slot sb1_slot to be invalidated";
+
+done_testing();
--
2.34.1
v4-0001-Track-invalidation_reason-in-pg_replication_slots.patch (application/x-patch)
From d64a2759b8723fea913b4e0a9b2d9fcd05fed72b Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 6 Feb 2024 15:45:43 +0000
Subject: [PATCH v4 1/4] Track invalidation_reason in pg_replication_slots
Currently the reason for replication slot invalidation is not
tracked in pg_replication_slots. A recent commit 007693f2a added
conflict_reason to show the reasons for slot invalidation, but
only for logical slots. This commit renames conflict_reason to
invalidation_reason, and adds the support to show invalidation
reasons for both physical and logical slots.
---
doc/src/sgml/system-views.sgml | 11 +++---
src/backend/catalog/system_views.sql | 2 +-
src/backend/replication/slotfuncs.c | 37 ++++++++-----------
src/bin/pg_upgrade/info.c | 4 +-
src/include/catalog/pg_proc.dat | 2 +-
.../t/035_standby_logical_decoding.pl | 32 ++++++++--------
src/test/regress/expected/rules.out | 4 +-
7 files changed, 44 insertions(+), 48 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index dd468b31ea..c61312793c 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,13 +2525,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>conflict_reason</structfield> <type>text</type>
+ <structfield>invalidation_reason</structfield> <type>text</type>
</para>
<para>
- The reason for the logical slot's conflict with recovery. It is always
- NULL for physical slots, as well as for logical slots which are not
- invalidated. The non-NULL values indicate that the slot is marked
- as invalidated. Possible values are:
+ The reason for the slot's invalidation. <literal>NULL</literal> if the
+ slot is currently actively being used. The non-NULL values indicate that
+ the slot is marked as invalidated. In case of logical slots, it
+ represents the reason for the logical slot's conflict with recovery.
+ Possible values are:
<itemizedlist spacing="compact">
<listitem>
<para>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 6791bff9dd..9d401003e8 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.conflict_reason,
+ L.invalidation_reason,
L.failover
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index eb685089b3..e53aeb37c9 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -407,28 +407,23 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
- if (slot_contents.data.database == InvalidOid)
- nulls[i++] = true;
- else
+ switch (slot_contents.data.invalidated)
{
- switch (slot_contents.data.invalidated)
- {
- case RS_INVAL_NONE:
- nulls[i++] = true;
- break;
-
- case RS_INVAL_WAL_REMOVED:
- values[i++] = CStringGetTextDatum("wal_removed");
- break;
-
- case RS_INVAL_HORIZON:
- values[i++] = CStringGetTextDatum("rows_removed");
- break;
-
- case RS_INVAL_WAL_LEVEL:
- values[i++] = CStringGetTextDatum("wal_level_insufficient");
- break;
- }
+ case RS_INVAL_NONE:
+ nulls[i++] = true;
+ break;
+
+ case RS_INVAL_WAL_REMOVED:
+ values[i++] = CStringGetTextDatum("wal_removed");
+ break;
+
+ case RS_INVAL_HORIZON:
+ values[i++] = CStringGetTextDatum("rows_removed");
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ values[i++] = CStringGetTextDatum("wal_level_insufficient");
+ break;
}
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 183c2f84eb..9683c91d4a 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -667,13 +667,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
"temporary IS FALSE;",
live_check ? "FALSE" :
- "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
+ "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE "
"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
"END)");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 29af4ce65d..de1115baa0 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11129,7 +11129,7 @@
proargtypes => '',
proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool}',
proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,invalidation_reason,failover}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index cebfa52d0f..f2c58a8a06 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,7 +168,7 @@ sub change_hot_standby_feedback_and_wait_for_xmins
}
}
-# Check conflict_reason in pg_replication_slots.
+# Check invalidation_reason in pg_replication_slots.
sub check_slots_conflict_reason
{
my ($slot_prefix, $reason) = @_;
@@ -178,15 +178,15 @@ sub check_slots_conflict_reason
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$active_slot';));
- is($res, "$reason", "$active_slot conflict_reason is $reason");
+ is($res, "$reason", "$active_slot invalidation_reason is $reason");
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$inactive_slot';));
- is($res, "$reason", "$inactive_slot conflict_reason is $reason");
+ is($res, "$reason", "$inactive_slot invalidation_reason is $reason");
}
# Drop the slots, re-create them, change hot_standby_feedback,
@@ -293,13 +293,13 @@ $node_primary->safe_psql('testdb',
qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
);
-# Check conflict_reason is NULL for physical slot
+# Check invalidation_reason is NULL for physical slot
$res = $node_primary->safe_psql(
'postgres', qq[
- SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+ SELECT invalidation_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
);
-is($res, 't', "Physical slot reports conflict_reason as NULL");
+is($res, 't', "Physical slot reports invalidation_reason as NULL");
my $backup_name = 'b1';
$node_primary->backup($backup_name);
@@ -512,7 +512,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
$handle =
@@ -531,7 +531,7 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
##################################################
$node_standby->restart;
-# Verify conflict_reason is retained across a restart.
+# Verify invalidation_reason is retained across a restart.
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
##################################################
@@ -540,7 +540,7 @@ check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Get the restart_lsn from an invalidated slot
my $restart_lsn = $node_standby->safe_psql('postgres',
- "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
+ "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and invalidation_reason is not null;"
);
chomp($restart_lsn);
@@ -591,7 +591,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('row_removal_', 'rows_removed');
$handle =
@@ -627,7 +627,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
check_for_invalidation('shared_row_removal_', $logstart,
'with vacuum on pg_authid');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
$handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
@@ -680,7 +680,7 @@ ok( $node_standby->poll_query_until(
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(conflicting) from
- (select conflict_reason is not NULL as conflicting
+ (select invalidation_reason is not NULL as conflicting
from pg_replication_slots WHERE slot_type = 'logical')]),
'f',
'Logical slots are reported as non conflicting');
@@ -719,7 +719,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('pruning_', 'rows_removed');
$handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
@@ -763,7 +763,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
-# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+# Verify invalidation_reason is 'wal_level_insufficient' in pg_replication_slots
check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
$handle =
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index abc944e8b8..022f9bccb0 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,9 +1473,9 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.conflict_reason,
+ l.invalidation_reason,
l.failover
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, invalidation_reason, failover)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
Hi,
On Wed, Feb 07, 2024 at 12:22:07AM +0530, Bharath Rupireddy wrote:
> On Mon, Feb 5, 2024 at 3:15 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> > I'm not sure I like the fact that "invalidations" and "conflicts" are merged
> > into a single field. I'd vote to keep conflict_reason as it is and add a new
> > invalidation_reason (and put "conflict" as value when it is the case). The reason
> > is that I think they are 2 different concepts (could be linked though) and that
> > it would be easier to check for conflicts (means conflict_reason is not NULL).
>
> So, do you want conflict_reason for only logical slots, and a separate
> column for invalidation_reason for both logical and physical slots?
Yes, with "conflict" as value in case of conflicts (and one would need to refer
to the conflict_reason reason to see the reason).
> Is there any strong reason to have two properties "conflict" and
> "invalidated" for slots?
I think "conflict" is an important topic and does contain several reasons. The
slot "first" conflict and then leads to slot "invalidation".
> They both are the same internally, so why
> confuse the users?
I don't think that would confuse the users; I do think it would be easier to
check for conflicting slots.
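To make that concrete, the kind of check I have in mind would look roughly like
this (just a sketch, assuming conflict_reason is kept as is and a new
invalidation_reason column is added next to it):

-- conflicting logical slots: still a simple NULL test on conflict_reason
SELECT slot_name FROM pg_replication_slots WHERE conflict_reason IS NOT NULL;

-- invalidated slots of any kind (physical or logical), whatever the cause
SELECT slot_name, invalidation_reason FROM pg_replication_slots
WHERE invalidation_reason IS NOT NULL;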
I did not look closely at the code, just played a bit with the patch and was able
to produce something like:
postgres=# select slot_name,slot_type,active,active_pid,wal_status,invalidation_reason from pg_replication_slots;
  slot_name  | slot_type | active | active_pid | wal_status | invalidation_reason
-------------+-----------+--------+------------+------------+---------------------
 rep1        | physical  | f      |            | reserved   |
 master_slot | physical  | t      |    1482441 | unreserved | wal_removed
(2 rows)
Does it make sense to have an "active/working" slot "invalidated"?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Feb 9, 2024 at 1:12 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
I think "conflict" is an important topic and does contain several reasons. The
slot "first" conflict and then leads to slot "invalidation".They both are the same internally, so why
confuse the users?I don't think that would confuse the users, I do think that would be easier to
check for conflicting slots.
I've added a separate column for invalidation reasons for now. I'll
see how others think on this as the time goes by.
> I did not look closely at the code, just played a bit with the patch and was able
> to produce something like:
>
> postgres=# select slot_name,slot_type,active,active_pid,wal_status,invalidation_reason from pg_replication_slots;
>   slot_name  | slot_type | active | active_pid | wal_status | invalidation_reason
> -------------+-----------+--------+------------+------------+---------------------
>  rep1        | physical  | f      |            | reserved   |
>  master_slot | physical  | t      |    1482441 | unreserved | wal_removed
> (2 rows)
>
> Does it make sense to have an "active/working" slot "invalidated"?
Thanks. Can you please provide the steps to generate this error? Are
you setting max_slot_wal_keep_size on primary to generate
"wal_removed"?
Attached v5 patch set after rebasing and addressing review comments.
Please review it further.
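As a side note for reviewers, the monitoring queries that the new columns in
0003 are meant to enable would look roughly like this (a sketch; the one-day
threshold is arbitrary):

-- slots that have been lying inactive for more than a day
SELECT slot_name, last_inactive_at, inactive_count
FROM pg_replication_slots
WHERE last_inactive_at IS NOT NULL
  AND last_inactive_at < now() - interval '1 day';

-- slots that go inactive unusually often
SELECT slot_name, inactive_count
FROM pg_replication_slots
ORDER BY inactive_count DESC;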
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v5-0001-Track-invalidation_reason-in-pg_replication_slots.patch (application/octet-stream)
From 7aac2652261475cce0dcffd359f17bdbc25ac418 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 20 Feb 2024 05:40:44 +0000
Subject: [PATCH v5 1/4] Track invalidation_reason in pg_replication_slots
Currently the reason for replication slot invalidation is not
tracked in pg_replication_slots. A recent commit 007693f2a added
conflict_reason to show the reasons for slot invalidation, but
only for logical slots. This commit adds invalidation_reason to
pg_replication_slots to show invalidation reasons for both
physical and logical slots.
---
doc/src/sgml/system-views.sgml | 32 ++++++++++++++++++++++++++++
src/backend/catalog/system_views.sql | 3 ++-
src/backend/replication/slotfuncs.c | 21 +++++++++++++++++-
src/include/catalog/pg_proc.dat | 6 +++---
src/test/regress/expected/rules.out | 5 +++--
5 files changed, 60 insertions(+), 7 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be90edd0e2..cce88c14bb 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2581,6 +2581,38 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>invalidation_reason</structfield> <type>text</type>
+ </para>
+ <para>
+ The reason for the slot's invalidation. <literal>NULL</literal> if the
+ slot is currently actively being used. The non-NULL values indicate that
+ the slot is marked as invalidated. Possible values are:
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ <literal>wal_removed</literal> means that the required WAL has been
+ removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>rows_removed</literal> means that the required rows have
+ been removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>wal_level_insufficient</literal> means that the
+ primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
+ perform logical decoding.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 04227a72d1..c39f0d73d3 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1025,7 +1025,8 @@ CREATE VIEW pg_replication_slots AS
L.two_phase,
L.conflict_reason,
L.failover,
- L.synced
+ L.synced,
+ L.invalidation_reason
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index d2fa5e669a..472248e569 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 17
+#define PG_GET_REPLICATION_SLOTS_COLS 18
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -437,6 +437,25 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.synced);
+ switch (slot_contents.data.invalidated)
+ {
+ case RS_INVAL_NONE:
+ nulls[i++] = true;
+ break;
+
+ case RS_INVAL_WAL_REMOVED:
+ values[i++] = CStringGetTextDatum(SLOT_INVAL_WAL_REMOVED_TEXT);
+ break;
+
+ case RS_INVAL_HORIZON:
+ values[i++] = CStringGetTextDatum(SLOT_INVAL_HORIZON_TEXT);
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ values[i++] = CStringGetTextDatum(SLOT_INVAL_WAL_LEVEL_TEXT);
+ break;
+ }
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9c120fc2b7..a6bfc36426 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11127,9 +11127,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool,text}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced,invalidation_reason}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index b7488d760e..0646874236 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1475,8 +1475,9 @@ pg_replication_slots| SELECT l.slot_name,
l.two_phase,
l.conflict_reason,
l.failover,
- l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced)
+ l.synced,
+ l.invalidation_reason
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced, invalidation_reason)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v5-0002-Add-XID-based-replication-slot-invalidation.patch (application/octet-stream)
From cdd1fcb57b757c2f46eb3c398a29bcde47761f6e Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 20 Feb 2024 05:43:58 +0000
Subject: [PATCH v5 2/4] Add XID based replication slot invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky. Because the amount of WAL a
customer generates, and their allocated storage will vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set an XID age (age of
slot's xmin or catalog_xmin) of say 1 or 1.5 billion, after which
the slots get invalidated.
To achieve the above, postgres uses replication slot xmin (the
oldest transaction that this slot needs the database to retain) or
catalog_xmin (the oldest transaction affecting the system catalogs
that this slot needs the database to retain), and a new GUC
max_slot_xid_age. The checkpointer then looks at all replication
slots, invalidating those whose xmin or catalog_xmin has reached
the configured age.
---
doc/src/sgml/config.sgml | 21 ++++
src/backend/access/transam/xlog.c | 10 ++
src/backend/replication/slot.c | 43 +++++++
src/backend/replication/slotfuncs.c | 8 ++
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 4 +
src/test/recovery/t/050_invalidate_slots.pl | 108 ++++++++++++++++++
8 files changed, 205 insertions(+)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index ffd711b7f2..6c1c5f421f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4405,6 +4405,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age">
+ <term><varname>max_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (which is the default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 50c347a679..5fe76c46a1 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7159,6 +7159,11 @@ CreateCheckPoint(int flags)
if (PriorRedoPtr != InvalidXLogRecPtr)
UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7603,6 +7608,11 @@ CreateRestartPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index a142855bd3..67d6bd849e 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -102,6 +102,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variable */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int max_slot_xid_age = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
@@ -1416,6 +1417,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_XID_AGE:
+ appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1532,6 +1536,42 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
conflict = cause;
break;
+ case RS_INVAL_XID_AGE:
+ {
+ TransactionId xid_cur = ReadNextTransactionId();
+ TransactionId xid_limit;
+ TransactionId xid_slot;
+
+ if (TransactionIdIsNormal(s->data.xmin))
+ {
+ xid_slot = s->data.xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ if (TransactionIdIsNormal(s->data.catalog_xmin))
+ {
+ xid_slot = s->data.catalog_xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1686,6 +1726,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -2304,6 +2345,8 @@ GetSlotInvalidationCause(char *conflict_reason)
return RS_INVAL_HORIZON;
else if (strcmp(conflict_reason, SLOT_INVAL_WAL_LEVEL_TEXT) == 0)
return RS_INVAL_WAL_LEVEL;
+ else if (strcmp(conflict_reason, SLOT_INVAL_XID_AGED_TEXT) == 0)
+ return RS_INVAL_XID_AGE;
else
Assert(0);
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 472248e569..7c1145bb75 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -430,6 +430,10 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
case RS_INVAL_WAL_LEVEL:
values[i++] = CStringGetTextDatum(SLOT_INVAL_WAL_LEVEL_TEXT);
break;
+
+ case RS_INVAL_XID_AGE:
+ values[i++] = CStringGetTextDatum(SLOT_INVAL_XID_AGED_TEXT);
+ break;
}
}
@@ -454,6 +458,10 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
case RS_INVAL_WAL_LEVEL:
values[i++] = CStringGetTextDatum(SLOT_INVAL_WAL_LEVEL_TEXT);
break;
+
+ case RS_INVAL_XID_AGE:
+ values[i++] = CStringGetTextDatum(SLOT_INVAL_XID_AGED_TEXT);
+ break;
}
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 70652f0a3f..298bb8ea85 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2913,6 +2913,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &max_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e10755972a..96fe198c23 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -325,6 +325,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#max_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index e706ca834c..8216c35481 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -50,6 +50,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
/*
@@ -59,6 +61,7 @@ typedef enum ReplicationSlotInvalidationCause
#define SLOT_INVAL_WAL_REMOVED_TEXT "wal_removed"
#define SLOT_INVAL_HORIZON_TEXT "rows_removed"
#define SLOT_INVAL_WAL_LEVEL_TEXT "wal_level_insufficient"
+#define SLOT_INVAL_XID_AGED_TEXT "xid_aged"
/*
* On-Disk data of a replication slot, preserved across restarts.
@@ -229,6 +232,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
+extern PGDLLIMPORT int max_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..2f482b56e8
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,108 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Initialize primary node, setting wal-segsize to 1MB
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 1, extra => ['--wal-segsize=1']);
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+});
+$primary->start;
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb1_slot');
+]);
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby1->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+$standby1->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb1_slot';
+]) or die "Timed out waiting for slot xmin to advance";
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop the standby to make the replication slot's xmin on the primary age
+$standby1->stop;
+
+my $logstart = -s $primary->logfile;
+
+# Do some work to advance xmin
+$primary->safe_psql(
+ 'postgres', q{
+do $$
+begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into tab_int values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+end$$;
+});
+
+my $invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb1_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb1_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'xid_aged';
+])
+ or die
+ "Timed out while waiting for replication slot sb1_slot to be invalidated";
+
+done_testing();
--
2.34.1
v5-0003-Track-inactive-replication-slot-information.patch (application/octet-stream)
From 4f8e82845a03015f5870933e7712fde5a10ebdfd Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 20 Feb 2024 05:44:19 +0000
Subject: [PATCH v5 3/4] Track inactive replication slot information
Currently postgres doesn't track metrics like the time at which
the slot became inactive, and the total number of times the slot
became inactive in its lifetime. This commit adds two new metrics
last_inactive_at of type timestamptz and inactive_count of type numeric
to ReplicationSlotPersistentData. Whenever a slot becomes
inactive, the current timestamp and inactive count are persisted
to disk.
These metrics are useful in the following ways:
- To improve replication slot monitoring tools. For instance, one
can build a monitoring tool that signals a) when a replication slot
is lying inactive for a day or so using the last_inactive_at metric,
b) when a replication slot is becoming inactive too frequently
using the inactive_count metric.
- To implement timeout-based inactive replication slot management
capability in postgres.
Increases SLOT_VERSION due to the two newly added metrics.
---
doc/src/sgml/system-views.sgml | 20 +++++++++++++
src/backend/catalog/system_views.sql | 4 ++-
src/backend/replication/slot.c | 43 ++++++++++++++++++++++------
src/backend/replication/slotfuncs.c | 15 +++++++++-
src/include/catalog/pg_proc.dat | 6 ++--
src/include/replication/slot.h | 6 ++++
src/test/regress/expected/rules.out | 6 ++--
7 files changed, 84 insertions(+), 16 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index cce88c14bb..0dfd472b02 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2771,6 +2771,26 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
ID of role
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_at</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_count</structfield> <type>numeric</type>
+ </para>
+ <para>
+ The total number of times the slot became inactive in its lifetime.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index c39f0d73d3..a5a78a9910 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1026,7 +1026,9 @@ CREATE VIEW pg_replication_slots AS
L.conflict_reason,
L.failover,
L.synced,
- L.invalidation_reason
+ L.invalidation_reason,
+ L.last_inactive_at,
+ L.inactive_count
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 67d6bd849e..ce51c6d909 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -91,7 +91,7 @@ typedef struct ReplicationSlotOnDisk
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 5 /* version for new files */
+#define SLOT_VERSION 6 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -346,6 +346,8 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
slot->data.synced = synced;
+ slot->data.last_inactive_at = 0;
+ slot->data.inactive_count = 0;
/* and then data only present in shared memory */
slot->just_dirtied = false;
@@ -572,6 +574,17 @@ retry:
if (am_walsender)
{
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->data.last_inactive_at = 0;
+ SpinLockRelease(&s->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
SlotIsLogical(s)
? errmsg("acquired logical replication slot \"%s\"",
@@ -639,16 +652,20 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
- MyReplicationSlot = NULL;
-
- /* might not have been set when we've been a plain slot */
- LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
- ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
- LWLockRelease(ProcArrayLock);
-
if (am_walsender)
{
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->data.last_inactive_at = GetCurrentTimestamp();
+ slot->data.inactive_count++;
+ SpinLockRelease(&slot->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
is_logical
? errmsg("released logical replication slot \"%s\"",
@@ -658,6 +675,14 @@ ReplicationSlotRelease(void)
pfree(slotname);
}
+
+ MyReplicationSlot = NULL;
+
+ /* might not have been set when we've been a plain slot */
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
+ ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
+ LWLockRelease(ProcArrayLock);
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 7c1145bb75..6a12db27fd 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,10 +239,11 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 18
+#define PG_GET_REPLICATION_SLOTS_COLS 20
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
+ char buf[256];
/*
* We don't require any special permission to see this function's data
@@ -464,6 +465,18 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
break;
}
+ if (slot_contents.data.last_inactive_at > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.data.last_inactive_at);
+ else
+ nulls[i++] = true;
+
+ /* Convert to numeric. */
+ snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count);
+ values[i++] = DirectFunctionCall3(numeric_in,
+ CStringGetDatum(buf),
+ ObjectIdGetDatum(0),
+ Int32GetDatum(-1));
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index a6bfc36426..3d4ace624e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11127,9 +11127,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool,text}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced,invalidation_reason}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool,text,timestamptz,numeric}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced,invalidation_reason,last_inactive_at,inactive_count}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 8216c35481..87a56aa28a 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -133,6 +133,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* When did this slot become inactive last time? */
+ TimestampTz last_inactive_at;
+
+ /* How many times the slot has been inactive? */
+ uint64 inactive_count;
} ReplicationSlotPersistentData;
/*
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 0646874236..35fcf9d3d0 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1476,8 +1476,10 @@ pg_replication_slots| SELECT l.slot_name,
l.conflict_reason,
l.failover,
l.synced,
- l.invalidation_reason
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced, invalidation_reason)
+ l.invalidation_reason,
+ l.last_inactive_at,
+ l.inactive_count
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced, invalidation_reason, last_inactive_at, inactive_count)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v5-0004-Add-inactive_timeout-based-replication-slot-inval.patch (application/octet-stream)
From 598e036d72f47375b1071c01b2de546dea5c1681 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 20 Feb 2024 05:45:54 +0000
Subject: [PATCH v5 4/4] Add inactive_timeout based replication slot
invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set a timeout of, say,
1, 2 or 3 days, after which the inactive slots get invalidated.
To achieve the above, postgres uses the replication slot metric
last_inactive_at (the time at which the slot became inactive) and a
new GUC inactive_replication_slot_timeout. The checkpointer then
scans all replication slots, invalidating the inactive ones that
have exceeded the configured timeout.
---
doc/src/sgml/config.sgml | 18 +++++
src/backend/access/transam/xlog.c | 10 +++
src/backend/replication/slot.c | 21 +++++
src/backend/replication/slotfuncs.c | 8 ++
src/backend/utils/misc/guc_tables.c | 12 +++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 4 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 79 +++++++++++++++++++
9 files changed, 154 insertions(+)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 6c1c5f421f..4904541607 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4426,6 +4426,24 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-inactive-replication-slot-timeout" xreflabel="inactive_replication_slot_timeout">
+ <term><varname>inactive_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>inactive_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time at the next checkpoint. If this value is specified
+ without units, it is taken as seconds. A value of zero (which is
+ default) disables the timeout mechanism. This parameter can only be
+ set in the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 5fe76c46a1..ad8786c9f7 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7164,6 +7164,11 @@ CreateCheckPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7613,6 +7618,11 @@ CreateRestartPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index ce51c6d909..aef027e4f6 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -103,6 +103,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
int max_slot_xid_age = 0;
+int inactive_replication_slot_timeout = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
@@ -1445,6 +1446,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_XID_AGE:
appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by inactive_replication_slot_timeout."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1597,6 +1601,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
}
}
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (s->data.last_inactive_at > 0)
+ {
+ TimestampTz now;
+
+ Assert(s->data.persistency == RS_PERSISTENT);
+ Assert(s->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(s->data.last_inactive_at, now,
+ inactive_replication_slot_timeout * 1000))
+ conflict = cause;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1752,6 +1770,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -2372,6 +2391,8 @@ GetSlotInvalidationCause(char *conflict_reason)
return RS_INVAL_WAL_LEVEL;
else if (strcmp(conflict_reason, SLOT_INVAL_XID_AGED_TEXT) == 0)
return RS_INVAL_XID_AGE;
+ else if (strcmp(conflict_reason, SLOT_INVAL_INACTIVE_TIMEOUT) == 0)
+ return RS_INVAL_INACTIVE_TIMEOUT;
else
Assert(0);
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 6a12db27fd..b5f077f9f3 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -435,6 +435,10 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
case RS_INVAL_XID_AGE:
values[i++] = CStringGetTextDatum(SLOT_INVAL_XID_AGED_TEXT);
break;
+
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ values[i++] = CStringGetTextDatum(SLOT_INVAL_INACTIVE_TIMEOUT);
+ break;
}
}
@@ -463,6 +467,10 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
case RS_INVAL_XID_AGE:
values[i++] = CStringGetTextDatum(SLOT_INVAL_XID_AGED_TEXT);
break;
+
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ values[i++] = CStringGetTextDatum(SLOT_INVAL_INACTIVE_TIMEOUT);
+ break;
}
if (slot_contents.data.last_inactive_at > 0)
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 298bb8ea85..28780b8c87 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2923,6 +2923,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"inactive_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &inactive_replication_slot_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 96fe198c23..ccb79e5a67 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -252,6 +252,7 @@
#recovery_prefetch = try # prefetch pages referenced in the WAL?
#wal_decode_buffer_size = 512kB # lookahead window used for prefetching
# (change requires restart)
+#inactive_replication_slot_timeout = 0 # in seconds; 0 disables
# - Archiving -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 87a56aa28a..c5174f7c8d 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -52,6 +52,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* slot's xmin or catalog_xmin has reached the age */
RS_INVAL_XID_AGE,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
/*
@@ -62,6 +64,7 @@ typedef enum ReplicationSlotInvalidationCause
#define SLOT_INVAL_HORIZON_TEXT "rows_removed"
#define SLOT_INVAL_WAL_LEVEL_TEXT "wal_level_insufficient"
#define SLOT_INVAL_XID_AGED_TEXT "xid_aged"
+#define SLOT_INVAL_INACTIVE_TIMEOUT "inactive_timeout"
/*
* On-Disk data of a replication slot, preserved across restarts.
@@ -239,6 +242,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT int max_slot_xid_age;
+extern PGDLLIMPORT int inactive_replication_slot_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index bf087ac2a9..e07b941d73 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -46,6 +46,7 @@ tests += {
't/038_save_logical_slots_shutdown.pl',
't/039_end_of_wal.pl',
't/040_standby_failover_slots_sync.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index 2f482b56e8..4c66dd4a4e 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -105,4 +105,83 @@ $primary->poll_query_until(
or die
"Timed out while waiting for replication slot sb1_slot to be invalidated";
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb2_slot');
+]);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 0;
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+$standby2->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+});
+$standby2->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+# The inactive replication slot info should be null when the slot is active
+my $result = $primary->safe_psql(
+ 'postgres', qq[
+ SELECT last_inactive_at IS NULL, inactive_count = 0 AS OK
+ FROM pg_replication_slots WHERE slot_name = 'sb2_slot';
+]);
+is($result, "t|t",
+ 'check the inactive replication slot info for an active slot');
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET inactive_replication_slot_timeout TO '1s';
+]);
+$primary->reload;
+
+$logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby2->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_at IS NOT NULL AND
+ inactive_count = 1 AND slot_name = 'sb2_slot';
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+$invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb2_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb2_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb2_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for inactive replication slot sb2_slot to be invalidated";
+
done_testing();
--
2.34.1
On Tue, Feb 20, 2024 at 12:05 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
[...] and was able to produce something like:
postgres=# select slot_name,slot_type,active,active_pid,wal_status,invalidation_reason from pg_replication_slots;
slot_name | slot_type | active | active_pid | wal_status | invalidation_reason
-------------+-----------+--------+------------+------------+---------------------
rep1 | physical | f | | reserved |
master_slot | physical | t | 1482441 | unreserved | wal_removed
(2 rows)
does that make sense to have an "active/working" slot "invalidated"?
Thanks. Can you please provide the steps to generate this error? Are
you setting max_slot_wal_keep_size on primary to generate
"wal_removed"?
I'm able to reproduce [1] the state [2] where the slot got invalidated
first, then its wal_status became unreserved, but the slot keeps
serving once the standby comes back online and catches up with the
primary by fetching the WAL files from the archive. There's a good reason
for this state -
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/slotfuncs.c;h=d2fa5e669a32f19989b0d987d3c7329851a1272e;hb=ff9e1e764fcce9a34467d614611a34d4d2a91b50#l351.
This intermittent state can only happen for physical slots, not for
logical slots because logical subscribers can't get the missing
changes from the WAL stored in the archive.
And the upshot seems to be that an invalidated slot can never become
normal again, but it can still serve a standby if the standby is able
to catch up by fetching the required WAL (the WAL the slot couldn't
keep for the standby) from elsewhere (the archive, via restore_command).
As far as the 0001 patch is concerned, it reports the
invalidation_reason as long as slot_contents.data.invalidated !=
RS_INVAL_NONE. I think this is okay.
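For instance, with 0001 applied, a monitoring query along these lines
(just a sketch; the invalidation_reason column comes from the 0001
patch, everything else is standard pg_replication_slots) can flag
slots that are already marked invalidated but still have a walsender
attached:

-- list invalidated slots that still show as active
SELECT slot_name, slot_type, active, active_pid, wal_status, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason IS NOT NULL
  AND active;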
Thoughts?
[1]:
./initdb -D db17
echo "max_wal_size = 128MB
max_slot_wal_keep_size = 64MB
archive_mode = on
archive_command='cp %p
/home/ubuntu/postgres/pg17/bin/archived_wal/%f'" | tee -a
db17/postgresql.conf
./pg_ctl -D db17 -l logfile17 start
./psql -d postgres -p 5432 -c "SELECT
pg_create_physical_replication_slot('sb_repl_slot', true, false);"
rm -rf sbdata logfilesbdata
./pg_basebackup -D sbdata
echo "port=5433
primary_conninfo='host=localhost port=5432 dbname=postgres user=ubuntu'
primary_slot_name='sb_repl_slot'
restore_command='cp /home/ubuntu/postgres/pg17/bin/archived_wal/%f
%p'" | tee -a sbdata/postgresql.conf
touch sbdata/standby.signal
./pg_ctl -D sbdata -l logfilesbdata start
./psql -d postgres -p 5433 -c "SELECT pg_is_in_recovery();"
./pg_ctl -D sbdata -l logfilesbdata stop
./psql -d postgres -p 5432 -c "SELECT pg_logical_emit_message(true,
'mymessage', repeat('aaaa', 10000000));"
./psql -d postgres -p 5432 -c "CHECKPOINT;"
./pg_ctl -D sbdata -l logfilesbdata start
./psql -d postgres -p 5432 -xc "SELECT * FROM pg_replication_slots;"
[2]:
postgres=# SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]-------+-------------
slot_name | sb_repl_slot
plugin |
slot_type | physical
datoid |
database |
temporary | f
active | t
active_pid | 710667
xmin |
catalog_xmin |
restart_lsn | 0/115D21A0
confirmed_flush_lsn |
wal_status | unreserved
safe_wal_size | 77782624
two_phase | f
conflict_reason |
failover | f
synced | f
invalidation_reason | wal_removed
last_inactive_at |
inactive_count | 1
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Wed, Feb 21, 2024 at 10:55:00AM +0530, Bharath Rupireddy wrote:
On Tue, Feb 20, 2024 at 12:05 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
[...] and was able to produce something like:
postgres=# select slot_name,slot_type,active,active_pid,wal_status,invalidation_reason from pg_replication_slots;
slot_name | slot_type | active | active_pid | wal_status | invalidation_reason
-------------+-----------+--------+------------+------------+---------------------
rep1 | physical | f | | reserved |
master_slot | physical | t | 1482441 | unreserved | wal_removed
(2 rows)
does that make sense to have an "active/working" slot "invalidated"?
Thanks. Can you please provide the steps to generate this error? Are
you setting max_slot_wal_keep_size on primary to generate
"wal_removed"?I'm able to reproduce [1] the state [2] where the slot got invalidated
first, then its wal_status became unreserved, but still the slot is
serving after the standby comes up online after it catches up with the
primary getting the WAL files from the archive. There's a good reason
for this state -
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/slotfuncs.c;h=d2fa5e669a32f19989b0d987d3c7329851a1272e;hb=ff9e1e764fcce9a34467d614611a34d4d2a91b50#l351.
This intermittent state can only happen for physical slots, not for
logical slots because logical subscribers can't get the missing
changes from the WAL stored in the archive.
And the upshot seems to be that an invalidated slot can never become
normal again, but it can still serve a standby if the standby is able
to catch up by fetching the required WAL (the WAL the slot couldn't
keep for the standby) from elsewhere (the archive, via restore_command).
As far as the 0001 patch is concerned, it reports the
invalidation_reason as long as slot_contents.data.invalidated !=
RS_INVAL_NONE. I think this is okay.
Thoughts?
Yeah, looking at the code I agree that looks ok. OTOH, that looks confusing,
maybe we should add a few words about it in the doc?
Looking at v5-0001:
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>invalidation_reason</structfield> <type>text</type>
+ </para>
+ <para>
My initial thought was to put "conflict" value in this new field in case of
conflict (not to mention the conflict reason in it). With the current proposal
invalidation_reason could report the same as conflict_reason, which sounds weird
to me.
Does that make sense to you to use "conflict" as value in "invalidation_reason"
when the slot has "conflict_reason" not NULL?
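To make that concrete, here's a hypothetical example (the slot name
and values are made up, not from a real run) of what the two columns
could show for a conflicting logical slot under each option:

SELECT slot_name, conflict_reason, invalidation_reason
FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';
-- with the current proposal: lsub1_slot | rows_removed | rows_removed
-- with "conflict" as value:  lsub1_slot | rows_removed | conflict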
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Feb 21, 2024 at 5:55 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
As far as the 0001 patch is concerned, it reports the
invalidation_reason as long as slot_contents.data.invalidated !=
RS_INVAL_NONE. I think this is okay.
Thoughts?
Yeah, looking at the code I agree that looks ok. OTOH, that looks confusing,
maybe we should add a few words about it in the doc?
I'll think about it.
Looking at v5-0001:
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>invalidation_reason</structfield> <type>text</type>
+ </para>
+ <para>
My initial thought was to put "conflict" value in this new field in case of
conflict (not to mention the conflict reason in it). With the current proposal
invalidation_reason could report the same as conflict_reason, which sounds weird
to me.
Does that make sense to you to use "conflict" as value in "invalidation_reason"
when the slot has "conflict_reason" not NULL?
I'm thinking the other way around - how about we revert
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=007693f2a3ac2ac19affcb03ad43cdb36ccff5b5,
that is, put in place "conflict" as a boolean and introduce
invalidation_reason in text form. So, for logical slots, whenever the
"conflict" column is true, the reason is found in the invalidation_reason
column? How does it sound? Again the debate might be "conflict" vs
"invalidation", but that looks clean IMHO.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Wed, Feb 21, 2024 at 08:10:00PM +0530, Bharath Rupireddy wrote:
On Wed, Feb 21, 2024 at 5:55 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
My initial thought was to put "conflict" value in this new field in case of
conflict (not to mention the conflict reason in it). With the current proposal
invalidation_reason could report the same as conflict_reason, which sounds weird
to me.Does that make sense to you to use "conflict" as value in "invalidation_reason"
when the slot has "conflict_reason" not NULL?
I'm thinking the other way around - how about we revert
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=007693f2a3ac2ac19affcb03ad43cdb36ccff5b5,
that is, put in place "conflict" as a boolean and introduce
invalidation_reason in text form. So, for logical slots, whenever the
"conflict" column is true, the reason is found in the invalidation_reason
column? How does it sound?
Yeah, I think that looks fine too. We would need more changes (like taking care
of ddd5f4f54a, for example).
CC'ing Amit, Hou-San and Shveta to get their point of view (as the ones behind
007693f2a3 and ddd5f4f54a).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Feb 22, 2024 at 1:44 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Does that make sense to you to use "conflict" as value in "invalidation_reason"
when the slot has "conflict_reason" not NULL?
I'm thinking the other way around - how about we revert
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=007693f2a3ac2ac19affcb03ad43cdb36ccff5b5,
that is, put in place "conflict" as a boolean and introduce
invalidation_reason in text form. So, for logical slots, whenever the
"conflict" column is true, the reason is found in the invalidation_reason
column? How does it sound?
Yeah, I think that looks fine too. We would need more changes (like taking care
of ddd5f4f54a, for example).
CC'ing Amit, Hou-San and Shveta to get their point of view (as the ones behind
007693f2a3 and ddd5f4f54a).
Yeah, let's wait for what others think about it.
FWIW, I've had to rebase the patches due to 943f7ae1c. Please see the
attached v6 patch set.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v6-0001-Track-invalidation_reason-in-pg_replication_slots.patch (application/octet-stream)
From 88db89c46d43bbe76daac70dd517be5a797665c3 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 29 Feb 2024 18:22:45 +0000
Subject: [PATCH v6 1/4] Track invalidation_reason in pg_replication_slots
Currently the reason for replication slot invalidation is not
tracked in pg_replication_slots. A recent commit 007693f2a added
conflict_reason to show the reasons for slot invalidation, but
only for logical slots. This commit adds invalidation_reason to
pg_replication_slots to show invalidation reasons for both
physical and logical slots.
---
doc/src/sgml/system-views.sgml | 32 ++++++++++++++++++++++++++++
src/backend/catalog/system_views.sql | 3 ++-
src/backend/replication/slotfuncs.c | 12 ++++++++---
src/include/catalog/pg_proc.dat | 6 +++---
src/test/regress/expected/rules.out | 5 +++--
5 files changed, 49 insertions(+), 9 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be90edd0e2..cce88c14bb 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2581,6 +2581,38 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>invalidation_reason</structfield> <type>text</type>
+ </para>
+ <para>
+ The reason for the slot's invalidation. <literal>NULL</literal> if the
+ slot is currently actively being used. The non-NULL values indicate that
+ the slot is marked as invalidated. Possible values are:
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ <literal>wal_removed</literal> means that the required WAL has been
+ removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>rows_removed</literal> means that the required rows have
+ been removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>wal_level_insufficient</literal> means that the
+ primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
+ perform logical decoding.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 04227a72d1..c39f0d73d3 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1025,7 +1025,8 @@ CREATE VIEW pg_replication_slots AS
L.two_phase,
L.conflict_reason,
L.failover,
- L.synced
+ L.synced,
+ L.invalidation_reason
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 768a304723..a7a250b7c5 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 17
+#define PG_GET_REPLICATION_SLOTS_COLS 18
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -263,6 +263,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
bool nulls[PG_GET_REPLICATION_SLOTS_COLS];
WALAvailability walstate;
int i;
+ ReplicationSlotInvalidationCause cause;
if (!slot->in_use)
continue;
@@ -409,12 +410,12 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
+ cause = slot_contents.data.invalidated;
+
if (slot_contents.data.database == InvalidOid)
nulls[i++] = true;
else
{
- ReplicationSlotInvalidationCause cause = slot_contents.data.invalidated;
-
if (cause == RS_INVAL_NONE)
nulls[i++] = true;
else
@@ -425,6 +426,11 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.synced);
+ if (cause == RS_INVAL_NONE)
+ nulls[i++] = true;
+ else
+ values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9c120fc2b7..a6bfc36426 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11127,9 +11127,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool,text}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced,invalidation_reason}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 0cd2c64fca..e77bb36afe 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1475,8 +1475,9 @@ pg_replication_slots| SELECT l.slot_name,
l.two_phase,
l.conflict_reason,
l.failover,
- l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced)
+ l.synced,
+ l.invalidation_reason
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced, invalidation_reason)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v6-0002-Add-XID-based-replication-slot-invalidation.patch (application/octet-stream)
From 17281892a283eab27ac32dd426a0cdfecc0b7ee1 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 29 Feb 2024 18:28:35 +0000
Subject: [PATCH v6 2/4] Add XID based replication slot invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set an XID age (age of
the slot's xmin or catalog_xmin) of, say, 1 or 1.5 billion, after
which the slots get invalidated.
To achieve the above, postgres uses replication slot xmin (the
oldest transaction that this slot needs the database to retain) or
catalog_xmin (the oldest transaction affecting the system catalogs
that this slot needs the database to retain), and a new GUC
max_slot_xid_age. The checkpointer then scans all replication
slots, invalidating those whose xmin or catalog_xmin has crossed
the configured age.
---
doc/src/sgml/config.sgml | 21 ++++
src/backend/access/transam/xlog.c | 10 ++
src/backend/replication/slot.c | 44 ++++++-
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 108 ++++++++++++++++++
7 files changed, 196 insertions(+), 1 deletion(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 43b1a132a2..de40d6237c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4544,6 +4544,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age">
+ <term><varname>max_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (which is default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index c1162d55bf..4bfd07a408 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7159,6 +7159,11 @@ CreateCheckPoint(int flags)
if (PriorRedoPtr != InvalidXLogRecPtr)
UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7603,6 +7608,11 @@ CreateRestartPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 0f173f63a2..324c9d2398 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -85,10 +85,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_XID_AGE] = "xid_aged",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -118,6 +119,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variable */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int max_slot_xid_age = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
@@ -1446,6 +1448,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_XID_AGE:
+ appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1562,6 +1567,42 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
conflict = cause;
break;
+ case RS_INVAL_XID_AGE:
+ {
+ TransactionId xid_cur = ReadNextTransactionId();
+ TransactionId xid_limit;
+ TransactionId xid_slot;
+
+ if (TransactionIdIsNormal(s->data.xmin))
+ {
+ xid_slot = s->data.xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ if (TransactionIdIsNormal(s->data.catalog_xmin))
+ {
+ xid_slot = s->data.catalog_xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1716,6 +1757,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 93ded31ed9..e5c71591e1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2954,6 +2954,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &max_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index edcc0282b2..50019d7c25 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -334,6 +334,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#max_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index acbf567150..ad9fd1e94b 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -226,6 +228,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
+extern PGDLLIMPORT int max_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..2f482b56e8
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,108 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Initialize primary node, setting wal-segsize to 1MB
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 1, extra => ['--wal-segsize=1']);
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+});
+$primary->start;
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb1_slot');
+]);
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby1->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+$standby1->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb1_slot';
+]) or die "Timed out waiting for slot xmin to advance";
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop standby to make the replication slot's xmin on primary to age
+$standby1->stop;
+
+my $logstart = -s $primary->logfile;
+
+# Do some work to advance xmin
+$primary->safe_psql(
+ 'postgres', q{
+do $$
+begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into tab_int values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+end$$;
+});
+
+my $invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb1_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb1_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'xid_aged';
+])
+ or die
+ "Timed out while waiting for replication slot sb1_slot to be invalidated";
+
+done_testing();
--
2.34.1
v6-0003-Track-inactive-replication-slot-information.patch (application/octet-stream)
From 3a13b197aa926acaa3f916151a92073cc2e30fe0 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 29 Feb 2024 18:29:54 +0000
Subject: [PATCH v6 3/4] Track inactive replication slot information
Currently postgres doesn't track metrics like the time at which
the slot became inactive, and the total number of times the slot
became inactive in its lifetime. This commit adds two new metrics
last_inactive_at of type timestamptz and inactive_count of type numeric
to ReplicationSlotPersistentData. Whenever a slot becomes
inactive, the current timestamp and inactive count are persisted
to disk.
These metrics are useful in the following ways:
- To improve replication slot monitoring tools. For instance, one
can build a monitoring tool that signals a) when a replication slot
has been lying inactive for a day or so using the last_inactive_at
metric, and b) when a replication slot is becoming inactive too
frequently using the inactive_count metric.
- To implement timeout-based inactive replication slot management
capability in postgres.
Increases SLOT_VERSION due to the two new metrics added.
---
doc/src/sgml/system-views.sgml | 20 +++++++++++++
src/backend/catalog/system_views.sql | 4 ++-
src/backend/replication/slot.c | 43 ++++++++++++++++++++++------
src/backend/replication/slotfuncs.c | 15 +++++++++-
src/include/catalog/pg_proc.dat | 6 ++--
src/include/replication/slot.h | 6 ++++
src/test/regress/expected/rules.out | 6 ++--
7 files changed, 84 insertions(+), 16 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index cce88c14bb..0dfd472b02 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2771,6 +2771,26 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
ID of role
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_at</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_count</structfield> <type>numeric</type>
+ </para>
+ <para>
+ The total number of times the slot became inactive in its lifetime.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index c39f0d73d3..a5a78a9910 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1026,7 +1026,9 @@ CREATE VIEW pg_replication_slots AS
L.conflict_reason,
L.failover,
L.synced,
- L.invalidation_reason
+ L.invalidation_reason,
+ L.last_inactive_at,
+ L.inactive_count
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 324c9d2398..828f40cfca 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -108,7 +108,7 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 5 /* version for new files */
+#define SLOT_VERSION 6 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -363,6 +363,8 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
slot->data.synced = synced;
+ slot->data.last_inactive_at = 0;
+ slot->data.inactive_count = 0;
/* and then data only present in shared memory */
slot->just_dirtied = false;
@@ -589,6 +591,17 @@ retry:
if (am_walsender)
{
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->data.last_inactive_at = 0;
+ SpinLockRelease(&s->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
SlotIsLogical(s)
? errmsg("acquired logical replication slot \"%s\"",
@@ -656,16 +669,20 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
- MyReplicationSlot = NULL;
-
- /* might not have been set when we've been a plain slot */
- LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
- ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
- LWLockRelease(ProcArrayLock);
-
if (am_walsender)
{
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->data.last_inactive_at = GetCurrentTimestamp();
+ slot->data.inactive_count++;
+ SpinLockRelease(&slot->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
is_logical
? errmsg("released logical replication slot \"%s\"",
@@ -675,6 +692,14 @@ ReplicationSlotRelease(void)
pfree(slotname);
}
+
+ MyReplicationSlot = NULL;
+
+ /* might not have been set when we've been a plain slot */
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
+ ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
+ LWLockRelease(ProcArrayLock);
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index a7a250b7c5..3bb4e9223e 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,10 +239,11 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 18
+#define PG_GET_REPLICATION_SLOTS_COLS 20
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
+ char buf[256];
/*
* We don't require any special permission to see this function's data
@@ -431,6 +432,18 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
else
values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ if (slot_contents.data.last_inactive_at > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.data.last_inactive_at);
+ else
+ nulls[i++] = true;
+
+ /* Convert to numeric. */
+ snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count);
+ values[i++] = DirectFunctionCall3(numeric_in,
+ CStringGetDatum(buf),
+ ObjectIdGetDatum(0),
+ Int32GetDatum(-1));
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index a6bfc36426..3d4ace624e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11127,9 +11127,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool,text}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced,invalidation_reason}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool,text,timestamptz,numeric}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced,invalidation_reason,last_inactive_at,inactive_count}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index ad9fd1e94b..83b47425ea 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -129,6 +129,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* When did this slot become inactive last time? */
+ TimestampTz last_inactive_at;
+
+ /* How many times the slot has been inactive? */
+ uint64 inactive_count;
} ReplicationSlotPersistentData;
/*
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index e77bb36afe..b451c324f9 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1476,8 +1476,10 @@ pg_replication_slots| SELECT l.slot_name,
l.conflict_reason,
l.failover,
l.synced,
- l.invalidation_reason
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced, invalidation_reason)
+ l.invalidation_reason,
+ l.last_inactive_at,
+ l.inactive_count
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced, invalidation_reason, last_inactive_at, inactive_count)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
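As a usage sketch for the two columns added by the patch above (assuming the
patch is applied; the one-day threshold is only illustrative), a monitoring
query could look like:

    -- slots that have been sitting inactive for more than a day
    SELECT slot_name, last_inactive_at, inactive_count
    FROM pg_replication_slots
    WHERE last_inactive_at IS NOT NULL
      AND last_inactive_at < now() - interval '1 day';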
v6-0004-Add-inactive_timeout-based-replication-slot-inval.patch
From ff6dfbf63dce8fe103f68116f11be929cbd15eaf Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 29 Feb 2024 18:33:01 +0000
Subject: [PATCH v6 4/4] Add inactive_timeout based replication slot
invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and the storage they have allocated vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set a timeout of, say,
1, 2 or 3 days, after which the inactive slots get invalidated.
To achieve the above, postgres uses the replication slot metric
last_inactive_at (the time at which the slot last became inactive)
and a new GUC inactive_replication_slot_timeout. The checkpointer
then scans all replication slots and invalidates those that have
been inactive for longer than the configured timeout.
---
doc/src/sgml/config.sgml | 18 +++++
src/backend/access/transam/xlog.c | 10 +++
src/backend/replication/slot.c | 22 +++++-
src/backend/utils/misc/guc_tables.c | 12 +++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 79 +++++++++++++++++++
8 files changed, 145 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index de40d6237c..cb847ab8b1 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4565,6 +4565,24 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-inactive-replication-slot-timeout" xreflabel="inactive_replication_slot_timeout">
+ <term><varname>inactive_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>inactive_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time at the next checkpoint. If this value is specified
+ without units, it is taken as seconds. A value of zero (which is
+ default) disables the timeout mechanism. This parameter can only be
+ set in the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 4bfd07a408..2e7188f0d4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7164,6 +7164,11 @@ CreateCheckPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7613,6 +7618,11 @@ CreateRestartPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 828f40cfca..3e6e779094 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -86,10 +86,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
[RS_INVAL_XID_AGE] = "xid_aged",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -120,6 +121,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
int max_slot_xid_age = 0;
+int inactive_replication_slot_timeout = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
@@ -1476,6 +1478,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_XID_AGE:
appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by inactive_replication_slot_timeout."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1628,6 +1633,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
}
}
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (s->data.last_inactive_at > 0)
+ {
+ TimestampTz now;
+
+ Assert(s->data.persistency == RS_PERSISTENT);
+ Assert(s->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(s->data.last_inactive_at, now,
+ inactive_replication_slot_timeout * 1000))
+ conflict = cause;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1783,6 +1802,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index e5c71591e1..62ab048192 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2964,6 +2964,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"inactive_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &inactive_replication_slot_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 50019d7c25..092aaf1bec 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -261,6 +261,7 @@
#recovery_prefetch = try # prefetch pages referenced in the WAL?
#wal_decode_buffer_size = 512kB # lookahead window used for prefetching
# (change requires restart)
+#inactive_replication_slot_timeout = 0 # in seconds; 0 disables
# - Archiving -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 83b47425ea..708cbee324 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -55,6 +55,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* slot's xmin or catalog_xmin has reached the age */
RS_INVAL_XID_AGE,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -235,6 +237,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT int max_slot_xid_age;
+extern PGDLLIMPORT int inactive_replication_slot_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index bf087ac2a9..e07b941d73 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -46,6 +46,7 @@ tests += {
't/038_save_logical_slots_shutdown.pl',
't/039_end_of_wal.pl',
't/040_standby_failover_slots_sync.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index 2f482b56e8..4c66dd4a4e 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -105,4 +105,83 @@ $primary->poll_query_until(
or die
"Timed out while waiting for replication slot sb1_slot to be invalidated";
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb2_slot');
+]);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 0;
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+$standby2->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+});
+$standby2->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+# The inactive replication slot info should be null when the slot is active
+my $result = $primary->safe_psql(
+ 'postgres', qq[
+ SELECT last_inactive_at IS NULL, inactive_count = 0 AS OK
+ FROM pg_replication_slots WHERE slot_name = 'sb2_slot';
+]);
+is($result, "t|t",
+ 'check the inactive replication slot info for an active slot');
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET inactive_replication_slot_timeout TO '1s';
+]);
+$primary->reload;
+
+$logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby2->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_at IS NOT NULL AND
+ inactive_count = 1 AND slot_name = 'sb2_slot';
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+$invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb2_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb2_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb2_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for inactive replication slot sb2_slot to be invalidated";
+
done_testing();
--
2.34.1
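As a usage sketch for the timeout-based invalidation added by the patch above
(assuming the whole patch series is applied; the two-day value is only
illustrative, and the GUC only needs a reload to take effect):

    ALTER SYSTEM SET inactive_replication_slot_timeout = '2d';
    SELECT pg_reload_conf();

    -- after the next checkpoint, affected slots show up as invalidated
    SELECT slot_name, last_inactive_at, invalidation_reason
    FROM pg_replication_slots
    WHERE invalidation_reason = 'inactive_timeout';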
On Wed, Feb 21, 2024 at 08:10:00PM +0530, Bharath Rupireddy wrote:
I'm thinking the other way around - how about we revert
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=007693f2a3ac2ac19affcb03ad43cdb36ccff5b5,
that is, put in place "conflict" as a boolean and introduce
invalidation_reason in text form. So, for logical slots, whenever the
"conflict" column is true, the reason is found in the invalidation_reason
column? How does it sound? Again the debate might be "conflict" vs
"invalidation", but that looks clean IMHO.
Would you ever see "conflict" as false and "invalidation_reason" as
non-null for a logical slot?
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Sat, Mar 2, 2024 at 3:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
[....] how about we revert
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=007693f2a3ac2ac19affcb03ad43cdb36ccff5b5,
Would you ever see "conflict" as false and "invalidation_reason" as
non-null for a logical slot?
No. Because both conflict and invalidation_reason are decided based on
the invalidation reason i.e. value of slot_contents.data.invalidated.
IOW, a logical slot that reports conflict as true must have been
invalidated.
Do you have any thoughts on reverting 007693f and introducing
invalidation_reason?
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Sun, Mar 03, 2024 at 11:40:00PM +0530, Bharath Rupireddy wrote:
On Sat, Mar 2, 2024 at 3:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
Would you ever see "conflict" as false and "invalidation_reason" as
non-null for a logical slot?
No. Because both conflict and invalidation_reason are decided based on
the invalidation reason i.e. value of slot_contents.data.invalidated.
IOW, a logical slot that reports conflict as true must have been
invalidated.
Do you have any thoughts on reverting 007693f and introducing
invalidation_reason?
Unless I am misinterpreting some details, ISTM we could rename this column
to invalidation_reason and use it for both logical and physical slots. I'm
not seeing a strong need for another column. Perhaps I am missing
something...
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote:
On Sun, Mar 03, 2024 at 11:40:00PM +0530, Bharath Rupireddy wrote:
Do you have any thoughts on reverting 007693f and introducing
invalidation_reason?
Unless I am misinterpreting some details, ISTM we could rename this column
to invalidation_reason and use it for both logical and physical slots. I'm
not seeing a strong need for another column. Perhaps I am missing
something...
And also, please don't be hasty in taking a decision that would
involve a revert of 007693f without informing the committer of this
commit about that. I am adding Amit Kapila in CC of this thread for
awareness.
--
Michael
Hi,
On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote:
On Sun, Mar 03, 2024 at 11:40:00PM +0530, Bharath Rupireddy wrote:
On Sat, Mar 2, 2024 at 3:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
Would you ever see "conflict" as false and "invalidation_reason" as
non-null for a logical slot?
No. Because both conflict and invalidation_reason are decided based on
the invalidation reason i.e. value of slot_contents.data.invalidated.
IOW, a logical slot that reports conflict as true must have been
invalidated.
Do you have any thoughts on reverting 007693f and introducing
invalidation_reason?
Unless I am misinterpreting some details, ISTM we could rename this column
to invalidation_reason and use it for both logical and physical slots. I'm
not seeing a strong need for another column.
Yeah having two columns was more for convenience purpose. Without the "conflict"
one, a slot conflicting with recovery would be "a logical slot having a non NULL
invalidation_reason".
I'm also fine with one column if most of you prefer that way.
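For illustration, the single-column check described above could be written as
follows (a sketch, assuming conflict_reason is folded into a renamed
invalidation_reason column):

    -- logical slots conflicting with recovery, i.e. invalidated logical slots
    SELECT slot_name, invalidation_reason
    FROM pg_replication_slots
    WHERE slot_type = 'logical'
      AND invalidation_reason IS NOT NULL;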
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote:
On Sun, Mar 03, 2024 at 11:40:00PM +0530, Bharath Rupireddy wrote:
On Sat, Mar 2, 2024 at 3:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
Would you ever see "conflict" as false and "invalidation_reason" as
non-null for a logical slot?
No. Because both conflict and invalidation_reason are decided based on
the invalidation reason i.e. value of slot_contents.data.invalidated.
IOW, a logical slot that reports conflict as true must have been
invalidated.
Do you have any thoughts on reverting 007693f and introducing
invalidation_reason?
Unless I am misinterpreting some details, ISTM we could rename this column
to invalidation_reason and use it for both logical and physical slots. I'm
not seeing a strong need for another column.
Yeah having two columns was more for convenience purpose. Without the "conflict"
one, a slot conflicting with recovery would be "a logical slot having a non NULL
invalidation_reason".
I'm also fine with one column if most of you prefer that way.
While we debate on the above, please find the attached v7 patch set
after rebasing.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v7-0001-Track-invalidation_reason-in-pg_replication_slots.patch
From 906f8829f7b6bf1da4b37edf2e4d5a46a7227400 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 5 Mar 2024 18:56:00 +0000
Subject: [PATCH v7 1/4] Track invalidation_reason in pg_replication_slots
Currently the reason for replication slot invalidation is not
tracked in pg_replication_slots. A recent commit 007693f2a added
conflict_reason to show the reasons for slot invalidation, but
only for logical slots. This commit adds invalidation_reason to
pg_replication_slots to show invalidation reasons for both
physical and logical slots.
---
doc/src/sgml/system-views.sgml | 32 ++++++++++++++++++++++++++++
src/backend/catalog/system_views.sql | 3 ++-
src/backend/replication/slotfuncs.c | 12 ++++++++---
src/include/catalog/pg_proc.dat | 6 +++---
src/test/regress/expected/rules.out | 5 +++--
5 files changed, 49 insertions(+), 9 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be90edd0e2..cce88c14bb 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2581,6 +2581,38 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>invalidation_reason</structfield> <type>text</type>
+ </para>
+ <para>
+ The reason for the slot's invalidation. <literal>NULL</literal> if the
+ slot is currently actively being used. The non-NULL values indicate that
+ the slot is marked as invalidated. Possible values are:
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ <literal>wal_removed</literal> means that the required WAL has been
+ removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>rows_removed</literal> means that the required rows have
+ been removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>wal_level_insufficient</literal> means that the
+ primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
+ perform logical decoding.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 04227a72d1..c39f0d73d3 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1025,7 +1025,8 @@ CREATE VIEW pg_replication_slots AS
L.two_phase,
L.conflict_reason,
L.failover,
- L.synced
+ L.synced,
+ L.invalidation_reason
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 768a304723..a7a250b7c5 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 17
+#define PG_GET_REPLICATION_SLOTS_COLS 18
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -263,6 +263,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
bool nulls[PG_GET_REPLICATION_SLOTS_COLS];
WALAvailability walstate;
int i;
+ ReplicationSlotInvalidationCause cause;
if (!slot->in_use)
continue;
@@ -409,12 +410,12 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
+ cause = slot_contents.data.invalidated;
+
if (slot_contents.data.database == InvalidOid)
nulls[i++] = true;
else
{
- ReplicationSlotInvalidationCause cause = slot_contents.data.invalidated;
-
if (cause == RS_INVAL_NONE)
nulls[i++] = true;
else
@@ -425,6 +426,11 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.synced);
+ if (cause == RS_INVAL_NONE)
+ nulls[i++] = true;
+ else
+ values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 291ed876fc..17eae8847b 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11120,9 +11120,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool,text}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced,invalidation_reason}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 0cd2c64fca..e77bb36afe 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1475,8 +1475,9 @@ pg_replication_slots| SELECT l.slot_name,
l.two_phase,
l.conflict_reason,
l.failover,
- l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced)
+ l.synced,
+ l.invalidation_reason
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced, invalidation_reason)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
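A quick way to inspect the new column added by the patch above (a sketch; the
query returns rows only for slots that have actually been invalidated):

    SELECT slot_name, slot_type, wal_status, invalidation_reason
    FROM pg_replication_slots
    WHERE invalidation_reason IS NOT NULL;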
v7-0002-Add-XID-based-replication-slot-invalidation.patch
From af48594e7a99277219384f0da702f26a48527aa3 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 5 Mar 2024 18:57:25 +0000
Subject: [PATCH v7 2/4] Add XID based replication slot invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and the storage they have allocated vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set an XID age (age of
the slot's xmin or catalog_xmin) of, say, 1 or 1.5 billion, after
which the slots get invalidated.
To achieve the above, postgres uses the replication slot's xmin (the
oldest transaction that this slot needs the database to retain) or
catalog_xmin (the oldest transaction affecting the system catalogs
that this slot needs the database to retain), and a new GUC
max_slot_xid_age. The checkpointer then scans all replication slots
and invalidates those whose xmin or catalog_xmin has reached the
configured age.
---
doc/src/sgml/config.sgml | 21 ++++
src/backend/access/transam/xlog.c | 10 ++
src/backend/replication/slot.c | 44 ++++++-
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 108 ++++++++++++++++++
8 files changed, 197 insertions(+), 1 deletion(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b38cbd714a..7a8360cd32 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4544,6 +4544,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age">
+ <term><varname>max_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (which is default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 20a5f86209..36ae2ac6a4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7147,6 +7147,11 @@ CreateCheckPoint(int flags)
if (PriorRedoPtr != InvalidXLogRecPtr)
UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7597,6 +7602,11 @@ CreateRestartPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 2614f98ddd..febe57ff47 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -85,10 +85,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_XID_AGE] = "xid_aged",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -118,6 +119,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variable */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int max_slot_xid_age = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
@@ -1446,6 +1448,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_XID_AGE:
+ appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1562,6 +1567,42 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
conflict = cause;
break;
+ case RS_INVAL_XID_AGE:
+ {
+ TransactionId xid_cur = ReadNextTransactionId();
+ TransactionId xid_limit;
+ TransactionId xid_slot;
+
+ if (TransactionIdIsNormal(s->data.xmin))
+ {
+ xid_slot = s->data.xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ if (TransactionIdIsNormal(s->data.catalog_xmin))
+ {
+ xid_slot = s->data.catalog_xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1716,6 +1757,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 45013582a7..3ed642dcaf 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2954,6 +2954,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &max_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index edcc0282b2..50019d7c25 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -334,6 +334,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#max_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index acbf567150..ad9fd1e94b 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -226,6 +228,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
+extern PGDLLIMPORT int max_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index c67249500e..d698c3ec73 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -50,6 +50,7 @@ tests += {
't/039_end_of_wal.pl',
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..2f482b56e8
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,108 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Initialize primary node, setting wal-segsize to 1MB
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 1, extra => ['--wal-segsize=1']);
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+});
+$primary->start;
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb1_slot');
+]);
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby1->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+$standby1->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb1_slot';
+]) or die "Timed out waiting for slot xmin to advance";
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop standby to make the replication slot's xmin on primary to age
+$standby1->stop;
+
+my $logstart = -s $primary->logfile;
+
+# Do some work to advance xmin
+$primary->safe_psql(
+ 'postgres', q{
+do $$
+begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into tab_int values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+end$$;
+});
+
+my $invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb1_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb1_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'xid_aged';
+])
+ or die
+ "Timed out while waiting for replication slot sb1_slot to be invalidated";
+
+done_testing();
--
2.34.1
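For the XID-age-based invalidation added by the patch above, a minimal
configuration and monitoring sketch (the 1.5 billion threshold is only
illustrative; age() reports how far each horizon lags behind the next XID):

    ALTER SYSTEM SET max_slot_xid_age = 1500000000;
    SELECT pg_reload_conf();

    SELECT slot_name, age(xmin) AS xmin_age,
           age(catalog_xmin) AS catalog_xmin_age, invalidation_reason
    FROM pg_replication_slots;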
v7-0003-Track-inactive-replication-slot-information.patch
From c4501bcffd6149174245d16b00ca2571e46bb6cb Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 5 Mar 2024 18:57:42 +0000
Subject: [PATCH v7 3/4] Track inactive replication slot information
Currently postgres doesn't track metrics like the time at which
the slot became inactive, and the total number of times the slot
became inactive in its lifetime. This commit adds two new metrics
last_inactive_at of type timestamptz and inactive_count of type numeric
to ReplicationSlotPersistentData. Whenever a slot becomes
inactive, the current timestamp and inactive count are persisted
to disk.
These metrics are useful in the following ways:
- To improve replication slot monitoring tools. For instance, one
can build a monitoring tool that signals a) when a replication slot
has been lying inactive for a day or so, using the last_inactive_at
metric, and b) when a replication slot is becoming inactive too
frequently, using the inactive_count metric.
- To implement timeout-based inactive replication slot management
capability in postgres.
Increases SLOT_VERSION because of the two newly added metrics.
---
doc/src/sgml/system-views.sgml | 20 +++++++++++++
src/backend/catalog/system_views.sql | 4 ++-
src/backend/replication/slot.c | 43 ++++++++++++++++++++++------
src/backend/replication/slotfuncs.c | 15 +++++++++-
src/include/catalog/pg_proc.dat | 6 ++--
src/include/replication/slot.h | 6 ++++
src/test/regress/expected/rules.out | 6 ++--
7 files changed, 84 insertions(+), 16 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index cce88c14bb..0dfd472b02 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2771,6 +2771,26 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
ID of role
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_at</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_count</structfield> <type>numeric</type>
+ </para>
+ <para>
+ The total number of times the slot became inactive in its lifetime.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index c39f0d73d3..a5a78a9910 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1026,7 +1026,9 @@ CREATE VIEW pg_replication_slots AS
L.conflict_reason,
L.failover,
L.synced,
- L.invalidation_reason
+ L.invalidation_reason,
+ L.last_inactive_at,
+ L.inactive_count
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index febe57ff47..8066ea3b28 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -108,7 +108,7 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 5 /* version for new files */
+#define SLOT_VERSION 6 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -363,6 +363,8 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
slot->data.synced = synced;
+ slot->data.last_inactive_at = 0;
+ slot->data.inactive_count = 0;
/* and then data only present in shared memory */
slot->just_dirtied = false;
@@ -589,6 +591,17 @@ retry:
if (am_walsender)
{
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->data.last_inactive_at = 0;
+ SpinLockRelease(&s->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
SlotIsLogical(s)
? errmsg("acquired logical replication slot \"%s\"",
@@ -656,16 +669,20 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
- MyReplicationSlot = NULL;
-
- /* might not have been set when we've been a plain slot */
- LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
- ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
- LWLockRelease(ProcArrayLock);
-
if (am_walsender)
{
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->data.last_inactive_at = GetCurrentTimestamp();
+ slot->data.inactive_count++;
+ SpinLockRelease(&slot->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
is_logical
? errmsg("released logical replication slot \"%s\"",
@@ -675,6 +692,14 @@ ReplicationSlotRelease(void)
pfree(slotname);
}
+
+ MyReplicationSlot = NULL;
+
+ /* might not have been set when we've been a plain slot */
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
+ ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
+ LWLockRelease(ProcArrayLock);
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index a7a250b7c5..3bb4e9223e 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,10 +239,11 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 18
+#define PG_GET_REPLICATION_SLOTS_COLS 20
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
+ char buf[256];
/*
* We don't require any special permission to see this function's data
@@ -431,6 +432,18 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
else
values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ if (slot_contents.data.last_inactive_at > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.data.last_inactive_at);
+ else
+ nulls[i++] = true;
+
+ /* Convert to numeric. */
+ snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count);
+ values[i++] = DirectFunctionCall3(numeric_in,
+ CStringGetDatum(buf),
+ ObjectIdGetDatum(0),
+ Int32GetDatum(-1));
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 17eae8847b..681e329293 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11120,9 +11120,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool,text}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced,invalidation_reason}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool,text,timestamptz,numeric}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced,invalidation_reason,last_inactive_at,inactive_count}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index ad9fd1e94b..83b47425ea 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -129,6 +129,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* When did this slot become inactive last time? */
+ TimestampTz last_inactive_at;
+
+ /* How many times the slot has been inactive? */
+ uint64 inactive_count;
} ReplicationSlotPersistentData;
/*
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index e77bb36afe..b451c324f9 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1476,8 +1476,10 @@ pg_replication_slots| SELECT l.slot_name,
l.conflict_reason,
l.failover,
l.synced,
- l.invalidation_reason
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced, invalidation_reason)
+ l.invalidation_reason,
+ l.last_inactive_at,
+ l.inactive_count
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced, invalidation_reason, last_inactive_at, inactive_count)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
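Building on the monitoring use cases in the commit message above, a sketch of
the "becoming inactive too frequently" check using inactive_count (the
threshold of 100 is arbitrary):

    SELECT slot_name, inactive_count, last_inactive_at
    FROM pg_replication_slots
    WHERE inactive_count > 100
    ORDER BY inactive_count DESC;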
v7-0004-Add-inactive_timeout-based-replication-slot-inval.patch
From d29ac5e3004dc512300c9d07dc9d026ba979d066 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 5 Mar 2024 18:59:05 +0000
Subject: [PATCH v7 4/4] Add inactive_timeout based replication slot
invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and the storage they have allocated vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set a timeout of, say,
1, 2 or 3 days, after which the inactive slots get invalidated.
To achieve the above, postgres uses the replication slot metric
last_inactive_at (the time at which the slot last became inactive)
and a new GUC inactive_replication_slot_timeout. The checkpointer
then scans all replication slots and invalidates those that have
been inactive for longer than the configured timeout.
---
doc/src/sgml/config.sgml | 18 +++++
src/backend/access/transam/xlog.c | 10 +++
src/backend/replication/slot.c | 22 +++++-
src/backend/utils/misc/guc_tables.c | 12 +++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 79 +++++++++++++++++++
7 files changed, 144 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7a8360cd32..f5c299ef73 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4565,6 +4565,24 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-inactive-replication-slot-timeout" xreflabel="inactive_replication_slot_timeout">
+ <term><varname>inactive_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>inactive_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time at the next checkpoint. If this value is specified
+ without units, it is taken as seconds. A value of zero (which is
+ default) disables the timeout mechanism. This parameter can only be
+ set in the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 36ae2ac6a4..166c3ed794 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7152,6 +7152,11 @@ CreateCheckPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7607,6 +7612,11 @@ CreateRestartPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 8066ea3b28..060cb7d66e 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -86,10 +86,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
[RS_INVAL_XID_AGE] = "xid_aged",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -120,6 +121,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
int max_slot_xid_age = 0;
+int inactive_replication_slot_timeout = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
@@ -1476,6 +1478,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_XID_AGE:
appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by inactive_replication_slot_timeout."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1628,6 +1633,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
}
}
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (s->data.last_inactive_at > 0)
+ {
+ TimestampTz now;
+
+ Assert(s->data.persistency == RS_PERSISTENT);
+ Assert(s->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(s->data.last_inactive_at, now,
+ inactive_replication_slot_timeout * 1000))
+ conflict = cause;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1783,6 +1802,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 3ed642dcaf..06e3e87f4a 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2964,6 +2964,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"inactive_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &inactive_replication_slot_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 50019d7c25..092aaf1bec 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -261,6 +261,7 @@
#recovery_prefetch = try # prefetch pages referenced in the WAL?
#wal_decode_buffer_size = 512kB # lookahead window used for prefetching
# (change requires restart)
+#inactive_replication_slot_timeout = 0 # in seconds; 0 disables
# - Archiving -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 83b47425ea..708cbee324 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -55,6 +55,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* slot's xmin or catalog_xmin has reached the age */
RS_INVAL_XID_AGE,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -235,6 +237,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT int max_slot_xid_age;
+extern PGDLLIMPORT int inactive_replication_slot_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index 2f482b56e8..4c66dd4a4e 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -105,4 +105,83 @@ $primary->poll_query_until(
or die
"Timed out while waiting for replication slot sb1_slot to be invalidated";
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb2_slot');
+]);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 0;
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+$standby2->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+});
+$standby2->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+# The inactive replication slot info should be null when the slot is active
+my $result = $primary->safe_psql(
+ 'postgres', qq[
+ SELECT last_inactive_at IS NULL, inactive_count = 0 AS OK
+ FROM pg_replication_slots WHERE slot_name = 'sb2_slot';
+]);
+is($result, "t|t",
+ 'check the inactive replication slot info for an active slot');
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET inactive_replication_slot_timeout TO '1s';
+]);
+$primary->reload;
+
+$logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby2->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_at IS NOT NULL AND
+ inactive_count = 1 AND slot_name = 'sb2_slot';
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+$invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb2_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb2_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb2_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for inactive replication slot sb2_slot to be invalidated";
+
done_testing();
--
2.34.1
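The test above exercises the new columns via checkpoints; on a live system the
same columns can be watched with a query along these lines (a sketch only, not
part of the patches; the one-day threshold is just an example):

    SELECT slot_name,
           last_inactive_at,
           inactive_count,
           invalidation_reason
    FROM pg_replication_slots
    WHERE invalidation_reason IS NOT NULL
       OR last_inactive_at < now() - interval '1 day';

Slots returned here are either already invalidated or have been sitting idle
long enough to deserve attention.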
On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote:
> On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
>> On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote:
>>> Unless I am misinterpreting some details, ISTM we could rename this column
>>> to invalidation_reason and use it for both logical and physical slots. I'm
>>> not seeing a strong need for another column.
>>
>> Yeah having two columns was more for convenience purpose. Without the "conflict"
>> one, a slot conflicting with recovery would be "a logical slot having a non NULL
>> invalidation_reason".
>
> I'm also fine with one column if most of you prefer that way.
>
> While we debate on the above, please find the attached v7 patch set
> after rebasing.
It looks like Bertrand is okay with reusing the same column for both
logical and physical slots, which IIUC is what you initially proposed in v1
of the patch set.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Hi,
On Tue, Mar 05, 2024 at 01:44:43PM -0600, Nathan Bossart wrote:
> On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote:
>> On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot
>> <bertranddrouvot.pg@gmail.com> wrote:
>>> On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote:
>>>> Unless I am misinterpreting some details, ISTM we could rename this column
>>>> to invalidation_reason and use it for both logical and physical slots. I'm
>>>> not seeing a strong need for another column.
>>>
>>> Yeah having two columns was more for convenience purpose. Without the "conflict"
>>> one, a slot conflicting with recovery would be "a logical slot having a non NULL
>>> invalidation_reason".
>>
>> I'm also fine with one column if most of you prefer that way.
>>
>> While we debate on the above, please find the attached v7 patch set
>> after rebasing.
>
> It looks like Bertrand is okay with reusing the same column for both
> logical and physical slots
Yeah, I'm okay with one column.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 6, 2024 at 2:42 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
> Hi,
>
> On Tue, Mar 05, 2024 at 01:44:43PM -0600, Nathan Bossart wrote:
>> On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote:
>>> On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot
>>> <bertranddrouvot.pg@gmail.com> wrote:
>>>> On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote:
>>>>> Unless I am misinterpreting some details, ISTM we could rename this column
>>>>> to invalidation_reason and use it for both logical and physical slots. I'm
>>>>> not seeing a strong need for another column.
>>>>
>>>> Yeah having two columns was more for convenience purpose. Without the "conflict"
>>>> one, a slot conflicting with recovery would be "a logical slot having a non NULL
>>>> invalidation_reason".
>>>
>>> I'm also fine with one column if most of you prefer that way.
>>>
>>> While we debate on the above, please find the attached v7 patch set
>>> after rebasing.
>>
>> It looks like Bertrand is okay with reusing the same column for both
>> logical and physical slots
>
> Yeah, I'm okay with one column.
Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
Attachment: v8-0001-Track-invalidation_reason-in-pg_replication_slots.patch (application/octet-stream)
From a03f366f5e14a4db7cf3f89f1adf8a311490651b Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 6 Mar 2024 08:44:29 +0000
Subject: [PATCH v8 1/4] Track invalidation_reason in pg_replication_slots
Currently the reason for replication slot invalidation is not
tracked in pg_replication_slots. A recent commit 007693f2a added
conflict_reason to show the reasons for slot invalidation, but
only for logical slots. This commit renames conflict_reason to
invalidation_reason, and adds support to show invalidation
reasons for both physical and logical slots.
---
doc/src/sgml/ref/pgupgrade.sgml | 2 +-
doc/src/sgml/system-views.sgml | 11 ++--
src/backend/catalog/system_views.sql | 2 +-
src/backend/replication/logical/slotsync.c | 2 +-
src/backend/replication/slot.c | 6 +--
src/backend/replication/slotfuncs.c | 11 +---
src/bin/pg_upgrade/info.c | 4 +-
src/include/catalog/pg_proc.dat | 2 +-
src/include/replication/slot.h | 2 +-
.../t/035_standby_logical_decoding.pl | 50 +++++++++----------
.../t/040_standby_failover_slots_sync.pl | 4 +-
src/test/regress/expected/rules.out | 4 +-
12 files changed, 47 insertions(+), 53 deletions(-)
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 58c6c2df8b..50d13f3c1e 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -453,7 +453,7 @@ make prefix=/usr/local/pgsql.new install
<para>
All slots on the old cluster must be usable, i.e., there are no slots
whose
- <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>conflict_reason</structfield>
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>invalidation_reason</structfield>
is not <literal>NULL</literal>.
</para>
</listitem>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be90edd0e2..c519b4a7f8 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,13 +2525,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>conflict_reason</structfield> <type>text</type>
+ <structfield>invalidation_reason</structfield> <type>text</type>
</para>
<para>
- The reason for the logical slot's conflict with recovery. It is always
- NULL for physical slots, as well as for logical slots which are not
- invalidated. The non-NULL values indicate that the slot is marked
- as invalidated. Possible values are:
+ The reason for the slot's invalidation. <literal>NULL</literal> if the
+ slot is currently actively being used. The non-NULL values indicate that
+ the slot is marked as invalidated. In case of logical slots, it
+ represents the reason for the logical slot's conflict with recovery.
+ Possible values are:
<itemizedlist spacing="compact">
<listitem>
<para>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 04227a72d1..1dbfcef9f1 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.conflict_reason,
+ L.invalidation_reason,
L.failover,
L.synced
FROM pg_get_replication_slots() AS L
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index ad0fc6a04b..80ffc24213 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -664,7 +664,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, conflict_reason"
+ " database, invalidation_reason"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 02ae27499b..b0f48229cb 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -2333,17 +2333,17 @@ RestoreSlotFromDisk(const char *name)
* ReplicationSlotInvalidationCause.
*/
ReplicationSlotInvalidationCause
-GetSlotInvalidationCause(const char *conflict_reason)
+GetSlotInvalidationCause(const char *invalidation_reason)
{
ReplicationSlotInvalidationCause cause;
ReplicationSlotInvalidationCause result = RS_INVAL_NONE;
bool found PG_USED_FOR_ASSERTS_ONLY = false;
- Assert(conflict_reason);
+ Assert(invalidation_reason);
for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
{
- if (strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0)
+ if (strcmp(SlotInvalidationCauses[cause], invalidation_reason) == 0)
{
found = true;
result = cause;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 768a304723..758498d29d 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -409,17 +409,10 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
- if (slot_contents.data.database == InvalidOid)
+ if (slot_contents.data.invalidated == RS_INVAL_NONE)
nulls[i++] = true;
else
- {
- ReplicationSlotInvalidationCause cause = slot_contents.data.invalidated;
-
- if (cause == RS_INVAL_NONE)
- nulls[i++] = true;
- else
- values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
- }
+ values[i++] = CStringGetTextDatum(SlotInvalidationCauses[slot_contents.data.invalidated]);
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 183c2f84eb..9683c91d4a 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -667,13 +667,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
"temporary IS FALSE;",
live_check ? "FALSE" :
- "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
+ "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE "
"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
"END)");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 291ed876fc..69140a0bf0 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11122,7 +11122,7 @@
proargtypes => '',
proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool}',
proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,invalidation_reason,failover,synced}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index acbf567150..02a96b0e19 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -272,6 +272,6 @@ extern void CheckPointReplicationSlots(bool is_shutdown);
extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
- GetSlotInvalidationCause(const char *conflict_reason);
+ GetSlotInvalidationCause(const char *invalidation_reason);
#endif /* SLOT_H */
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 2659d4bb52..a02ae84991 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,8 +168,8 @@ sub change_hot_standby_feedback_and_wait_for_xmins
}
}
-# Check conflict_reason in pg_replication_slots.
-sub check_slots_conflict_reason
+# Check invalidation_reason in pg_replication_slots.
+sub check_slots_invalidation_reason
{
my ($slot_prefix, $reason) = @_;
@@ -178,15 +178,15 @@ sub check_slots_conflict_reason
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$active_slot';));
- is($res, "$reason", "$active_slot conflict_reason is $reason");
+ is($res, "$reason", "$active_slot invalidation_reason is $reason");
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$inactive_slot';));
- is($res, "$reason", "$inactive_slot conflict_reason is $reason");
+ is($res, "$reason", "$inactive_slot invalidation_reason is $reason");
}
# Drop the slots, re-create them, change hot_standby_feedback,
@@ -293,13 +293,13 @@ $node_primary->safe_psql('testdb',
qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
);
-# Check conflict_reason is NULL for physical slot
+# Check invalidation_reason is NULL for physical slot
$res = $node_primary->safe_psql(
'postgres', qq[
- SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+ SELECT invalidation_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
);
-is($res, 't', "Physical slot reports conflict_reason as NULL");
+is($res, 't', "Physical slot reports invalidation_reason as NULL");
my $backup_name = 'b1';
$node_primary->backup($backup_name);
@@ -512,8 +512,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
-check_slots_conflict_reason('vacuum_full_', 'rows_removed');
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
+check_slots_invalidation_reason('vacuum_full_', 'rows_removed');
$handle =
make_slot_active($node_standby, 'vacuum_full_', 0, \$stdout, \$stderr);
@@ -531,8 +531,8 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
##################################################
$node_standby->restart;
-# Verify conflict_reason is retained across a restart.
-check_slots_conflict_reason('vacuum_full_', 'rows_removed');
+# Verify invalidation_reason is retained across a restart.
+check_slots_invalidation_reason('vacuum_full_', 'rows_removed');
##################################################
# Verify that invalidated logical slots do not lead to retaining WAL.
@@ -540,7 +540,7 @@ check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Get the restart_lsn from an invalidated slot
my $restart_lsn = $node_standby->safe_psql('postgres',
- "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
+ "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and invalidation_reason is not null;"
);
chomp($restart_lsn);
@@ -591,8 +591,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
-check_slots_conflict_reason('row_removal_', 'rows_removed');
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
+check_slots_invalidation_reason('row_removal_', 'rows_removed');
$handle =
make_slot_active($node_standby, 'row_removal_', 0, \$stdout, \$stderr);
@@ -627,8 +627,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
check_for_invalidation('shared_row_removal_', $logstart,
'with vacuum on pg_authid');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
-check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
+check_slots_invalidation_reason('shared_row_removal_', 'rows_removed');
$handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
\$stderr);
@@ -680,7 +680,7 @@ ok( $node_standby->poll_query_until(
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(conflicting) from
- (select conflict_reason is not NULL as conflicting
+ (select invalidation_reason is not NULL as conflicting
from pg_replication_slots WHERE slot_type = 'logical')]),
'f',
'Logical slots are reported as non conflicting');
@@ -719,8 +719,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
-check_slots_conflict_reason('pruning_', 'rows_removed');
+# Verify invalidation_reason is 'rows_removed' in pg_replication_slots
+check_slots_invalidation_reason('pruning_', 'rows_removed');
$handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
@@ -825,8 +825,8 @@ SKIP:
$logstart),
"activeslot slot invalidation is logged with injection point");
- # Verify conflict_reason is 'rows_removed' in pg_replication_slots.
- check_slots_conflict_reason('injection_', 'rows_removed');
+ # Verify invalidation_reason is 'rows_removed' in pg_replication_slots.
+ check_slots_invalidation_reason('injection_', 'rows_removed');
# Detach from the injection point
$node_standby->safe_psql('testdb',
@@ -875,8 +875,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
-# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
-check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
+# Verify invalidation_reason is 'wal_level_insufficient' in pg_replication_slots
+ check_slots_invalidation_reason('wal_level_', 'wal_level_insufficient');
$handle =
make_slot_active($node_standby, 'wal_level_', 0, \$stdout, \$stderr);
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 021c58f621..2e1d01f750 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -228,7 +228,7 @@ $standby1->safe_psql('postgres', "CHECKPOINT");
# Check if the synced slot is invalidated
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'synchronized slot has been invalidated');
@@ -274,7 +274,7 @@ $standby1->wait_for_log(qr/dropped replication slot "lsub1_slot" of dbid [0-9]+/
# flagged as 'synced'
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'logical slot is re-synced');
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 0cd2c64fca..08b0a34d55 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,10 +1473,10 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.conflict_reason,
+ l.invalidation_reason,
l.failover,
l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, invalidation_reason, failover, synced)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
Attachment: v8-0002-Add-XID-age-based-replication-slot-invalidation.patch (application/octet-stream)
From 4be9a47c3cce539b8b5879f8c0359f552d1d419a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 6 Mar 2024 08:45:03 +0000
Subject: [PATCH v8 2/4] Add XID age based replication slot invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky. Because the amount of WAL a
customer generates, and their allocated storage will vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set an XID age (age of
slot's xmin or catalog_xmin) of say 1 or 1.5 billion, after which
the slots get invalidated.
To achieve the above, postgres uses replication slot xmin (the
oldest transaction that this slot needs the database to retain) or
catalog_xmin (the oldest transaction affecting the system catalogs
that this slot needs the database to retain), and a new GUC
max_slot_xid_age. The checkpointer then looks at all replication
slots and invalidates those whose xmin or catalog_xmin has reached
the configured age.
---
doc/src/sgml/config.sgml | 21 ++++
src/backend/access/transam/xlog.c | 10 ++
src/backend/replication/slot.c | 44 ++++++-
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 108 ++++++++++++++++++
8 files changed, 197 insertions(+), 1 deletion(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b38cbd714a..7a8360cd32 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4544,6 +4544,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age">
+ <term><varname>max_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (which is the default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 20a5f86209..36ae2ac6a4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7147,6 +7147,11 @@ CreateCheckPoint(int flags)
if (PriorRedoPtr != InvalidXLogRecPtr)
UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7597,6 +7602,11 @@ CreateRestartPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index b0f48229cb..f05990aeb8 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -86,10 +86,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_XID_AGE] = "xid_aged",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -119,6 +120,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variable */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int max_slot_xid_age = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
@@ -1447,6 +1449,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_XID_AGE:
+ appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1563,6 +1568,42 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
conflict = cause;
break;
+ case RS_INVAL_XID_AGE:
+ {
+ TransactionId xid_cur = ReadNextTransactionId();
+ TransactionId xid_limit;
+ TransactionId xid_slot;
+
+ if (TransactionIdIsNormal(s->data.xmin))
+ {
+ xid_slot = s->data.xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ if (TransactionIdIsNormal(s->data.catalog_xmin))
+ {
+ xid_slot = s->data.catalog_xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1725,6 +1766,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 45013582a7..3ed642dcaf 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2954,6 +2954,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &max_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index edcc0282b2..50019d7c25 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -334,6 +334,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#max_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 02a96b0e19..4b7ae36f11 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -226,6 +228,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
+extern PGDLLIMPORT int max_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index c67249500e..d698c3ec73 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -50,6 +50,7 @@ tests += {
't/039_end_of_wal.pl',
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..2f482b56e8
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,108 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Initialize primary node, setting wal-segsize to 1MB
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 1, extra => ['--wal-segsize=1']);
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+});
+$primary->start;
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb1_slot');
+]);
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby1->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+$standby1->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb1_slot';
+]) or die "Timed out waiting for slot xmin to advance";
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop standby to make the replication slot's xmin on primary to age
+$standby1->stop;
+
+my $logstart = -s $primary->logfile;
+
+# Do some work to advance xmin
+$primary->safe_psql(
+ 'postgres', q{
+do $$
+begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into tab_int values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+end$$;
+});
+
+my $invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb1_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb1_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'xid_aged';
+])
+ or die
+ "Timed out while waiting for replication slot sb1_slot to be invalidated";
+
+done_testing();
--
2.34.1
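For reviewers who want to eyeball which slots a given max_slot_xid_age would
catch before enabling it, a rough query sketch (not part of the patch; the
1.5 billion threshold is only an illustration, and age() is an approximation
of the wraparound-safe comparison the patch does in C):

    SELECT slot_name,
           age(xmin)         AS xmin_age,
           age(catalog_xmin) AS catalog_xmin_age
    FROM pg_replication_slots
    WHERE age(xmin) > 1500000000
       OR age(catalog_xmin) > 1500000000;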
Attachment: v8-0003-Track-inactive-replication-slot-information.patch (application/octet-stream)
From 543713209087881e82c3a63731e149e92499260c Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 6 Mar 2024 08:45:54 +0000
Subject: [PATCH v8 3/4] Track inactive replication slot information
Currently postgres doesn't track metrics like the time at which
the slot became inactive, and the total number of times the slot
became inactive in its lifetime. This commit adds two new metrics
last_inactive_at of type timestamptz and inactive_count of type numeric
to ReplicationSlotPersistentData. Whenever a slot becomes
inactive, the current timestamp and inactive count are persisted
to disk.
These metrics are useful in the following ways:
- To improve replication slot monitoring tools. For instance, one
can build a monitoring tool that signals a) when a replication slot
has been lying inactive for a day or so using the last_inactive_at
metric, b) when a replication slot is becoming inactive too
frequently using the inactive_count metric.
- To implement timeout-based inactive replication slot management
capability in postgres.
Increases SLOT_VERSION due to the added two new metrics.
---
doc/src/sgml/system-views.sgml | 20 +++++++++++++
src/backend/catalog/system_views.sql | 4 ++-
src/backend/replication/slot.c | 43 ++++++++++++++++++++++------
src/backend/replication/slotfuncs.c | 15 +++++++++-
src/include/catalog/pg_proc.dat | 6 ++--
src/include/replication/slot.h | 6 ++++
src/test/regress/expected/rules.out | 6 ++--
7 files changed, 84 insertions(+), 16 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index c519b4a7f8..7909623453 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2740,6 +2740,26 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
ID of role
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_at</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_count</structfield> <type>numeric</type>
+ </para>
+ <para>
+ The total number of times the slot became inactive in its lifetime.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 1dbfcef9f1..763a4e668b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1025,7 +1025,9 @@ CREATE VIEW pg_replication_slots AS
L.two_phase,
L.invalidation_reason,
L.failover,
- L.synced
+ L.synced,
+ L.last_inactive_at,
+ L.inactive_count
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index f05990aeb8..9e323b58b3 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -109,7 +109,7 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 5 /* version for new files */
+#define SLOT_VERSION 6 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -364,6 +364,8 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
slot->data.synced = synced;
+ slot->data.last_inactive_at = 0;
+ slot->data.inactive_count = 0;
/* and then data only present in shared memory */
slot->just_dirtied = false;
@@ -590,6 +592,17 @@ retry:
if (am_walsender)
{
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->data.last_inactive_at = 0;
+ SpinLockRelease(&s->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
SlotIsLogical(s)
? errmsg("acquired logical replication slot \"%s\"",
@@ -657,16 +670,20 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
- MyReplicationSlot = NULL;
-
- /* might not have been set when we've been a plain slot */
- LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
- ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
- LWLockRelease(ProcArrayLock);
-
if (am_walsender)
{
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->data.last_inactive_at = GetCurrentTimestamp();
+ slot->data.inactive_count++;
+ SpinLockRelease(&slot->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
is_logical
? errmsg("released logical replication slot \"%s\"",
@@ -676,6 +693,14 @@ ReplicationSlotRelease(void)
pfree(slotname);
}
+
+ MyReplicationSlot = NULL;
+
+ /* might not have been set when we've been a plain slot */
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
+ ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
+ LWLockRelease(ProcArrayLock);
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 758498d29d..3e287cba66 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,10 +239,11 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 17
+#define PG_GET_REPLICATION_SLOTS_COLS 19
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
+ char buf[256];
/*
* We don't require any special permission to see this function's data
@@ -418,6 +419,18 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.synced);
+ if (slot_contents.data.last_inactive_at > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.data.last_inactive_at);
+ else
+ nulls[i++] = true;
+
+ /* Convert to numeric. */
+ snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count);
+ values[i++] = DirectFunctionCall3(numeric_in,
+ CStringGetDatum(buf),
+ ObjectIdGetDatum(0),
+ Int32GetDatum(-1));
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 69140a0bf0..0071ce4cf8 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11120,9 +11120,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,invalidation_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool,timestamptz,numeric}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,invalidation_reason,failover,synced,last_inactive_at,inactive_count}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 4b7ae36f11..7d668918b0 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -129,6 +129,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* When did this slot last become inactive? */
+ TimestampTz last_inactive_at;
+
+ /* How many times has this slot become inactive? */
+ uint64 inactive_count;
} ReplicationSlotPersistentData;
/*
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 08b0a34d55..b63f5ea5da 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1475,8 +1475,10 @@ pg_replication_slots| SELECT l.slot_name,
l.two_phase,
l.invalidation_reason,
l.failover,
- l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, invalidation_reason, failover, synced)
+ l.synced,
+ l.last_inactive_at,
+ l.inactive_count
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, invalidation_reason, failover, synced, last_inactive_at, inactive_count)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
Attachment: v8-0004-Add-inactive_timeout-based-replication-slot-inval.patch (application/octet-stream)
From 1d3d286b97607bce67cdfbf7cab6f0a9b734a204 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 6 Mar 2024 08:46:47 +0000
Subject: [PATCH v8 4/4] Add inactive_timeout based replication slot
invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky. Because the amount of WAL a
customer generates, and their allocated storage will vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set a timeout of say 1
or 2 or 3 days, after which the inactive slots get dropped.
To achieve the above, postgres uses the replication slot metric
last_inactive_at (the time at which the slot became inactive), and a
new GUC inactive_replication_slot_timeout. The checkpointer then
looks at all replication slots and invalidates those that have been
inactive for longer than the configured timeout.
---
doc/src/sgml/config.sgml | 18 +++++
src/backend/access/transam/xlog.c | 10 +++
src/backend/replication/slot.c | 22 +++++-
src/backend/utils/misc/guc_tables.c | 12 +++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 79 +++++++++++++++++++
7 files changed, 144 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7a8360cd32..f5c299ef73 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4565,6 +4565,24 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-inactive-replication-slot-timeout" xreflabel="inactive_replication_slot_timeout">
+ <term><varname>inactive_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>inactive_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time at the next checkpoint. If this value is specified
+ without units, it is taken as seconds. A value of zero (which is
+ the default) disables the timeout mechanism. This parameter can only be
+ set in the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 36ae2ac6a4..166c3ed794 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7152,6 +7152,11 @@ CreateCheckPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7607,6 +7612,11 @@ CreateRestartPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 9e323b58b3..2360682e05 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -87,10 +87,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
[RS_INVAL_XID_AGE] = "xid_aged",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -121,6 +122,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
int max_slot_xid_age = 0;
+int inactive_replication_slot_timeout = 0;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
@@ -1477,6 +1479,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_XID_AGE:
appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by inactive_replication_slot_timeout."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1629,6 +1634,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
}
}
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (s->data.last_inactive_at > 0)
+ {
+ TimestampTz now;
+
+ Assert(s->data.persistency == RS_PERSISTENT);
+ Assert(s->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(s->data.last_inactive_at, now,
+ inactive_replication_slot_timeout * 1000))
+ conflict = cause;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1792,6 +1811,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 3ed642dcaf..06e3e87f4a 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2964,6 +2964,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"inactive_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &inactive_replication_slot_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 50019d7c25..092aaf1bec 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -261,6 +261,7 @@
#recovery_prefetch = try # prefetch pages referenced in the WAL?
#wal_decode_buffer_size = 512kB # lookahead window used for prefetching
# (change requires restart)
+#inactive_replication_slot_timeout = 0 # in seconds; 0 disables
# - Archiving -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7d668918b0..7ae98046a4 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -55,6 +55,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* slot's xmin or catalog_xmin has reached the age */
RS_INVAL_XID_AGE,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -235,6 +237,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT int max_slot_xid_age;
+extern PGDLLIMPORT int inactive_replication_slot_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index 2f482b56e8..4c66dd4a4e 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -105,4 +105,83 @@ $primary->poll_query_until(
or die
"Timed out while waiting for replication slot sb1_slot to be invalidated";
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb2_slot');
+]);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 0;
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+$standby2->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+});
+$standby2->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+# The inactive replication slot info should be null when the slot is active
+my $result = $primary->safe_psql(
+ 'postgres', qq[
+ SELECT last_inactive_at IS NULL, inactive_count = 0 AS OK
+ FROM pg_replication_slots WHERE slot_name = 'sb2_slot';
+]);
+is($result, "t|t",
+ 'check the inactive replication slot info for an active slot');
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET inactive_replication_slot_timeout TO '1s';
+]);
+$primary->reload;
+
+$logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby2->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_at IS NOT NULL AND
+ inactive_count = 1 AND slot_name = 'sb2_slot';
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+$invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb2_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb2_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb2_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for inactive replication slot sb2_slot to be invalidated";
+
done_testing();
--
2.34.1
Hi,
On Wed, Mar 06, 2024 at 02:46:57PM +0530, Bharath Rupireddy wrote:
On Wed, Mar 6, 2024 at 2:42 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Yeah, I'm okay with one column.
Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change.
Thanks!
A few comments:
1 ===
+ The reason for the slot's invalidation. <literal>NULL</literal> if the
+ slot is currently actively being used.
s/currently actively being used/not invalidated/ ? (I mean it could be valid
and not being used).
2 ===
+ the slot is marked as invalidated. In case of logical slots, it
+ represents the reason for the logical slot's conflict with recovery.
s/the reason for the logical slot's conflict with recovery./the recovery conflict reason./ ?
3 ===
@@ -667,13 +667,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
"temporary IS FALSE;",
live_check ? "FALSE" :
- "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
+ "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE "
Yeah that's fine because there is logical slot filtering here.
4 ===
-GetSlotInvalidationCause(const char *conflict_reason)
+GetSlotInvalidationCause(const char *invalidation_reason)
Should we change the comment "Maps a conflict reason" above this function?
5 ===
-# Check conflict_reason is NULL for physical slot
+# Check invalidation_reason is NULL for physical slot
$res = $node_primary->safe_psql(
'postgres', qq[
- SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+ SELECT invalidation_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
);
I don't think this test is needed anymore: it does not make that much sense since
it's done after the primary database initialization and startup.
6 ===
@@ -680,7 +680,7 @@ ok( $node_standby->poll_query_until(
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(conflicting) from
- (select conflict_reason is not NULL as conflicting
+ (select invalidation_reason is not NULL as conflicting
from pg_replication_slots WHERE slot_type = 'logical')]),
'f',
'Logical slots are reported as non conflicting');
What about?
"
# Verify slots are reported as valid in pg_replication_slots
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(invalidated) from
(select invalidation_reason is not NULL as invalidated
from pg_replication_slots WHERE slot_type = 'logical')]),
'f',
'Logical slots are reported as valid');
"
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon, Mar 4, 2024 at 3:14 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Sun, Mar 03, 2024 at 11:40:00PM +0530, Bharath Rupireddy wrote:
On Sat, Mar 2, 2024 at 3:41 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
Would you ever see "conflict" as false and "invalidation_reason" as
non-null for a logical slot?
No. Because both conflict and invalidation_reason are decided based on
the invalidation reason i.e. value of slot_contents.data.invalidated.
IOW, a logical slot that reports conflict as true must have been
invalidated.
Do you have any thoughts on reverting 007693f and introducing
invalidation_reason?
Unless I am misinterpreting some details, ISTM we could rename this column
to invalidation_reason and use it for both logical and physical slots. I'm
not seeing a strong need for another column. Perhaps I am missing
something...
IIUC, the current conflict_reason is primarily used to determine
logical slots on standby that got invalidated due to recovery time
conflict. On the primary, it will also show logical slots that got
invalidated due to the corresponding WAL got removed. Is that
understanding correct? If so, we are already sort of overloading this
column. However, now adding more invalidation reasons that won't
happen during recovery conflict handling will change entirely the
purpose (as per the name we use) of this variable. I think
invalidation_reason could depict this column correctly but OTOH I
guess it would lose its original meaning/purpose.
--
With Regards,
Amit Kapila.
On Wed, Mar 6, 2024 at 2:47 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change.
@@ -1629,6 +1634,20 @@
InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
}
}
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (s->data.last_inactive_at > 0)
+ {
+ TimestampTz now;
+
+ Assert(s->data.persistency == RS_PERSISTENT);
+ Assert(s->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(s->data.last_inactive_at, now,
+ inactive_replication_slot_timeout * 1000))
You might want to consider its interaction with sync slots on standby.
Say, there is no activity on slots in terms of processing the changes
for slots. Now, we won't perform sync of such slots on standby showing
them inactive as per your new criteria, whereas the same slots could still
be valid on primary as the walsender is still active. This may be more
of a theoretical point as in a running system there will probably be
some activity, but I think this needs some thoughts.
--
With Regards,
Amit Kapila.
On Wed, Mar 6, 2024 at 4:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
IIUC, the current conflict_reason is primarily used to determine
logical slots on standby that got invalidated due to recovery time
conflict. On the primary, it will also show logical slots that got
invalidated due to the corresponding WAL got removed. Is that
understanding correct?
That's right.
If so, we are already sort of overloading this
column. However, now adding more invalidation reasons that won't
happen during recovery conflict handling will change entirely the
purpose (as per the name we use) of this variable. I think
invalidation_reason could depict this column correctly but OTOH I
guess it would lose its original meaning/purpose.
Hm. I get the concern. Are you okay with having invalidation_reason
separately for both logical and physical slots? In such a case,
logical slots that got invalidated on the standby will have duplicate
info in conflict_reason and invalidation_reason, is this fine?
Another idea is to make 'conflict_reason text' as a 'conflicting
boolean' again (revert 007693f2a3), and have 'invalidation_reason
text' for both logical and physical slots. So, whenever 'conflicting'
is true, one can look at invalidation_reason for the reason for
conflict. How does this sound?
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
You might want to consider its interaction with sync slots on standby.
Say, there is no activity on slots in terms of processing the changes
for slots. Now, we won't perform sync of such slots on standby showing
them inactive as per your new criteria where as same slots could still
be valid on primary as the walsender is still active. This may be more
of a theoretical point as in running system there will probably be
some activity but I think this needs some thoughts.
I believe the xmin and catalog_xmin of the sync slots on the standby
keep advancing depending on the slots on the primary, no? If yes, the
XID age based invalidation shouldn't be a problem.
I believe there are no walsenders started for the sync slots on the
standbys, right? If yes, the inactive timeout based invalidation also
shouldn't be a problem. Because, the inactive timeouts for a slot are
tracked only for walsenders because they are the ones that typically
hold replication slots for longer durations and for real replication
use. We did a similar thing in a recent commit [1].
Is my understanding right? Do you still see any problems with it?
[1]:
commit 7c3fb505b14e86581b6a052075a294c78c91b123
Author: Amit Kapila <akapila@postgresql.org>
Date: Tue Nov 21 07:59:53 2023 +0530
Log messages for replication slot acquisition and release.
.........
Note that these messages are emitted only for walsenders but not for
backends. This is because walsenders are the ones that typically hold
replication slots for longer durations, unlike backends which hold them
for executing replication related functions.
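As a side note, a quick way to watch the quantity that the proposed
max_slot_xid_age would act on is to run the stock age(xid) function against
pg_replication_slots; this is only a monitoring sketch, not something from
the patch set:

SELECT slot_name,
       age(xmin) AS xmin_age,
       age(catalog_xmin) AS catalog_xmin_age
FROM pg_replication_slots
ORDER BY greatest(age(xmin), age(catalog_xmin)) DESC NULLS LAST;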
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Mar 8, 2024 at 8:08 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Mar 6, 2024 at 4:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
IIUC, the current conflict_reason is primarily used to determine
logical slots on standby that got invalidated due to recovery time
conflict. On the primary, it will also show logical slots that got
invalidated due to the corresponding WAL got removed. Is that
understanding correct?
That's right.
If so, we are already sort of overloading this
column. However, now adding more invalidation reasons that won't
happen during recovery conflict handling will change entirely the
purpose (as per the name we use) of this variable. I think
invalidation_reason could depict this column correctly but OTOH I
guess it would lose its original meaning/purpose.
Hm. I get the concern. Are you okay with having invalidation_reason
separately for both logical and physical slots? In such a case,
logical slots that got invalidated on the standby will have duplicate
info in conflict_reason and invalidation_reason, is this fine?
If we have duplicate information in two columns, that could be
confusing for users. BTW, doesn't the recovery conflict occur only
because of rows_removed and wal_level_insufficient reasons? The
wal_removed or the new reasons you are proposing can't happen because
of recovery conflict. Am I missing something here?
Another idea is to make 'conflict_reason text' as a 'conflicting
boolean' again (revert 007693f2a3), and have 'invalidation_reason
text' for both logical and physical slots. So, whenever 'conflicting'
is true, one can look at invalidation_reason for the reason for
conflict. How does this sound?
So, does this mean that conflicting will only be true for some of the
reasons (say wal_level_insufficient, rows_removed, wal_removed) and
logical slots but not for others? I think that will also not eliminate
the duplicate information as a user could have deduced that from a single
column.
--
With Regards,
Amit Kapila.
On Fri, Mar 8, 2024 at 10:42 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
You might want to consider its interaction with sync slots on standby.
Say, there is no activity on slots in terms of processing the changes
for slots. Now, we won't perform sync of such slots on standby showing
them inactive as per your new criteria where as same slots could still
be valid on primary as the walsender is still active. This may be more
of a theoretical point as in running system there will probably be
some activity but I think this needs some thoughts.
I believe the xmin and catalog_xmin of the sync slots on the standby
keep advancing depending on the slots on the primary, no? If yes, the
XID age based invalidation shouldn't be a problem.
I believe there are no walsenders started for the sync slots on the
standbys, right? If yes, the inactive timeout based invalidation also
shouldn't be a problem. Because, the inactive timeouts for a slot are
tracked only for walsenders because they are the ones that typically
hold replication slots for longer durations and for real replication
use. We did a similar thing in a recent commit [1].
Is my understanding right?
Yes, your understanding is correct. I wanted us to consider having new
parameters like 'inactive_replication_slot_timeout' to be at
slot-level instead of a GUC. I think this new parameter doesn't seem to
be similar to 'max_slot_wal_keep_size', which leads to truncation
of WAL at a global level and then invalidates the appropriate slots. OTOH, the
'inactive_replication_slot_timeout' doesn't appear to have a similar
global effect. The other thing we should consider is what if the
checkpoint happens at a timeout greater than
'inactive_replication_slot_timeout'? Shall we consider doing it via
some other background process, or do we think the checkpointer is the best
we can have?
Do you still see any problems with it?
Sorry, I haven't done any detailed review yet so can't say with
confidence whether there is any problem or not w.r.t sync slots.
--
With Regards,
Amit Kapila.
On Wed, Mar 6, 2024 at 2:47 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change.
Commit message says: "Currently postgres has the ability to invalidate
inactive replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in case
they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky. Because the amount of WAL a customer
generates, and their allocated storage will vary greatly in
production, making it difficult to pin down a one-size-fits-all value.
It is often easy for developers to set an XID age (age of slot's xmin
or catalog_xmin) of say 1 or 1.5 billion, after which the slots get
invalidated."
I don't see how it will be easier for the user to choose the default
value of 'max_slot_xid_age' compared to 'max_slot_wal_keep_size'. But,
I agree similar to 'max_slot_wal_keep_size', 'max_slot_xid_age' can be
another parameter to allow vacuum to proceed with removing the rows which
otherwise it wouldn't have been able to, as those would be required by some
slot. Now, if this understanding is correct, we should probably make
this invalidation happen by (auto)vacuum after computing the age based
on this new parameter.
--
With Regards,
Amit Kapila.
On Mon, Mar 11, 2024 at 04:09:27PM +0530, Amit Kapila wrote:
I don't see how it will be easier for the user to choose the default
value of 'max_slot_xid_age' compared to 'max_slot_wal_keep_size'. But,
I agree similar to 'max_slot_wal_keep_size', 'max_slot_xid_age' can be
another parameter to allow vacuum to proceed removing the rows which
otherwise it wouldn't have been as those would be required by some
slot.
Yeah, the idea is to help prevent transaction ID wraparound, so I would
expect max_slot_xid_age to ordinarily be set relatively high, i.e., 1.5B+.
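For illustration, a setting in that ballpark would be applied like this (a
sketch only; max_slot_xid_age is the GUC proposed in this patch set, not an
existing parameter):

ALTER SYSTEM SET max_slot_xid_age = 1500000000;
SELECT pg_reload_conf();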
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Hi,
On Fri, Mar 08, 2024 at 10:42:20PM +0530, Bharath Rupireddy wrote:
On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
You might want to consider its interaction with sync slots on standby.
Say, there is no activity on slots in terms of processing the changes
for slots. Now, we won't perform sync of such slots on standby showing
them inactive as per your new criteria where as same slots could still
be valid on primary as the walsender is still active. This may be more
of a theoretical point as in running system there will probably be
some activity but I think this needs some thoughts.
I believe the xmin and catalog_xmin of the sync slots on the standby
keep advancing depending on the slots on the primary, no? If yes, the
XID age based invalidation shouldn't be a problem.
I believe there are no walsenders started for the sync slots on the
standbys, right? If yes, the inactive timeout based invalidation also
shouldn't be a problem. Because, the inactive timeouts for a slot are
tracked only for walsenders because they are the ones that typically
hold replication slots for longer durations and for real replication
use. We did a similar thing in a recent commit [1].
Is my understanding right? Do you still see any problems with it?
Would that make sense to "simply" discard/prevent those kind of invalidations
for "synced" slot on standby? I mean, do they make sense given the fact that
those slots are not usable until the standby is promoted?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Mar 12, 2024 at 1:24 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 08, 2024 at 10:42:20PM +0530, Bharath Rupireddy wrote:
On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
You might want to consider its interaction with sync slots on standby.
Say, there is no activity on slots in terms of processing the changes
for slots. Now, we won't perform sync of such slots on standby showing
them inactive as per your new criteria where as same slots could still
be valid on primary as the walsender is still active. This may be more
of a theoretical point as in running system there will probably be
some activity but I think this needs some thoughts.
I believe the xmin and catalog_xmin of the sync slots on the standby
keep advancing depending on the slots on the primary, no? If yes, the
XID age based invalidation shouldn't be a problem.
I believe there are no walsenders started for the sync slots on the
standbys, right? If yes, the inactive timeout based invalidation also
shouldn't be a problem. Because, the inactive timeouts for a slot are
tracked only for walsenders because they are the ones that typically
hold replication slots for longer durations and for real replication
use. We did a similar thing in a recent commit [1].
Is my understanding right? Do you still see any problems with it?
Would that make sense to "simply" discard/prevent those kind of invalidations
for "synced" slot on standby? I mean, do they make sense given the fact that
those slots are not usable until the standby is promoted?
AFAIR, we don't prevent similar invalidations due to
'max_slot_wal_keep_size' for sync slots, so why prevent it for
these new parameters? This will unnecessarily create inconsistency in
the invalidation behavior.
--
With Regards,
Amit Kapila.
On Mon, Mar 11, 2024 at 11:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Hm. I get the concern. Are you okay with having invalidation_reason
separately for both logical and physical slots? In such a case,
logical slots that got invalidated on the standby will have duplicate
info in conflict_reason and invalidation_reason, is this fine?
If we have duplicate information in two columns, that could be
confusing for users. BTW, isn't the recovery conflict occur only
because of rows_removed and wal_level_insufficient reasons? The
wal_removed or the new reasons you are proposing can't happen because
of recovery conflict. Am I missing something here?
My understanding aligns with yours that the rows_removed and
wal_level_insufficient invalidations can occur only upon recovery
conflict.
FWIW, a test named 'synchronized slot has been invalidated' in
040_standby_failover_slots_sync.pl inappropriately uses a logical slot
with conflict_reason = 'wal_removed' on the standby. As per the
above understanding, it's inappropriate to use conflict_reason here
because wal_removed invalidation doesn't conflict with recovery.
Another idea is to make 'conflict_reason text' as a 'conflicting
boolean' again (revert 007693f2a3), and have 'invalidation_reason
text' for both logical and physical slots. So, whenever 'conflicting'
is true, one can look at invalidation_reason for the reason for
conflict. How does this sound?
So, does this mean that conflicting will only be true for some of the
reasons (say wal_level_insufficient, rows_removed, wal_removed) and
logical slots but not for others? I think that will also not eliminate
the duplicate information as user could have deduced that from single
column.
So, how about we turn conflict_reason to only report the reasons that
actually cause conflict with recovery for logical slots, something
like below, and then have invalidation_cause as a generic column for
all sorts of invalidation reasons for both logical and physical slots?
ReplicationSlotInvalidationCause cause = slot_contents.data.invalidated;

if (slot_contents.data.database == InvalidOid ||
    cause == RS_INVAL_NONE ||
    (cause != RS_INVAL_HORIZON && cause != RS_INVAL_WAL_LEVEL))
{
    nulls[i++] = true;
}
else
{
    Assert(cause == RS_INVAL_HORIZON || cause == RS_INVAL_WAL_LEVEL);
    values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
}
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Tue, Mar 12, 2024 at 05:51:43PM +0530, Amit Kapila wrote:
On Tue, Mar 12, 2024 at 1:24 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 08, 2024 at 10:42:20PM +0530, Bharath Rupireddy wrote:
On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
You might want to consider its interaction with sync slots on standby.
Say, there is no activity on slots in terms of processing the changes
for slots. Now, we won't perform sync of such slots on standby showing
them inactive as per your new criteria where as same slots could still
be valid on primary as the walsender is still active. This may be more
of a theoretical point as in running system there will probably be
some activity but I think this needs some thoughts.
I believe the xmin and catalog_xmin of the sync slots on the standby
keep advancing depending on the slots on the primary, no? If yes, the
XID age based invalidation shouldn't be a problem.
I believe there are no walsenders started for the sync slots on the
standbys, right? If yes, the inactive timeout based invalidation also
shouldn't be a problem. Because, the inactive timeouts for a slot are
tracked only for walsenders because they are the ones that typically
hold replication slots for longer durations and for real replication
use. We did a similar thing in a recent commit [1].
Is my understanding right? Do you still see any problems with it?
Would that make sense to "simply" discard/prevent those kind of invalidations
for "synced" slot on standby? I mean, do they make sense given the fact that
those slots are not usable until the standby is promoted?
AFAIR, we don't prevent similar invalidations due to
'max_slot_wal_keep_size' for sync slots,
Right, we'd invalidate them on the standby should the standby sync slot restart_lsn
exceed the limit.
so why prevent it for
these new parameters? This will unnecessarily create inconsistency in
the invalidation behavior.
Yeah, but I think wal removal has a direct impact on the slot usability, which
is probably not the case with the new XID and Timeout ones. That's why I thought
about handling them differently (but I'm also fine if that's not the case).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Mar 12, 2024 at 5:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Would that make sense to "simply" discard/prevent those kind of invalidations
for "synced" slot on standby? I mean, do they make sense given the fact that
those slots are not usable until the standby is promoted?
AFAIR, we don't prevent similar invalidations due to
'max_slot_wal_keep_size' for sync slots, so why prevent it for
these new parameters? This will unnecessarily create inconsistency in
the invalidation behavior.
Right. +1 to keep the behaviour consistent for all invalidations.
However, an assertion that inactive_timeout isn't set for synced slots
on the standby isn't a bad idea because we rely on the fact that
walsenders aren't started for synced slots. Again, I think it misses
the consistency in the invalidation behaviour.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Mar 12, 2024 at 9:11 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
AFAIR, we don't prevent similar invalidations due to
'max_slot_wal_keep_size' for sync slots,
Right, we'd invalidate them on the standby should the standby sync slot restart_lsn
exceed the limit.
Right. Help me understand this a bit - is the wal_removed invalidation
going to conflict with recovery on the standby?
Per the discussion upthread, I'm trying to understand what
invalidation reasons will exactly cause conflict with recovery? Is it
just rows_removed and wal_level_insufficient invalidations? My
understanding on the conflict with recovery and invalidation reason
has been a bit off track. Perhaps, we need to clarify these two things
in the docs for the end users as well?
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon, Mar 11, 2024 at 3:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Yes, your understanding is correct. I wanted us to consider having new
parameters like 'inactive_replication_slot_timeout' to be at
slot-level instead of GUC. I think this new parameter doesn't seem to
be the similar as 'max_slot_wal_keep_size' which leads to truncation
of WAL at global and then invalidates the appropriate slots. OTOH, the
'inactive_replication_slot_timeout' doesn't appear to have a similar
global effect.
last_inactive_at is tracked for each slot and is used to invalidate slots
based on inactive_replication_slot_timeout. It's like
max_slot_wal_keep_size invalidating slots based on restart_lsn. In a
way, both are similar, right?
The other thing we should consider is what if the
checkpoint happens at a timeout greater than
'inactive_replication_slot_timeout'?
In such a case, the slots get invalidated upon the next checkpoint, as
the time elapsed since last_inactive_at (measured at checkpoint time) will then be
greater than inactive_replication_slot_timeout.
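To illustrate the intended end-to-end behavior (a sketch only; the GUC comes
from this patch set, 'my_slot' is a made-up slot name, and the column names
may still change):

ALTER SYSTEM SET inactive_replication_slot_timeout = '1min';
SELECT pg_reload_conf();
-- stop the consumer of 'my_slot' and wait for longer than the timeout
CHECKPOINT;
SELECT slot_name, invalidation_reason
FROM pg_replication_slots
WHERE slot_name = 'my_slot';
-- expected: invalidation_reason = 'inactive_timeout'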
Shall, we consider doing it via
some other background process or do we think checkpointer is the best
we can have?
The same problem exists if we do it with some other background
process. I think the checkpointer is best because it already
invalidates slots for wal_removed cause, and flushes all replication
slots to disk. Moving this new invalidation functionality into some
other background process such as autovacuum will not only burden that
process' work but also mix up the unique functionality of that
background process.
Having said that, I'm open to ideas from others as I'm not so sure if
there's any issue with checkpointer invalidating the slots for new
reasons.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon, Mar 11, 2024 at 4:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I don't see how it will be easier for the user to choose the default
value of 'max_slot_xid_age' compared to 'max_slot_wal_keep_size'. But,
I agree similar to 'max_slot_wal_keep_size', 'max_slot_xid_age' can be
another parameter to allow vacuum to proceed removing the rows which
otherwise it wouldn't have been as those would be required by some
slot. Now, if this understanding is correct, we should probably make
this invalidation happen by (auto)vacuum after computing the age based
on this new parameter.
Currently, the patch computes the XID age in the checkpointer using
the next XID (obtained from ReadNextFullTransactionId()) and the slot's xmin
and catalog_xmin. I think the checkpointer is best because it already
invalidates slots for wal_removed cause, and flushes all replication
slots to disk. Moving this new invalidation functionality into some
other background process such as autovacuum will not only burden that
process' work but also mix up the unique functionality of that
background process.
Having said that, I'm open to ideas from others as I'm not so sure if
there's any issue with checkpointer invalidating the slots for new
reasons.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Mar 12, 2024 at 8:55 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Mon, Mar 11, 2024 at 11:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Hm. I get the concern. Are you okay with having invalidation_reason
separately for both logical and physical slots? In such a case,
logical slots that got invalidated on the standby will have duplicate
info in conflict_reason and invalidation_reason, is this fine?
If we have duplicate information in two columns, that could be
confusing for users. BTW, isn't the recovery conflict occur only
because of rows_removed and wal_level_insufficient reasons? The
wal_removed or the new reasons you are proposing can't happen because
of recovery conflict. Am I missing something here?
My understanding aligns with yours that the rows_removed and
wal_level_insufficient invalidations can occur only upon recovery
conflict.
FWIW, a test named 'synchronized slot has been invalidated' in
040_standby_failover_slots_sync.pl inappropriately uses
conflict_reason = 'wal_removed' logical slot on standby. As per the
above understanding, it's inappropriate to use conflict_reason here
because wal_removed invalidation doesn't conflict with recovery.
Another idea is to make 'conflict_reason text' as a 'conflicting
boolean' again (revert 007693f2a3), and have 'invalidation_reason
text' for both logical and physical slots. So, whenever 'conflicting'
is true, one can look at invalidation_reason for the reason for
conflict. How does this sound?
So, does this mean that conflicting will only be true for some of the
reasons (say wal_level_insufficient, rows_removed, wal_removed) and
logical slots but not for others? I think that will also not eliminate
the duplicate information as user could have deduced that from single
column.
So, how about we turn conflict_reason to only report the reasons that
actually cause conflict with recovery for logical slots, something
like below, and then have invalidation_cause as a generic column for
all sorts of invalidation reasons for both logical and physical slots?
If our above understanding is correct then conflict_reason will be a
subset of invalidation_reason. If so, whatever way we arrange this
information, there will be some sort of duplicity unless we just have
one column 'invalidation_reason' and update the docs to interpret it
correctly for conflicts.
--
With Regards,
Amit Kapila.
On Tue, Mar 12, 2024 at 9:11 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Tue, Mar 12, 2024 at 05:51:43PM +0530, Amit Kapila wrote:
On Tue, Mar 12, 2024 at 1:24 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
so why prevent it for
these new parameters? This will unnecessarily create inconsistency in
the invalidation behavior.
Yeah, but I think wal removal has a direct impact on the slot usability, which
is probably not the case with the new XID and Timeout ones.
BTW, doesn't the XID-based parameter 'max_slot_xid_age' have similarity
with 'max_slot_wal_keep_size'? I think it will impact the rows we
remove based on xid horizons. Don't we need to consider it while
vacuum computes the xid horizons in ComputeXidHorizons(), similar to
what we do for WAL w.r.t 'max_slot_wal_keep_size'?
--
With Regards,
Amit Kapila.
On Tue, Mar 12, 2024 at 10:10 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Mon, Mar 11, 2024 at 3:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Yes, your understanding is correct. I wanted us to consider having new
parameters like 'inactive_replication_slot_timeout' to be at
slot-level instead of GUC. I think this new parameter doesn't seem to
be the similar as 'max_slot_wal_keep_size' which leads to truncation
of WAL at global and then invalidates the appropriate slots. OTOH, the
'inactive_replication_slot_timeout' doesn't appear to have a similar
global effect.
last_inactive_at is tracked for each slot using which slots get
invalidated based on inactive_replication_slot_timeout. It's like
max_slot_wal_keep_size invalidating slots based on restart_lsn. In a
way, both are similar, right?
There is some similarity, but 'max_slot_wal_keep_size' leads to
truncation of WAL which in turn leads to invalidation of slots. Here,
I am also trying to be cautious about adding a GUC unless it is required
or a slot-level parameter doesn't serve the need. Having said that,
I see that there is an argument that we should follow the path of the
'max_slot_wal_keep_size' GUC and there is some value to it, but still
I think the benefits of avoiding a new GUC for slot inactivity would
outweigh that.
--
With Regards,
Amit Kapila.
On Wed, Mar 6, 2024 at 2:47 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Mar 6, 2024 at 2:42 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hi,
On Tue, Mar 05, 2024 at 01:44:43PM -0600, Nathan Bossart wrote:
On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote:
On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote:
Unless I am misinterpreting some details, ISTM we could rename this column
to invalidation_reason and use it for both logical and physical slots. I'm
not seeing a strong need for another column.
Yeah having two columns was more for convenience purpose. Without the "conflict"
one, a slot conflicting with recovery would be "a logical slot having a non NULL
invalidation_reason".
I'm also fine with one column if most of you prefer that way.
While we debate on the above, please find the attached v7 patch set
after rebasing.
It looks like Bertrand is okay with reusing the same column for both
logical and physical slots
Yeah, I'm okay with one column.
Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change.
JFYI, the patch does not apply to the head. There is a conflict in
multiple files.
thanks
Shveta
JFYI, the patch does not apply to the head. There is a conflict in
multiple files.
For review purposes, I applied v8 to the March 6 code-base. I have yet
to review it in detail; please find my initial thoughts:
1)
I found that 'inactive_replication_slot_timeout' works only if there
was any walsender ever started for that slot. The logic is under
'am_walsender' check. Is this intentional?
If I create a slot and use only pg_logical_slot_get_changes or
pg_replication_slot_advance on it, it never gets invalidated due to
timeout. While, when I set 'max_slot_xid_age' or say
'max_slot_wal_keep_size' to a lower value, the said slot is
invalidated correctly with 'xid_aged' and 'wal_removed' reasons
respectively.
Example:
With inactive_replication_slot_timeout=1min, test1_3 is the slot for
which there is no walsender and only advance and get_changes SQL
functions were called; test1_4 is the one for which pg_recvlogical was
run for a second.
test1_3 | 785 | | reserved |                  | t |                                 |
test1_4 | 798 | | lost     | inactive_timeout | t | 2024-03-13 11:52:41.58446+05:30 |
And when inactive_replication_slot_timeout=0 and max_slot_xid_age=10
test1_3 | 785 | | lost     | xid_aged         | t |                                 |
test1_4 | 798 | | lost     | inactive_timeout | t | 2024-03-13 11:52:41.58446+05:30 |
2)
The msg for patch 3 says:
--------------
a) when replication slots is lying inactive for a day or so using
last_inactive_at metric,
b) when a replication slot is becoming inactive too frequently using
last_inactive_at metric.
--------------
I think in b, you want to refer to inactive_count instead of last_inactive_at?
3)
I do not see invalidation_reason updated for 2 new reasons in system-views.sgml
thanks
Shveta
Hi,
On Tue, Mar 12, 2024 at 09:19:35PM +0530, Bharath Rupireddy wrote:
On Tue, Mar 12, 2024 at 9:11 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
AFAIR, we don't prevent similar invalidations due to
'max_slot_wal_keep_size' for sync slots,
Right, we'd invalidate them on the standby should the standby sync slot restart_lsn
exceed the limit.
Right. Help me understand this a bit - is the wal_removed invalidation
going to conflict with recovery on the standby?
I don't think so, as it's not directly related to recovery. The slot will
be invalidated on the standby though.
Per the discussion upthread, I'm trying to understand what
invalidation reasons will exactly cause conflict with recovery? Is it
just rows_removed and wal_level_insufficient invalidations?
Yes, those are the ones added in be87200efd.
See the error messages on a standby:
== wal removal
postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub4_slot', NULL, NULL, 'include-xids', '0');
ERROR: can no longer get changes from replication slot "lsub4_slot"
DETAIL: This slot has been invalidated because it exceeded the maximum reserved size.
== wal level
postgres=# select conflict_reason from pg_replication_slots where slot_name = 'lsub5_slot';;
conflict_reason
------------------------
wal_level_insufficient
(1 row)
postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub5_slot', NULL, NULL, 'include-xids', '0');
ERROR: can no longer get changes from replication slot "lsub5_slot"
DETAIL: This slot has been invalidated because it was conflicting with recovery.
== rows removal
postgres=# select conflict_reason from pg_replication_slots where slot_name = 'lsub6_slot';;
conflict_reason
-----------------
rows_removed
(1 row)
postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub6_slot', NULL, NULL, 'include-xids', '0');
ERROR: can no longer get changes from replication slot "lsub6_slot"
DETAIL: This slot has been invalidated because it was conflicting with recovery.
As you can see, only wal level and rows removal are mentioning conflict with
recovery.
So, are we already "wrong" mentioning "wal_removed" in conflict_reason?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Mar 8, 2024 at 10:42 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
You might want to consider its interaction with sync slots on standby.
Say, there is no activity on slots in terms of processing the changes
for slots. Now, we won't perform sync of such slots on standby showing
them inactive as per your new criteria where as same slots could still
be valid on primary as the walsender is still active. This may be more
of a theoretical point as in running system there will probably be
some activity but I think this needs some thoughts.
I believe the xmin and catalog_xmin of the sync slots on the standby
keep advancing depending on the slots on the primary, no? If yes, the
XID age based invalidation shouldn't be a problem.
If the user has not enabled slot-sync worker and is relying on the SQL
function pg_sync_replication_slots(), then the xmin and catalog_xmin
of synced slots may not keep on advancing. These will be advanced only
on the next run of the function. But meanwhile the synced slots may be
invalidated due to 'xid_aged'. Then the next time, when the user runs
pg_sync_replication_slots() again, the invalidated slots will be
dropped and will be recreated by this SQL function (provided they are
valid on primary and are invalidated on standby alone). I am not
stating that it is a problem, but we need to think if this is what we
want. Secondly, the behaviour is not the same with 'inactive_timeout'
invalidation. Synced slots are immune to 'inactive_timeout'
invalidation as this invalidation happens only in walsender, while
these are not immune to 'xid_aged' invalidation. So again, needs some
thoughts here.
I believe there are no walsenders started for the sync slots on the
standbys, right? If yes, the inactive timeout based invalidation also
shouldn't be a problem. Because, the inactive timeouts for a slot are
tracked only for walsenders because they are the ones that typically
hold replication slots for longer durations and for real replication
use. We did a similar thing in a recent commit [1].
Is my understanding right? Do you still see any problems with it?
I have explained the situation above for us to think over it better.
thanks
Shveta
On Wed, Mar 13, 2024 at 9:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
So, how about we turn conflict_reason to only report the reasons that
actually cause conflict with recovery for logical slots, something
like below, and then have invalidation_cause as a generic column for
all sorts of invalidation reasons for both logical and physical slots?
If our above understanding is correct then conflict_reason will be a
subset of invalidation_reason. If so, whatever way we arrange this
information, there will be some sort of duplicity unless we just have
one column 'invalidation_reason' and update the docs to interpret it
correctly for conflicts.
Yes, there will be some sort of duplicity if we emit conflict_reason
as a text field. However, I still think the better way is to turn
conflict_reason text to conflict boolean and set it to true only on
rows_removed and wal_level_insufficient invalidations. When conflict
boolean is true, one (including all the tests that we've added
recently) can look for invalidation_reason text field for the reason.
This sounds reasonable to me as opposed to we just mentioning in the
docs that "if invalidation_reason is rows_removed or
wal_level_insufficient it's the reason for conflict with recovery".
Thoughts?
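As a sketch of how the two columns would be read together (assuming the v9
shape of pg_replication_slots with 'conflicting' and 'invalidation_reason'):

SELECT slot_name, conflicting, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason IS NOT NULL;
-- 'conflicting' would be true only for rows_removed and wal_level_insufficient;
-- wal_removed and the new reasons would show the slot as invalidated but not conflicting.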
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 13, 2024 at 12:51 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
See the error messages on a standby:
== wal removal
postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub4_slot', NULL, NULL, 'include-xids', '0');
ERROR: can no longer get changes from replication slot "lsub4_slot"
DETAIL: This slot has been invalidated because it exceeded the maximum reserved size.
== wal level
postgres=# select conflict_reason from pg_replication_slots where slot_name = 'lsub5_slot';;
conflict_reason
------------------------
wal_level_insufficient
(1 row)
postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub5_slot', NULL, NULL, 'include-xids', '0');
ERROR: can no longer get changes from replication slot "lsub5_slot"
DETAIL: This slot has been invalidated because it was conflicting with recovery.
== rows removal
postgres=# select conflict_reason from pg_replication_slots where slot_name = 'lsub6_slot';;
conflict_reason
-----------------
rows_removed
(1 row)
postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub6_slot', NULL, NULL, 'include-xids', '0');
ERROR: can no longer get changes from replication slot "lsub6_slot"
DETAIL: This slot has been invalidated because it was conflicting with recovery.
As you can see, only wal level and rows removal are mentioning conflict with
recovery.
So, are we already "wrong" mentioning "wal_removed" in conflict_reason?
It looks like yes. So, how about we fix it the way proposed here -
/messages/by-id/CALj2ACVd_dizYQiZwwUfsb+hG-fhGYo_kEDq0wn_vNwQvOrZHg@mail.gmail.com
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 13, 2024 at 11:13 AM shveta malik <shveta.malik@gmail.com> wrote:
Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change.
JFYI, the patch does not apply to the head. There is a conflict in
multiple files.
Thanks for looking into this. I noticed that the v8 patches needed
rebase. Before I go do anything with the patches, I'm trying to gain
consensus on the design. Following is the summary of design choices
we've discussed so far:
1) conflict_reason vs invalidation_reason.
2) When to compute the XID age?
3) Where to do the invalidations? Is it in the checkpointer or
autovacuum or some other process?
4) Interaction of these new invalidations with sync slots on the standby.
I hope to get on to these one after the other.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Mar 13, 2024 at 9:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
So, how about we turn conflict_reason to only report the reasons that
actually cause conflict with recovery for logical slots, something
like below, and then have invalidation_cause as a generic column for
all sorts of invalidation reasons for both logical and physical slots?
If our above understanding is correct then conflict_reason will be a
subset of invalidation_reason. If so, whatever way we arrange this
information, there will be some sort of duplicity unless we just have
one column 'invalidation_reason' and update the docs to interpret it
correctly for conflicts.
Yes, there will be some sort of duplicity if we emit conflict_reason
as a text field. However, I still think the better way is to turn
conflict_reason text to conflict boolean and set it to true only on
rows_removed and wal_level_insufficient invalidations. When conflict
boolean is true, one (including all the tests that we've added
recently) can look for invalidation_reason text field for the reason.
This sounds reasonable to me as opposed to we just mentioning in the
docs that "if invalidation_reason is rows_removed or
wal_level_insufficient it's the reason for conflict with recovery".
Fair point. I think we can go either way. Bertrand, Nathan, and
others, do you have an opinion on this matter?
--
With Regards,
Amit Kapila.
On Wed, Mar 13, 2024 at 10:16 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Mar 13, 2024 at 11:13 AM shveta malik <shveta.malik@gmail.com> wrote:
Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change.
JFYI, the patch does not apply to the head. There is a conflict in
multiple files.
Thanks for looking into this. I noticed that the v8 patches needed
rebase. Before I go do anything with the patches, I'm trying to gain
consensus on the design. Following is the summary of design choices
we've discussed so far:
1) conflict_reason vs invalidation_reason.
2) When to compute the XID age?
I feel we should focus on two things (a) one is to introduce a new
column invalidation_reason, and (b) let's try to first complete
invalidation due to timeout. We can look into XID stuff if time
permits, remember, we don't have ample time left.
With Regards,
Amit Kapila.
On Thu, Mar 14, 2024 at 12:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy
Yes, there will be some sort of duplicity if we emit conflict_reason
as a text field. However, I still think the better way is to turn
conflict_reason text to conflict boolean and set it to true only on
rows_removed and wal_level_insufficient invalidations. When conflict
boolean is true, one (including all the tests that we've added
recently) can look for invalidation_reason text field for the reason.
This sounds reasonable to me as opposed to we just mentioning in the
docs that "if invalidation_reason is rows_removed or
wal_level_insufficient it's the reason for conflict with recovery".Fair point. I think we can go either way. Bertrand, Nathan, and
others, do you have an opinion on this matter?
While we wait to hear from others on this, I'm attaching the v9 patch
set implementing the above idea (check 0001 patch). Please have a
look. I'll come back to the other review comments soon.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v9-0001-Track-invalidation_reason-in-pg_replication_slots.patchapplication/x-patch; name=v9-0001-Track-invalidation_reason-in-pg_replication_slots.patchDownload
From 18855c08cd8bcbaf41aba10048f0ea23a246e546 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 14 Mar 2024 12:48:52 +0000
Subject: [PATCH v9 1/4] Track invalidation_reason in pg_replication_slots
Up until now, the reason for replication slot invalidation was not
tracked in pg_replication_slots. A recent commit 007693f2a added
conflict_reason to show the reasons for slot invalidation, but
only for logical slots.
This commit adds a new column to show invalidation reasons for
both physical and logical slots. And, this commit also turns the
conflict_reason text column into a conflicting boolean column
(effectively reverting commit 007693f2a). One can now look at the
new invalidation_reason column for a logical slot's conflict with
recovery.
---
doc/src/sgml/ref/pgupgrade.sgml | 4 +-
doc/src/sgml/system-views.sgml | 63 +++++++++++--------
src/backend/catalog/system_views.sql | 5 +-
src/backend/replication/logical/slotsync.c | 2 +-
src/backend/replication/slot.c | 8 +--
src/backend/replication/slotfuncs.c | 25 +++++---
src/bin/pg_upgrade/info.c | 4 +-
src/include/catalog/pg_proc.dat | 6 +-
src/include/replication/slot.h | 2 +-
.../t/035_standby_logical_decoding.pl | 35 ++++++-----
.../t/040_standby_failover_slots_sync.pl | 4 +-
src/test/regress/expected/rules.out | 7 ++-
12 files changed, 95 insertions(+), 70 deletions(-)
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 58c6c2df8b..8de52bf752 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -453,8 +453,8 @@ make prefix=/usr/local/pgsql.new install
<para>
All slots on the old cluster must be usable, i.e., there are no slots
whose
- <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>conflict_reason</structfield>
- is not <literal>NULL</literal>.
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>conflicting</structfield>
+ is not <literal>true</literal>.
</para>
</listitem>
<listitem>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be90edd0e2..f3fb5ba1b0 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,34 +2525,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>conflict_reason</structfield> <type>text</type>
+ <structfield>conflicting</structfield> <type>bool</type>
</para>
<para>
- The reason for the logical slot's conflict with recovery. It is always
- NULL for physical slots, as well as for logical slots which are not
- invalidated. The non-NULL values indicate that the slot is marked
- as invalidated. Possible values are:
- <itemizedlist spacing="compact">
- <listitem>
- <para>
- <literal>wal_removed</literal> means that the required WAL has been
- removed.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>rows_removed</literal> means that the required rows have
- been removed.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>wal_level_insufficient</literal> means that the
- primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
- perform logical decoding.
- </para>
- </listitem>
- </itemizedlist>
+ True if this logical slot conflicted with recovery (and so is now
+ invalidated). When this column is true, check
+ <structfield>invalidation_reason</structfield> column for the conflict
+ reason.
</para></entry>
</row>
@@ -2581,6 +2560,38 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>invalidation_reason</structfield> <type>text</type>
+ </para>
+ <para>
+ The reason for the slot's invalidation. <literal>NULL</literal> if the
+ slot is currently actively being used. The non-NULL values indicate that
+ the slot is marked as invalidated. Possible values are:
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ <literal>wal_removed</literal> means that the required WAL has been
+ removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>rows_removed</literal> means that the required rows have
+ been removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>wal_level_insufficient</literal> means that the
+ primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
+ perform logical decoding.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 04227a72d1..cd22dad959 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,9 +1023,10 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.conflict_reason,
+ L.conflicting,
L.failover,
- L.synced
+ L.synced,
+ L.invalidation_reason
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 5074c8409f..260632cfdd 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -668,7 +668,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, conflict_reason"
+ " database, invalidation_reason"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 91ca397857..4f1a17f6ce 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -2356,21 +2356,21 @@ RestoreSlotFromDisk(const char *name)
}
/*
- * Maps a conflict reason for a replication slot to
+ * Maps an invalidation reason for a replication slot to
* ReplicationSlotInvalidationCause.
*/
ReplicationSlotInvalidationCause
-GetSlotInvalidationCause(const char *conflict_reason)
+GetSlotInvalidationCause(const char *invalidation_reason)
{
ReplicationSlotInvalidationCause cause;
ReplicationSlotInvalidationCause result = RS_INVAL_NONE;
bool found PG_USED_FOR_ASSERTS_ONLY = false;
- Assert(conflict_reason);
+ Assert(invalidation_reason);
for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
{
- if (strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0)
+ if (strcmp(SlotInvalidationCauses[cause], invalidation_reason) == 0)
{
found = true;
result = cause;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index ad79e1fccd..b5a638edea 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 17
+#define PG_GET_REPLICATION_SLOTS_COLS 18
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -263,6 +263,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
bool nulls[PG_GET_REPLICATION_SLOTS_COLS];
WALAvailability walstate;
int i;
+ ReplicationSlotInvalidationCause cause;
if (!slot->in_use)
continue;
@@ -409,22 +410,32 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
- if (slot_contents.data.database == InvalidOid)
+ cause = slot_contents.data.invalidated;
+
+ if (SlotIsPhysical(&slot_contents))
nulls[i++] = true;
else
{
- ReplicationSlotInvalidationCause cause = slot_contents.data.invalidated;
-
- if (cause == RS_INVAL_NONE)
- nulls[i++] = true;
+ /*
+ * rows_removed and wal_level_insufficient are the only two reasons
+ * for the logical slot's conflict with recovery.
+ */
+ if (cause == RS_INVAL_HORIZON ||
+ cause == RS_INVAL_WAL_LEVEL)
+ values[i++] = BoolGetDatum(true);
else
- values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ values[i++] = BoolGetDatum(false);
}
values[i++] = BoolGetDatum(slot_contents.data.failover);
values[i++] = BoolGetDatum(slot_contents.data.synced);
+ if (cause == RS_INVAL_NONE)
+ nulls[i++] = true;
+ else
+ values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index b5b8d11602..34a157f792 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -676,13 +676,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
"temporary IS FALSE;",
live_check ? "FALSE" :
- "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
+ "(CASE WHEN conflicting THEN FALSE "
"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
"END)");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 4af5c2e847..e5dc1cbdb3 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11120,9 +11120,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 425effad21..7f25a083ee 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -273,7 +273,7 @@ extern void CheckPointReplicationSlots(bool is_shutdown);
extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
- GetSlotInvalidationCause(const char *conflict_reason);
+ GetSlotInvalidationCause(const char *invalidation_reason);
extern bool SlotExistsInStandbySlotNames(const char *slot_name);
extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 88b03048c4..addff6a1a5 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,7 +168,7 @@ sub change_hot_standby_feedback_and_wait_for_xmins
}
}
-# Check conflict_reason in pg_replication_slots.
+# Check reason for conflict in pg_replication_slots.
sub check_slots_conflict_reason
{
my ($slot_prefix, $reason) = @_;
@@ -178,15 +178,15 @@ sub check_slots_conflict_reason
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$active_slot' and conflicting;));
- is($res, "$reason", "$active_slot conflict_reason is $reason");
+ is($res, "$reason", "$active_slot reason for conflict is $reason");
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$inactive_slot' and conflicting;));
- is($res, "$reason", "$inactive_slot conflict_reason is $reason");
+ is($res, "$reason", "$inactive_slot reason for conflict is $reason");
}
# Drop the slots, re-create them, change hot_standby_feedback,
@@ -293,13 +293,13 @@ $node_primary->safe_psql('testdb',
qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
);
-# Check conflict_reason is NULL for physical slot
+# Check conflicting is NULL for physical slot
$res = $node_primary->safe_psql(
'postgres', qq[
- SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+ SELECT conflicting is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
);
-is($res, 't', "Physical slot reports conflict_reason as NULL");
+is($res, 't', "Physical slot reports conflicting as NULL");
my $backup_name = 'b1';
$node_primary->backup($backup_name);
@@ -524,7 +524,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Ensure that replication slot stats are not removed after invalidation.
@@ -551,7 +551,7 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
##################################################
$node_standby->restart;
-# Verify conflict_reason is retained across a restart.
+# Verify reason for conflict is retained across a restart.
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
##################################################
@@ -560,7 +560,8 @@ check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Get the restart_lsn from an invalidated slot
my $restart_lsn = $node_standby->safe_psql('postgres',
- "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
+ "SELECT restart_lsn FROM pg_replication_slots
+ WHERE slot_name = 'vacuum_full_activeslot' AND conflicting;"
);
chomp($restart_lsn);
@@ -611,7 +612,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('row_removal_', 'rows_removed');
$handle =
@@ -647,7 +648,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
check_for_invalidation('shared_row_removal_', $logstart,
'with vacuum on pg_authid');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
$handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
@@ -700,8 +701,8 @@ ok( $node_standby->poll_query_until(
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(conflicting) from
- (select conflict_reason is not NULL as conflicting
- from pg_replication_slots WHERE slot_type = 'logical')]),
+ (select conflicting from pg_replication_slots
+ where slot_type = 'logical')]),
'f',
'Logical slots are reported as non conflicting');
@@ -739,7 +740,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('pruning_', 'rows_removed');
$handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
@@ -783,7 +784,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
-# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+# Verify reason for conflict is 'wal_level_insufficient' in pg_replication_slots
check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
$handle =
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 0ea1f3d323..f47bfd78eb 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -228,7 +228,7 @@ $standby1->safe_psql('postgres', "CHECKPOINT");
# Check if the synced slot is invalidated
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'synchronized slot has been invalidated');
@@ -274,7 +274,7 @@ $standby1->wait_for_log(qr/dropped replication slot "lsub1_slot" of dbid [0-9]+/
# flagged as 'synced'
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'logical slot is re-synced');
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 0cd2c64fca..055bec068d 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,10 +1473,11 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.conflict_reason,
+ l.conflicting,
l.failover,
- l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced)
+ l.synced,
+ l.invalidation_reason
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v9-0002-Add-XID-age-based-replication-slot-invalidation.patch (application/x-patch)
From 4d4248000adfd9096c652bdf0a654ac2203d57a0 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 14 Mar 2024 12:51:48 +0000
Subject: [PATCH v9 2/4] Add XID age based replication slot invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via the
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set an XID age (age of
the slot's xmin or catalog_xmin) of say 1 or 1.5 billion, after
which the slots get invalidated.

To achieve the above, postgres uses the replication slot's xmin (the
oldest transaction that this slot needs the database to retain) or
catalog_xmin (the oldest transaction affecting the system catalogs
that this slot needs the database to retain), and a new GUC
max_slot_xid_age. The checkpointer then looks at all replication
slots and invalidates those whose xmin or catalog_xmin has crossed
the configured age.
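A minimal usage sketch (not part of the patch itself; the threshold and
the follow-up query are only illustrative):

ALTER SYSTEM SET max_slot_xid_age = 1500000000;
SELECT pg_reload_conf();

-- After the next checkpoint, affected slots show up as invalidated.
SELECT slot_name, xmin, catalog_xmin, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason = 'xid_aged';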
---
doc/src/sgml/config.sgml | 21 ++++
src/backend/access/transam/xlog.c | 10 ++
src/backend/replication/slot.c | 44 ++++++-
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 108 ++++++++++++++++++
8 files changed, 197 insertions(+), 1 deletion(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 65a6e6c408..6dd54ffcb7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4544,6 +4544,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age">
+ <term><varname>max_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (which is the default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 20a5f86209..36ae2ac6a4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7147,6 +7147,11 @@ CreateCheckPoint(int flags)
if (PriorRedoPtr != InvalidXLogRecPtr)
UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7597,6 +7602,11 @@ CreateRestartPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 4f1a17f6ce..2a1885da24 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_XID_AGE] = "xid_aged",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int max_slot_xid_age = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -1483,6 +1485,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_XID_AGE:
+ appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1599,6 +1604,42 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
conflict = cause;
break;
+ case RS_INVAL_XID_AGE:
+ {
+ TransactionId xid_cur = ReadNextTransactionId();
+ TransactionId xid_limit;
+ TransactionId xid_slot;
+
+ if (TransactionIdIsNormal(s->data.xmin))
+ {
+ xid_slot = s->data.xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ if (TransactionIdIsNormal(s->data.catalog_xmin))
+ {
+ xid_slot = s->data.catalog_xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1752,6 +1793,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 57d9de4dd9..6b5375909d 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2954,6 +2954,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &max_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 2244ee52f7..b4c928b826 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -334,6 +334,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#max_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7f25a083ee..614ba0e30b 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -227,6 +229,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
+extern PGDLLIMPORT int max_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index c67249500e..d698c3ec73 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -50,6 +50,7 @@ tests += {
't/039_end_of_wal.pl',
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..2f482b56e8
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,108 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Initialize primary node, setting wal-segsize to 1MB
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 1, extra => ['--wal-segsize=1']);
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+});
+$primary->start;
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb1_slot');
+]);
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby1->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+$standby1->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb1_slot';
+]) or die "Timed out waiting for slot xmin to advance";
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop standby to make the replication slot's xmin on primary to age
+$standby1->stop;
+
+my $logstart = -s $primary->logfile;
+
+# Do some work to advance xmin
+$primary->safe_psql(
+ 'postgres', q{
+do $$
+begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into tab_int values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+end$$;
+});
+
+my $invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb1_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb1_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'xid_aged';
+])
+ or die
+ "Timed out while waiting for replication slot sb1_slot to be invalidated";
+
+done_testing();
--
2.34.1
v9-0003-Track-inactive-replication-slot-information.patch (application/x-patch)
From 8349200b0bad1c8cda3e3f96034e7b63e3054d97 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 14 Mar 2024 14:04:00 +0000
Subject: [PATCH v9 3/4] Track inactive replication slot information
Currently postgres doesn't track metrics like the time at which
the slot became inactive, and the total number of times the slot
became inactive in its lifetime. This commit adds two new metrics,
last_inactive_at of type timestamptz and inactive_count of type
numeric, to ReplicationSlotPersistentData. Whenever a slot becomes
inactive, the current timestamp and inactive count are persisted
to disk.

These metrics are useful in the following ways:

- To improve replication slot monitoring tools. For instance, one
can build a monitoring tool that signals a) when a replication slot
has been lying inactive for a day or so using the last_inactive_at
metric, and b) when a replication slot is becoming inactive too
frequently using the inactive_count metric.

- To implement timeout-based inactive replication slot management
capability in postgres.

Increases SLOT_VERSION because of the two new metrics added.
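As a hedged example of the kind of monitoring query these two columns
make possible (the thresholds below are made up for illustration):

SELECT slot_name, last_inactive_at, inactive_count
FROM pg_replication_slots
WHERE last_inactive_at < now() - interval '1 day'
   OR inactive_count > 100;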
---
doc/src/sgml/system-views.sgml | 20 +++++++++++++
src/backend/catalog/system_views.sql | 4 ++-
src/backend/replication/slot.c | 43 ++++++++++++++++++++++------
src/backend/replication/slotfuncs.c | 15 +++++++++-
src/include/catalog/pg_proc.dat | 6 ++--
src/include/replication/slot.h | 6 ++++
src/test/regress/expected/rules.out | 6 ++--
7 files changed, 84 insertions(+), 16 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index f3fb5ba1b0..59cd1b5211 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2750,6 +2750,26 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
ID of role
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_at</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_count</structfield> <type>numeric</type>
+ </para>
+ <para>
+ The total number of times the slot became inactive in its lifetime.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index cd22dad959..de9f1d5506 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1026,7 +1026,9 @@ CREATE VIEW pg_replication_slots AS
L.conflicting,
L.failover,
L.synced,
- L.invalidation_reason
+ L.invalidation_reason,
+ L.last_inactive_at,
+ L.inactive_count
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 2a1885da24..e606218673 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -130,7 +130,7 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 5 /* version for new files */
+#define SLOT_VERSION 6 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -400,6 +400,8 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
slot->data.synced = synced;
+ slot->data.last_inactive_at = 0;
+ slot->data.inactive_count = 0;
/* and then data only present in shared memory */
slot->just_dirtied = false;
@@ -626,6 +628,17 @@ retry:
if (am_walsender)
{
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->data.last_inactive_at = 0;
+ SpinLockRelease(&s->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
SlotIsLogical(s)
? errmsg("acquired logical replication slot \"%s\"",
@@ -693,16 +706,20 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
- MyReplicationSlot = NULL;
-
- /* might not have been set when we've been a plain slot */
- LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
- ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
- LWLockRelease(ProcArrayLock);
-
if (am_walsender)
{
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->data.last_inactive_at = GetCurrentTimestamp();
+ slot->data.inactive_count++;
+ SpinLockRelease(&slot->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
is_logical
? errmsg("released logical replication slot \"%s\"",
@@ -712,6 +729,14 @@ ReplicationSlotRelease(void)
pfree(slotname);
}
+
+ MyReplicationSlot = NULL;
+
+ /* might not have been set when we've been a plain slot */
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
+ ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
+ LWLockRelease(ProcArrayLock);
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index b5a638edea..4c7a120df1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 18
+#define PG_GET_REPLICATION_SLOTS_COLS 20
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -264,6 +264,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
WALAvailability walstate;
int i;
ReplicationSlotInvalidationCause cause;
+ char buf[256];
if (!slot->in_use)
continue;
@@ -436,6 +437,18 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
else
values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ if (slot_contents.data.last_inactive_at > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.data.last_inactive_at);
+ else
+ nulls[i++] = true;
+
+ /* Convert to numeric. */
+ snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count);
+ values[i++] = DirectFunctionCall3(numeric_in,
+ CStringGetDatum(buf),
+ ObjectIdGetDatum(0),
+ Int32GetDatum(-1));
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index e5dc1cbdb3..b26b53b714 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11120,9 +11120,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text,timestamptz,numeric}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason,last_inactive_at,inactive_count}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 614ba0e30b..780767a819 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -129,6 +129,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* When did this slot become inactive last time? */
+ TimestampTz last_inactive_at;
+
+ /* How many times the slot has been inactive? */
+ uint64 inactive_count;
} ReplicationSlotPersistentData;
/*
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 055bec068d..c0bdfe76d8 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1476,8 +1476,10 @@ pg_replication_slots| SELECT l.slot_name,
l.conflicting,
l.failover,
l.synced,
- l.invalidation_reason
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason)
+ l.invalidation_reason,
+ l.last_inactive_at,
+ l.inactive_count
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason, last_inactive_at, inactive_count)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v9-0004-Add-inactive_timeout-based-replication-slot-inval.patch (application/x-patch)
From ff5007e261039bf951a93babdf407faeeed2bdeb Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 14 Mar 2024 14:06:41 +0000
Subject: [PATCH v9 4/4] Add inactive_timeout based replication slot
invalidation
Currently postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via the
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set a timeout of say
1, 2 or 3 days, after which the inactive slots get invalidated.

To achieve the above, postgres uses the replication slot metric
last_inactive_at (the time at which the slot became inactive), and a
new GUC inactive_replication_slot_timeout. The checkpointer then
looks at all replication slots and invalidates those that have been
inactive for longer than the configured timeout.
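A rough usage sketch (the values here are illustrative only):

ALTER SYSTEM SET inactive_replication_slot_timeout = '2d';
SELECT pg_reload_conf();

-- After the next checkpoint, slots idle longer than the timeout are
-- invalidated with reason 'inactive_timeout'.
SELECT slot_name, last_inactive_at, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason = 'inactive_timeout';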
---
doc/src/sgml/config.sgml | 18 +++++
src/backend/access/transam/xlog.c | 10 +++
src/backend/replication/slot.c | 22 +++++-
src/backend/utils/misc/guc_tables.c | 12 +++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 79 +++++++++++++++++++
7 files changed, 144 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 6dd54ffcb7..4b0b60a1ac 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4565,6 +4565,24 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-inactive-replication-slot-timeout" xreflabel="inactive_replication_slot_timeout">
+ <term><varname>inactive_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>inactive_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time at the next checkpoint. If this value is specified
+ without units, it is taken as seconds. A value of zero (which is the
+ default) disables the timeout mechanism. This parameter can only be
+ set in the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 36ae2ac6a4..166c3ed794 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7152,6 +7152,11 @@ CreateCheckPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7607,6 +7612,11 @@ CreateRestartPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index e606218673..37498d3d98 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -108,10 +108,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
[RS_INVAL_XID_AGE] = "xid_aged",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -142,6 +143,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
int max_slot_xid_age = 0;
+int inactive_replication_slot_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -1513,6 +1515,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_XID_AGE:
appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by inactive_replication_slot_timeout."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1665,6 +1670,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
}
}
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (s->data.last_inactive_at > 0)
+ {
+ TimestampTz now;
+
+ Assert(s->data.persistency == RS_PERSISTENT);
+ Assert(s->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(s->data.last_inactive_at, now,
+ inactive_replication_slot_timeout * 1000))
+ conflict = cause;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1819,6 +1838,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 6b5375909d..6caf40d51e 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2964,6 +2964,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"inactive_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &inactive_replication_slot_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index b4c928b826..7f2a3e41f1 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -261,6 +261,7 @@
#recovery_prefetch = try # prefetch pages referenced in the WAL?
#wal_decode_buffer_size = 512kB # lookahead window used for prefetching
# (change requires restart)
+#inactive_replication_slot_timeout = 0 # in seconds; 0 disables
# - Archiving -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 780767a819..8378d7c913 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -55,6 +55,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* slot's xmin or catalog_xmin has reached the age */
RS_INVAL_XID_AGE,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -236,6 +238,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
extern PGDLLIMPORT int max_slot_xid_age;
+extern PGDLLIMPORT int inactive_replication_slot_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index 2f482b56e8..4c66dd4a4e 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -105,4 +105,83 @@ $primary->poll_query_until(
or die
"Timed out while waiting for replication slot sb1_slot to be invalidated";
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb2_slot');
+]);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 0;
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+$standby2->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+});
+$standby2->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+# The inactive replication slot info should be null when the slot is active
+my $result = $primary->safe_psql(
+ 'postgres', qq[
+ SELECT last_inactive_at IS NULL, inactive_count = 0 AS OK
+ FROM pg_replication_slots WHERE slot_name = 'sb2_slot';
+]);
+is($result, "t|t",
+ 'check the inactive replication slot info for an active slot');
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET inactive_replication_slot_timeout TO '1s';
+]);
+$primary->reload;
+
+$logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby2->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_at IS NOT NULL AND
+ inactive_count = 1 AND slot_name = 'sb2_slot';
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+$invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb2_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb2_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb2_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for inactive replication slot sb2_slot to be invalidated";
+
done_testing();
--
2.34.1
On Thu, Mar 14, 2024 at 7:58 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Thu, Mar 14, 2024 at 12:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy
Yes, there will be some sort of duplicity if we emit conflict_reason
as a text field. However, I still think the better way is to turn
conflict_reason text to conflict boolean and set it to true only on
rows_removed and wal_level_insufficient invalidations. When conflict
boolean is true, one (including all the tests that we've added
recently) can look for invalidation_reason text field for the reason.
This sounds reasonable to me as opposed to we just mentioning in the
docs that "if invalidation_reason is rows_removed or
wal_level_insufficient it's the reason for conflict with recovery".
+1 on maintaining both conflicting and invalidation_reason
Fair point. I think we can go either way. Bertrand, Nathan, and
others, do you have an opinion on this matter?

While we wait to hear from others on this, I'm attaching the v9 patch
set implementing the above idea (check 0001 patch). Please have a
look. I'll come back to the other review comments soon.
Thanks for the patch. JFYI, patch09 does not apply to HEAD, some
recent commit caused the conflict.
Some trivial comments on patch001 (yet to review other patches)
1)
info.c:
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
Can we revert back to 'conflicting as invalid' since it is a query for
logical slots only.
2)
040_standby_failover_slots_sync.pl:
- q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM
pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary
FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
Here too, can we have 'NOT conflicting' instead of '
invalidation_reason IS NULL' as it is a logical slot test.
thanks
Shveta
On Wed, Mar 13, 2024 at 9:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
BTW, doesn't the XID-based parameter 'max_slot_xid_age' have some
similarity with 'max_slot_wal_keep_size'? I think it will impact the
rows we remove based on xid horizons. Don't we need to consider it
while vacuum computes the xid horizons in ComputeXidHorizons(),
similar to what we do for WAL w.r.t. 'max_slot_wal_keep_size'?
I'm having a hard time understanding why we'd need something up there
in ComputeXidHorizons(). Can you elaborate a bit, please?

What's proposed with max_slot_xid_age is that during checkpoint we
look at the slot's xmin and catalog_xmin, and the current system txn
id. Then, if the XID age between (xmin, catalog_xmin) and the current
xid crosses max_slot_xid_age, we invalidate the slot. Let me
illustrate how all this works:
1. Set up a primary and a standby with hot_standby_feedback set to on
on the standby. For instance, check my scripts at [1].
2. Stop the standby to make the slot inactive on the primary. Check
the slot is holding xmin of 738.
./pg_ctl -D sbdata -l logfilesbdata stop
postgres=# SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]-------+-------------
slot_name | sb_repl_slot
plugin |
slot_type | physical
datoid |
database |
temporary | f
active | f
active_pid |
xmin | 738
catalog_xmin |
restart_lsn | 0/3000000
confirmed_flush_lsn |
wal_status | reserved
safe_wal_size |
two_phase | f
conflict_reason |
failover | f
synced | f
3. Start consuming the XIDs on the primary with the following script
for instance
./psql -d postgres -p 5432
DROP TABLE tab_int;
CREATE TABLE tab_int (a int);
do $$
begin
for i in 1..268435 loop
-- use an exception block so that each iteration eats an XID
begin
insert into tab_int values (i);
exception
when division_by_zero then null;
end;
end loop;
end$$;
4. Make some dead rows in the table.
update tab_int set a = a+1;
delete from tab_int where a%4=0;
postgres=# SELECT n_dead_tup, n_tup_ins, n_tup_upd, n_tup_del FROM
pg_stat_user_tables WHERE relname = 'tab_int';
-[ RECORD 1 ]------
n_dead_tup | 335544
n_tup_ins | 268435
n_tup_upd | 268435
n_tup_del | 67109
5. Try vacuuming to delete the dead rows, observe 'tuples: 0 removed,
536870 remain, 335544 are dead but not yet removable'. The dead rows
can't be removed because the inactive slot is holding an xmin, see
'removable cutoff: 738, which was 268441 XIDs old when operation
ended'.
postgres=# vacuum verbose tab_int;
INFO: vacuuming "postgres.public.tab_int"
INFO: finished vacuuming "postgres.public.tab_int": index scans: 0
pages: 0 removed, 2376 remain, 2376 scanned (100.00% of total)
tuples: 0 removed, 536870 remain, 335544 are dead but not yet removable
removable cutoff: 738, which was 268441 XIDs old when operation ended
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead
item identifiers removed
avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
buffer usage: 4759 hits, 0 misses, 0 dirtied
WAL usage: 0 records, 0 full page images, 0 bytes
system usage: CPU: user: 0.07 s, system: 0.00 s, elapsed: 0.07 s
VACUUM
6. Now, repeat the above steps but with setting max_slot_xid_age =
200000 on the primary.
7. Do a checkpoint to invalidate the slot.
postgres=# checkpoint;
CHECKPOINT
postgres=# SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]-------+-------------
slot_name | sb_repl_slot
plugin |
slot_type | physical
datoid |
database |
temporary | f
active | f
active_pid |
xmin | 738
catalog_xmin |
restart_lsn | 0/3000000
confirmed_flush_lsn |
wal_status | lost
safe_wal_size |
two_phase | f
conflicting |
failover | f
synced | f
invalidation_reason | xid_aged
8. And, then vacuum the table, observe 'tuples: 335544 removed, 201326
remain, 0 are dead but not yet removable'.
postgres=# vacuum verbose tab_int;
INFO: vacuuming "postgres.public.tab_int"
INFO: finished vacuuming "postgres.public.tab_int": index scans: 0
pages: 0 removed, 2376 remain, 2376 scanned (100.00% of total)
tuples: 335544 removed, 201326 remain, 0 are dead but not yet removable
removable cutoff: 269179, which was 0 XIDs old when operation ended
new relfrozenxid: 269179, which is 268441 XIDs ahead of previous value
frozen: 1189 pages from table (50.04% of total) had 201326 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead
item identifiers removed
avg read rate: 0.000 MB/s, avg write rate: 193.100 MB/s
buffer usage: 4760 hits, 0 misses, 2381 dirtied
WAL usage: 5942 records, 2378 full page images, 8343275 bytes
system usage: CPU: user: 0.09 s, system: 0.00 s, elapsed: 0.09 s
VACUUM
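As an aside, a quick way to eyeball how close a slot's horizons are to
the configured limit is the stock age() function (this query is just my
own check, not something the patch adds):

SELECT slot_name,
       age(xmin) AS xmin_age,
       age(catalog_xmin) AS catalog_xmin_age
FROM pg_replication_slots;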
[1]:
cd /home/ubuntu/postgres/pg17/bin
./pg_ctl -D db17 -l logfile17 stop
rm -rf db17 logfile17
rm -rf /home/ubuntu/postgres/pg17/bin/archived_wal
mkdir /home/ubuntu/postgres/pg17/bin/archived_wal
./initdb -D db17
echo "archive_mode = on
archive_command='cp %p
/home/ubuntu/postgres/pg17/bin/archived_wal/%f'" | tee -a
db17/postgresql.conf
./pg_ctl -D db17 -l logfile17 start
./psql -d postgres -p 5432 -c "SELECT
pg_create_physical_replication_slot('sb_repl_slot', true, false);"
rm -rf sbdata logfilesbdata
./pg_basebackup -D sbdata
echo "port=5433
primary_conninfo='host=localhost port=5432 dbname=postgres user=ubuntu'
primary_slot_name='sb_repl_slot'
restore_command='cp /home/ubuntu/postgres/pg17/bin/archived_wal/%f %p'
hot_standby_feedback = on" | tee -a sbdata/postgresql.conf
touch sbdata/standby.signal
./pg_ctl -D sbdata -l logfilesbdata start
./psql -d postgres -p 5433 -c "SELECT pg_is_in_recovery();"
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Mar 14, 2024 at 7:58 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
While we wait to hear from others on this, I'm attaching the v9 patch
set implementing the above idea (check 0001 patch). Please have a
look. I'll come back to the other review comments soon.
patch002:
1)
I would like to understand the purpose of 'inactive_count'? Is it only
for users for monitoring purposes? We are not using it anywhere
internally.
I shut down the instance 5 times and found that 'inactive_count' became
5 for all the slots created on that instance. Is this intentional? I
mean we cannot really use the slots while the instance is down. I felt
it should increment inactive_count only if, during the lifetime of the
instance, the slots were actually inactive, i.e. no streaming or
replication was happening through them.
2)
slot.c:
+ case RS_INVAL_XID_AGE:
+ {
+ if (TransactionIdIsNormal(s->data.xmin))
+ {
+ ..........
+ }
+ if (TransactionIdIsNormal(s->data.catalog_xmin))
+ {
+ ..........
+ }
+ }
Can we optimize this code? It has duplicate code for processing
s->data.catalog_xmin and s->data.xmin. Can we create a sub-function
for this purpose and call it twice here?
thanks
Shveta
On Fri, Mar 15, 2024 at 10:15 AM shveta malik <shveta.malik@gmail.com> wrote:
wal_level_insufficient it's the reason for conflict with recovery".
+1 on maintaining both conflicting and invalidation_reason
Thanks.
Thanks for the patch. JFYI, patch09 does not apply to HEAD, some
recent commit caused the conflict.
Yep, the conflict is in src/test/recovery/meson.build and is because
of e6927270cd18d535b77cbe79c55c6584351524be.
Some trivial comments on patch001 (yet to review other patches)
Thanks for looking into this.
1)
info.c:
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
Can we revert back to 'conflicting as invalid' since it is a query for
logical slots only.
I guess, no. There the intention is to check for invalid logical slots
not just for the conflicting ones. The logical slots can get
invalidated due to other reasons as well.
2)
040_standby_failover_slots_sync.pl:
- q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
Here too, can we have 'NOT conflicting' instead of
'invalidation_reason IS NULL' as it is a logical slot test.
I guess no. The tests are ensuring the slot on the standby isn't invalidated.
In general, one needs to use the 'conflicting' column from
pg_replication_slots when the intention is to look for reasons for
conflicts, otherwise use the 'invalidation_reason' column for
invalidations.
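To make the distinction concrete, here are two illustrative queries (not
taken from the patch set, just examples of the intended usage):

-- logical slots invalidated because of a conflict with recovery
SELECT slot_name, invalidation_reason
FROM pg_replication_slots
WHERE conflicting;

-- any slot (physical or logical) that has been invalidated, whatever the reason
SELECT slot_name, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason IS NOT NULL;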
Please see the attached v10 patch set after resolving the merge
conflict and fixing an indentation warning in the TAP test file.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v10-0001-Track-invalidation_reason-in-pg_replication_slot.patch (application/x-patch)
From 41290be4eb1562cf10313e3eda19fcbbf392088f Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 15 Mar 2024 05:47:44 +0000
Subject: [PATCH v10 1/4] Track invalidation_reason in pg_replication_slots
Up until now, the reason for replication slot invalidation has not
been tracked in pg_replication_slots. A recent commit 007693f2a added
conflict_reason to show the reasons for slot invalidation, but
only for logical slots.

This commit adds a new column to show invalidation reasons for
both physical and logical slots. It also turns the conflict_reason
text column into a conflicting boolean column (effectively reverting
commit 007693f2a). One can now look at the new invalidation_reason
column for the reason for a logical slot's conflict with recovery.
---
doc/src/sgml/ref/pgupgrade.sgml | 4 +-
doc/src/sgml/system-views.sgml | 63 +++++++++++--------
src/backend/catalog/system_views.sql | 5 +-
src/backend/replication/logical/slotsync.c | 2 +-
src/backend/replication/slot.c | 8 +--
src/backend/replication/slotfuncs.c | 25 +++++---
src/bin/pg_upgrade/info.c | 4 +-
src/include/catalog/pg_proc.dat | 6 +-
src/include/replication/slot.h | 2 +-
.../t/035_standby_logical_decoding.pl | 35 ++++++-----
.../t/040_standby_failover_slots_sync.pl | 4 +-
src/test/regress/expected/rules.out | 7 ++-
12 files changed, 95 insertions(+), 70 deletions(-)
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 58c6c2df8b..8de52bf752 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -453,8 +453,8 @@ make prefix=/usr/local/pgsql.new install
<para>
All slots on the old cluster must be usable, i.e., there are no slots
whose
- <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>conflict_reason</structfield>
- is not <literal>NULL</literal>.
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>conflicting</structfield>
+ is not <literal>true</literal>.
</para>
</listitem>
<listitem>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be90edd0e2..f3fb5ba1b0 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,34 +2525,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>conflict_reason</structfield> <type>text</type>
+ <structfield>conflicting</structfield> <type>bool</type>
</para>
<para>
- The reason for the logical slot's conflict with recovery. It is always
- NULL for physical slots, as well as for logical slots which are not
- invalidated. The non-NULL values indicate that the slot is marked
- as invalidated. Possible values are:
- <itemizedlist spacing="compact">
- <listitem>
- <para>
- <literal>wal_removed</literal> means that the required WAL has been
- removed.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>rows_removed</literal> means that the required rows have
- been removed.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>wal_level_insufficient</literal> means that the
- primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
- perform logical decoding.
- </para>
- </listitem>
- </itemizedlist>
+ True if this logical slot conflicted with recovery (and so is now
+ invalidated). When this column is true, check
+ <structfield>invalidation_reason</structfield> column for the conflict
+ reason.
</para></entry>
</row>
@@ -2581,6 +2560,38 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>invalidation_reason</structfield> <type>text</type>
+ </para>
+ <para>
+ The reason for the slot's invalidation. <literal>NULL</literal> if the
+ slot is currently actively being used. The non-NULL values indicate that
+ the slot is marked as invalidated. Possible values are:
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ <literal>wal_removed</literal> means that the required WAL has been
+ removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>rows_removed</literal> means that the required rows have
+ been removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>wal_level_insufficient</literal> means that the
+ primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
+ perform logical decoding.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 04227a72d1..cd22dad959 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,9 +1023,10 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.conflict_reason,
+ L.conflicting,
L.failover,
- L.synced
+ L.synced,
+ L.invalidation_reason
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 5074c8409f..260632cfdd 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -668,7 +668,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, conflict_reason"
+ " database, invalidation_reason"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 91ca397857..4f1a17f6ce 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -2356,21 +2356,21 @@ RestoreSlotFromDisk(const char *name)
}
/*
- * Maps a conflict reason for a replication slot to
+ * Maps a invalidation reason for a replication slot to
* ReplicationSlotInvalidationCause.
*/
ReplicationSlotInvalidationCause
-GetSlotInvalidationCause(const char *conflict_reason)
+GetSlotInvalidationCause(const char *invalidation_reason)
{
ReplicationSlotInvalidationCause cause;
ReplicationSlotInvalidationCause result = RS_INVAL_NONE;
bool found PG_USED_FOR_ASSERTS_ONLY = false;
- Assert(conflict_reason);
+ Assert(invalidation_reason);
for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
{
- if (strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0)
+ if (strcmp(SlotInvalidationCauses[cause], invalidation_reason) == 0)
{
found = true;
result = cause;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index ad79e1fccd..b5a638edea 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 17
+#define PG_GET_REPLICATION_SLOTS_COLS 18
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -263,6 +263,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
bool nulls[PG_GET_REPLICATION_SLOTS_COLS];
WALAvailability walstate;
int i;
+ ReplicationSlotInvalidationCause cause;
if (!slot->in_use)
continue;
@@ -409,22 +410,32 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
- if (slot_contents.data.database == InvalidOid)
+ cause = slot_contents.data.invalidated;
+
+ if (SlotIsPhysical(&slot_contents))
nulls[i++] = true;
else
{
- ReplicationSlotInvalidationCause cause = slot_contents.data.invalidated;
-
- if (cause == RS_INVAL_NONE)
- nulls[i++] = true;
+ /*
+ * rows_removed and wal_level_insufficient are only two reasons
+ * for the logical slot's conflict with recovery.
+ */
+ if (cause == RS_INVAL_HORIZON ||
+ cause == RS_INVAL_WAL_LEVEL)
+ values[i++] = BoolGetDatum(true);
else
- values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ values[i++] = BoolGetDatum(false);
}
values[i++] = BoolGetDatum(slot_contents.data.failover);
values[i++] = BoolGetDatum(slot_contents.data.synced);
+ if (cause == RS_INVAL_NONE)
+ nulls[i++] = true;
+ else
+ values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index b5b8d11602..34a157f792 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -676,13 +676,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
"temporary IS FALSE;",
live_check ? "FALSE" :
- "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
+ "(CASE WHEN conflicting THEN FALSE "
"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
"END)");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 700f7daf7b..63fd0b4cd7 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11123,9 +11123,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 425effad21..7f25a083ee 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -273,7 +273,7 @@ extern void CheckPointReplicationSlots(bool is_shutdown);
extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
- GetSlotInvalidationCause(const char *conflict_reason);
+ GetSlotInvalidationCause(const char *invalidation_reason);
extern bool SlotExistsInStandbySlotNames(const char *slot_name);
extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 88b03048c4..8d6740c734 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,7 +168,7 @@ sub change_hot_standby_feedback_and_wait_for_xmins
}
}
-# Check conflict_reason in pg_replication_slots.
+# Check reason for conflict in pg_replication_slots.
sub check_slots_conflict_reason
{
my ($slot_prefix, $reason) = @_;
@@ -178,15 +178,15 @@ sub check_slots_conflict_reason
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$active_slot' and conflicting;));
- is($res, "$reason", "$active_slot conflict_reason is $reason");
+ is($res, "$reason", "$active_slot reason for conflict is $reason");
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$inactive_slot' and conflicting;));
- is($res, "$reason", "$inactive_slot conflict_reason is $reason");
+ is($res, "$reason", "$inactive_slot reason for conflict is $reason");
}
# Drop the slots, re-create them, change hot_standby_feedback,
@@ -293,13 +293,13 @@ $node_primary->safe_psql('testdb',
qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
);
-# Check conflict_reason is NULL for physical slot
+# Check conflicting is NULL for physical slot
$res = $node_primary->safe_psql(
'postgres', qq[
- SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+ SELECT conflicting is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
);
-is($res, 't', "Physical slot reports conflict_reason as NULL");
+is($res, 't', "Physical slot reports conflicting as NULL");
my $backup_name = 'b1';
$node_primary->backup($backup_name);
@@ -524,7 +524,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Ensure that replication slot stats are not removed after invalidation.
@@ -551,7 +551,7 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
##################################################
$node_standby->restart;
-# Verify conflict_reason is retained across a restart.
+# Verify reason for conflict is retained across a restart.
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
##################################################
@@ -560,7 +560,8 @@ check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Get the restart_lsn from an invalidated slot
my $restart_lsn = $node_standby->safe_psql('postgres',
- "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
+ "SELECT restart_lsn FROM pg_replication_slots
+ WHERE slot_name = 'vacuum_full_activeslot' AND conflicting;"
);
chomp($restart_lsn);
@@ -611,7 +612,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('row_removal_', 'rows_removed');
$handle =
@@ -647,7 +648,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
check_for_invalidation('shared_row_removal_', $logstart,
'with vacuum on pg_authid');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
$handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
@@ -700,8 +701,8 @@ ok( $node_standby->poll_query_until(
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(conflicting) from
- (select conflict_reason is not NULL as conflicting
- from pg_replication_slots WHERE slot_type = 'logical')]),
+ (select conflicting from pg_replication_slots
+ where slot_type = 'logical')]),
'f',
'Logical slots are reported as non conflicting');
@@ -739,7 +740,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('pruning_', 'rows_removed');
$handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
@@ -783,7 +784,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
-# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+# Verify reason for conflict is 'wal_level_insufficient' in pg_replication_slots
check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
$handle =
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 0ea1f3d323..f47bfd78eb 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -228,7 +228,7 @@ $standby1->safe_psql('postgres', "CHECKPOINT");
# Check if the synced slot is invalidated
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'synchronized slot has been invalidated');
@@ -274,7 +274,7 @@ $standby1->wait_for_log(qr/dropped replication slot "lsub1_slot" of dbid [0-9]+/
# flagged as 'synced'
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'logical slot is re-synced');
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 0cd2c64fca..055bec068d 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,10 +1473,11 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.conflict_reason,
+ l.conflicting,
l.failover,
- l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced)
+ l.synced,
+ l.invalidation_reason
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v10-0002-Add-XID-age-based-replication-slot-invalidation.patchapplication/x-patch; name=v10-0002-Add-XID-age-based-replication-slot-invalidation.patchDownload
From 98b48e257847299b676e90af25c26f9d4150669a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 15 Mar 2024 05:48:01 +0000
Subject: [PATCH v10 2/4] Add XID age based replication slot invalidation
Up until now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set an XID age (age of
slot's xmin or catalog_xmin) of say 1 or 1.5 billion, after which
the slots get invalidated.
To achieve the above, postgres uses replication slot xmin (the
oldest transaction that this slot needs the database to retain) or
catalog_xmin (the oldest transaction affecting the system catalogs
that this slot needs the database to retain), and a new GUC
max_slot_xid_age. The checkpointer then looks at all replication
slots, invalidating them based on the configured age.
---
doc/src/sgml/config.sgml | 21 ++++
src/backend/access/transam/xlog.c | 10 ++
src/backend/replication/slot.c | 44 ++++++-
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 108 ++++++++++++++++++
8 files changed, 197 insertions(+), 1 deletion(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 65a6e6c408..6dd54ffcb7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4544,6 +4544,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age">
+ <term><varname>max_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (which is default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 20a5f86209..36ae2ac6a4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7147,6 +7147,11 @@ CreateCheckPoint(int flags)
if (PriorRedoPtr != InvalidXLogRecPtr)
UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7597,6 +7602,11 @@ CreateRestartPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 4f1a17f6ce..2a1885da24 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_XID_AGE] = "xid_aged",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int max_slot_xid_age = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -1483,6 +1485,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_XID_AGE:
+ appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1599,6 +1604,42 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
conflict = cause;
break;
+ case RS_INVAL_XID_AGE:
+ {
+ TransactionId xid_cur = ReadNextTransactionId();
+ TransactionId xid_limit;
+ TransactionId xid_slot;
+
+ if (TransactionIdIsNormal(s->data.xmin))
+ {
+ xid_slot = s->data.xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ if (TransactionIdIsNormal(s->data.catalog_xmin))
+ {
+ xid_slot = s->data.catalog_xmin;
+
+ xid_limit = xid_slot + max_slot_xid_age;
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1752,6 +1793,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 57d9de4dd9..6b5375909d 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2954,6 +2954,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &max_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 2244ee52f7..b4c928b826 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -334,6 +334,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#max_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7f25a083ee..614ba0e30b 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -227,6 +229,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
+extern PGDLLIMPORT int max_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..2f482b56e8
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,108 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Initialize primary node, setting wal-segsize to 1MB
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 1, extra => ['--wal-segsize=1']);
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+});
+$primary->start;
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb1_slot');
+]);
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby1->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+$standby1->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb1_slot';
+]) or die "Timed out waiting for slot xmin to advance";
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop standby to make the replication slot's xmin on primary to age
+$standby1->stop;
+
+my $logstart = -s $primary->logfile;
+
+# Do some work to advance xmin
+$primary->safe_psql(
+ 'postgres', q{
+do $$
+begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into tab_int values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+end$$;
+});
+
+my $invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb1_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb1_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'xid_aged';
+])
+ or die
+ "Timed out while waiting for replication slot sb1_slot to be invalidated";
+
+done_testing();
--
2.34.1
v10-0003-Track-inactive-replication-slot-information.patchapplication/x-patch; name=v10-0003-Track-inactive-replication-slot-information.patchDownload
From 488702c43d1b6fbb2d7dc56eb5e1409484d8b25f Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 15 Mar 2024 05:48:33 +0000
Subject: [PATCH v10 3/4] Track inactive replication slot information
Up until now, postgres hasn't tracked metrics like the time at which
a slot became inactive, or the total number of times a slot
became inactive in its lifetime. This commit adds two new metrics
last_inactive_at of type timestamptz and inactive_count of type numeric
to ReplicationSlotPersistentData. Whenever a slot becomes
inactive, the current timestamp and inactive count are persisted
to disk.
These metrics are useful in the following ways:
- To improve replication slot monitoring tools. For instance, one
can build a monitoring tool that signals a) when replication slots
is lying inactive for a day or so using last_inactive_at metric,
b) when a replication slot is becoming inactive too frequently
using last_inactive_at metric.
- To implement timeout-based inactive replication slot management
capability in postgres.
This increases SLOT_VERSION due to the two newly added metrics.
---
doc/src/sgml/system-views.sgml | 20 +++++++++++++
src/backend/catalog/system_views.sql | 4 ++-
src/backend/replication/slot.c | 43 ++++++++++++++++++++++------
src/backend/replication/slotfuncs.c | 15 +++++++++-
src/include/catalog/pg_proc.dat | 6 ++--
src/include/replication/slot.h | 6 ++++
src/test/regress/expected/rules.out | 6 ++--
7 files changed, 84 insertions(+), 16 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index f3fb5ba1b0..59cd1b5211 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2750,6 +2750,26 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
ID of role
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_at</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_count</structfield> <type>numeric</type>
+ </para>
+ <para>
+ The total number of times the slot became inactive in its lifetime.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index cd22dad959..de9f1d5506 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1026,7 +1026,9 @@ CREATE VIEW pg_replication_slots AS
L.conflicting,
L.failover,
L.synced,
- L.invalidation_reason
+ L.invalidation_reason,
+ L.last_inactive_at,
+ L.inactive_count
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 2a1885da24..e606218673 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -130,7 +130,7 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 5 /* version for new files */
+#define SLOT_VERSION 6 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -400,6 +400,8 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
slot->data.synced = synced;
+ slot->data.last_inactive_at = 0;
+ slot->data.inactive_count = 0;
/* and then data only present in shared memory */
slot->just_dirtied = false;
@@ -626,6 +628,17 @@ retry:
if (am_walsender)
{
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->data.last_inactive_at = 0;
+ SpinLockRelease(&s->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
SlotIsLogical(s)
? errmsg("acquired logical replication slot \"%s\"",
@@ -693,16 +706,20 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
- MyReplicationSlot = NULL;
-
- /* might not have been set when we've been a plain slot */
- LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
- ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
- LWLockRelease(ProcArrayLock);
-
if (am_walsender)
{
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->data.last_inactive_at = GetCurrentTimestamp();
+ slot->data.inactive_count++;
+ SpinLockRelease(&slot->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
is_logical
? errmsg("released logical replication slot \"%s\"",
@@ -712,6 +729,14 @@ ReplicationSlotRelease(void)
pfree(slotname);
}
+
+ MyReplicationSlot = NULL;
+
+ /* might not have been set when we've been a plain slot */
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
+ ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
+ LWLockRelease(ProcArrayLock);
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index b5a638edea..4c7a120df1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 18
+#define PG_GET_REPLICATION_SLOTS_COLS 20
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -264,6 +264,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
WALAvailability walstate;
int i;
ReplicationSlotInvalidationCause cause;
+ char buf[256];
if (!slot->in_use)
continue;
@@ -436,6 +437,18 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
else
values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ if (slot_contents.data.last_inactive_at > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.data.last_inactive_at);
+ else
+ nulls[i++] = true;
+
+ /* Convert to numeric. */
+ snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count);
+ values[i++] = DirectFunctionCall3(numeric_in,
+ CStringGetDatum(buf),
+ ObjectIdGetDatum(0),
+ Int32GetDatum(-1));
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 63fd0b4cd7..c7ab0893eb 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11123,9 +11123,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text,timestamptz,numeric}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason,last_inactive_at,inactive_count}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 614ba0e30b..780767a819 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -129,6 +129,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* When did this slot become inactive last time? */
+ TimestampTz last_inactive_at;
+
+ /* How many times the slot has been inactive? */
+ uint64 inactive_count;
} ReplicationSlotPersistentData;
/*
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 055bec068d..c0bdfe76d8 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1476,8 +1476,10 @@ pg_replication_slots| SELECT l.slot_name,
l.conflicting,
l.failover,
l.synced,
- l.invalidation_reason
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason)
+ l.invalidation_reason,
+ l.last_inactive_at,
+ l.inactive_count
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason, last_inactive_at, inactive_count)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v10-0004-Add-inactive_timeout-based-replication-slot-inva.patchapplication/x-patch; name=v10-0004-Add-inactive_timeout-based-replication-slot-inva.patchDownload
From 559c02f760d998b0b78f68fbbd60fa5d3ec960d9 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 15 Mar 2024 05:49:02 +0000
Subject: [PATCH v10 4/4] Add inactive_timeout based replication slot
invalidation
Up until now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set a timeout of say 1
or 2 or 3 days, after which the inactive slots get invalidated.
To achieve the above, postgres uses the replication slot metric
last_inactive_at (the time at which the slot became inactive), and a
new GUC inactive_replication_slot_timeout. The checkpointer then
looks at all replication slots, invalidating the inactive slots
based on the configured timeout.
---
doc/src/sgml/config.sgml | 18 +++++
src/backend/access/transam/xlog.c | 10 +++
src/backend/replication/slot.c | 22 +++++-
src/backend/utils/misc/guc_tables.c | 12 +++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 79 +++++++++++++++++++
7 files changed, 144 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 6dd54ffcb7..4b0b60a1ac 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4565,6 +4565,24 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-inactive-replication-slot-timeout" xreflabel="inactive_replication_slot_timeout">
+ <term><varname>inactive_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>inactive_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time at the next checkpoint. If this value is specified
+ without units, it is taken as seconds. A value of zero (which is
+ default) disables the timeout mechanism. This parameter can only be
+ set in the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 36ae2ac6a4..166c3ed794 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7152,6 +7152,11 @@ CreateCheckPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7607,6 +7612,11 @@ CreateRestartPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index e606218673..37498d3d98 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -108,10 +108,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
[RS_INVAL_XID_AGE] = "xid_aged",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -142,6 +143,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
int max_slot_xid_age = 0;
+int inactive_replication_slot_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -1513,6 +1515,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_XID_AGE:
appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by inactive_replication_slot_timeout."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1665,6 +1670,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
}
}
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (s->data.last_inactive_at > 0)
+ {
+ TimestampTz now;
+
+ Assert(s->data.persistency == RS_PERSISTENT);
+ Assert(s->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(s->data.last_inactive_at, now,
+ inactive_replication_slot_timeout * 1000))
+ conflict = cause;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1819,6 +1838,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 6b5375909d..6caf40d51e 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2964,6 +2964,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"inactive_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &inactive_replication_slot_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index b4c928b826..7f2a3e41f1 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -261,6 +261,7 @@
#recovery_prefetch = try # prefetch pages referenced in the WAL?
#wal_decode_buffer_size = 512kB # lookahead window used for prefetching
# (change requires restart)
+#inactive_replication_slot_timeout = 0 # in seconds; 0 disables
# - Archiving -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 780767a819..8378d7c913 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -55,6 +55,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* slot's xmin or catalog_xmin has reached the age */
RS_INVAL_XID_AGE,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -236,6 +238,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
extern PGDLLIMPORT int max_slot_xid_age;
+extern PGDLLIMPORT int inactive_replication_slot_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index 2f482b56e8..4c66dd4a4e 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -105,4 +105,83 @@ $primary->poll_query_until(
or die
"Timed out while waiting for replication slot sb1_slot to be invalidated";
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb2_slot');
+]);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 0;
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+$standby2->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+});
+$standby2->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+# The inactive replication slot info should be null when the slot is active
+my $result = $primary->safe_psql(
+ 'postgres', qq[
+ SELECT last_inactive_at IS NULL, inactive_count = 0 AS OK
+ FROM pg_replication_slots WHERE slot_name = 'sb2_slot';
+]);
+is($result, "t|t",
+ 'check the inactive replication slot info for an active slot');
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET inactive_replication_slot_timeout TO '1s';
+]);
+$primary->reload;
+
+$logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby2->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_at IS NOT NULL AND
+ inactive_count = 1 AND slot_name = 'sb2_slot';
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+$invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb2_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb2_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb2_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for inactive replication slot sb2_slot to be invalidated";
+
done_testing();
--
2.34.1
On Thu, Mar 14, 2024 at 12:24:00PM +0530, Amit Kapila wrote:
On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Mar 13, 2024 at 9:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
So, how about we turn conflict_reason to only report the reasons that
actually cause conflict with recovery for logical slots, something
like below, and then have invalidation_cause as a generic column for
all sorts of invalidation reasons for both logical and physical slots?
If our above understanding is correct then conflict_reason will be a
subset of invalidation_reason. If so, whatever way we arrange this
information, there will be some sort of duplicity unless we just have
one column 'invalidation_reason' and update the docs to interpret it
correctly for conflicts.
Yes, there will be some sort of duplicity if we emit conflict_reason
as a text field. However, I still think the better way is to turn
conflict_reason text to conflict boolean and set it to true only on
rows_removed and wal_level_insufficient invalidations. When conflict
boolean is true, one (including all the tests that we've added
recently) can look for invalidation_reason text field for the reason.
This sounds reasonable to me as opposed to we just mentioning in the
docs that "if invalidation_reason is rows_removed or
wal_level_insufficient it's the reason for conflict with recovery".
Fair point. I think we can go either way. Bertrand, Nathan, and
others, do you have an opinion on this matter?
WFM
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Hi,
On Thu, Mar 14, 2024 at 12:24:00PM +0530, Amit Kapila wrote:
On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Mar 13, 2024 at 9:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
So, how about we turn conflict_reason to only report the reasons that
actually cause conflict with recovery for logical slots, something
like below, and then have invalidation_cause as a generic column for
all sorts of invalidation reasons for both logical and physical slots?
If our above understanding is correct then conflict_reason will be a
subset of invalidation_reason. If so, whatever way we arrange this
information, there will be some sort of duplicity unless we just have
one column 'invalidation_reason' and update the docs to interpret it
correctly for conflicts.
Yes, there will be some sort of duplicity if we emit conflict_reason
as a text field. However, I still think the better way is to turn
conflict_reason text to conflict boolean and set it to true only on
rows_removed and wal_level_insufficient invalidations. When conflict
boolean is true, one (including all the tests that we've added
recently) can look for invalidation_reason text field for the reason.
This sounds reasonable to me as opposed to we just mentioning in the
docs that "if invalidation_reason is rows_removed or
wal_level_insufficient it's the reason for conflict with recovery".
Fair point. I think we can go either way. Bertrand, Nathan, and
others, do you have an opinion on this matter?
Sounds like a good approach to me, and one will be able to quickly identify
if a conflict occurred.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Mar 15, 2024 at 12:49 PM shveta malik <shveta.malik@gmail.com> wrote:
patch002:
1)
I would like to understand the purpose of 'inactive_count'? Is it only
for users for monitoring purposes? We are not using it anywhere
internally.
The inactive_count metric helps detect unstable replication slot
connections that have a lot of disconnections. It's not used for the
inactive_timeout based slot invalidation mechanism.
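As an illustration, a monitoring query along these lines (only a
sketch; the thresholds are arbitrary) could flag such slots:

    SELECT slot_name, last_inactive_at, inactive_count
    FROM pg_replication_slots
    WHERE inactive_count > 100                         -- many disconnections
       OR last_inactive_at < now() - interval '1 day'; -- inactive for a day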
I shut down the instance 5 times and found that 'inactive_count' became
5 for all the slots created on that instance. Is this intentional?
Yes, it's incremented on shutdown (and for that matter upon every slot
release) for all the slots that are tied to walsenders.
I mean we cannot really use them if the instance is down. I felt it
should increment inactive_count only if, during the lifetime of the
instance, they were actually inactive, i.e., no streaming or replication
was happening through them.
inactive_count is persisted to disk upon clean shutdown, so once the
slots become active again, one gets to see the metric and deduce some
information about disconnections.
Having said that, I'm okay to hear from others on the inactive_count
metric being added.
2)
slot.c:
+ case RS_INVAL_XID_AGE:
Can we optimize this code? It has duplicate code for processing
s->data.catalog_xmin and s->data.xmin. Can we create a sub-function
for this purpose and call it twice here?
Good idea. Done that way.
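For the archives, here's a minimal sketch of what such a helper could
look like (the helper name is made up here and the committed shape may
differ), based on the duplicated blocks in the v10 patch:

    /*
     * Return true if the given slot xid has crossed max_slot_xid_age,
     * using the same wraparound handling as the per-xid blocks above.
     */
    static bool
    SlotXidHasReachedAge(TransactionId xid_slot, TransactionId xid_cur)
    {
        TransactionId xid_limit;

        if (!TransactionIdIsNormal(xid_slot))
            return false;

        xid_limit = xid_slot + max_slot_xid_age;
        if (xid_limit < FirstNormalTransactionId)
            xid_limit += FirstNormalTransactionId;

        return TransactionIdFollowsOrEquals(xid_cur, xid_limit);
    }

The RS_INVAL_XID_AGE case in InvalidatePossiblyObsoleteSlot() then
reduces to calling it twice:

    case RS_INVAL_XID_AGE:
        {
            TransactionId xid_cur = ReadNextTransactionId();

            if (SlotXidHasReachedAge(s->data.xmin, xid_cur) ||
                SlotXidHasReachedAge(s->data.catalog_xmin, xid_cur))
                conflict = cause;
        }
        break;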
2)
The msg for patch 3 says:
--------------
a) when replication slots is lying inactive for a day or so using
last_inactive_at metric,
b) when a replication slot is becoming inactive too frequently using
last_inactive_at metric.
--------------
I think in b, you want to refer to inactive_count instead of last_inactive_at?
Right. Changed.
3)
I do not see invalidation_reason updated for 2 new reasons in system-views.sgml
Nice catch. Added them now.
I've also responded to Bertrand's comments here.
On Wed, Mar 6, 2024 at 3:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
A few comments:
1 ===
+ The reason for the slot's invalidation. <literal>NULL</literal> if the
+ slot is currently actively being used.
s/currently actively being used/not invalidated/ ? (I mean it could be valid
and not being used).
Changed.
3 ===
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
  "FROM pg_catalog.pg_replication_slots "
- "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
+ "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE "
Yeah that's fine because there is logical slot filtering here.
Right. And, we really are looking for invalid slots there, so use of
invalidation_reason is much more correct than conflicting.
4 ===
-GetSlotInvalidationCause(const char *conflict_reason)
+GetSlotInvalidationCause(const char *invalidation_reason)
Should we change the comment "Maps a conflict reason" above this function?
Changed.
5 ===
-# Check conflict_reason is NULL for physical slot
+# Check invalidation_reason is NULL for physical slot
$res = $node_primary->safe_psql(
	'postgres', qq[
-	SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+	SELECT invalidation_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
);
I don't think this test is needed anymore: it does not make that much sense since
it's done after the primary database initialization and startup.
It is now turned into a test verifying that the 'conflicting' boolean is
null for the physical slot. Isn't that okay?
6 ===
'Logical slots are reported as non conflicting');
What about?
"
# Verify slots are reported as valid in pg_replication_slots
'Logical slots are reported as valid');
"
Changed.
Please see the attached v11 patch set with all the above review
comments addressed.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v11-0001-Track-invalidation_reason-in-pg_replication_slot.patch (application/octet-stream)
From 483824a8b3248fe08b6bdf22c68bada4f0549212 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 16 Mar 2024 03:39:35 +0000
Subject: [PATCH v11 1/4] Track invalidation_reason in pg_replication_slots
Up until now, the reason for replication slot invalidation has not been
tracked in pg_replication_slots. A recent commit 007693f2a added
conflict_reason to show the reasons for slot invalidation, but
only for logical slots.
This commit adds a new column to show invalidation reasons for
both physical and logical slots. This commit also turns the
conflict_reason text column into a conflicting boolean column
(effectively reverting commit 007693f2a). One can now look at the
new invalidation_reason column for a logical slot's conflict with
recovery.
---
doc/src/sgml/ref/pgupgrade.sgml | 4 +-
doc/src/sgml/system-views.sgml | 63 +++++++++++--------
src/backend/catalog/system_views.sql | 5 +-
src/backend/replication/logical/slotsync.c | 2 +-
src/backend/replication/slot.c | 8 +--
src/backend/replication/slotfuncs.c | 25 +++++---
src/bin/pg_upgrade/info.c | 4 +-
src/include/catalog/pg_proc.dat | 6 +-
src/include/replication/slot.h | 2 +-
.../t/035_standby_logical_decoding.pl | 39 ++++++------
.../t/040_standby_failover_slots_sync.pl | 4 +-
src/test/regress/expected/rules.out | 7 ++-
12 files changed, 97 insertions(+), 72 deletions(-)
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 58c6c2df8b..8de52bf752 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -453,8 +453,8 @@ make prefix=/usr/local/pgsql.new install
<para>
All slots on the old cluster must be usable, i.e., there are no slots
whose
- <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>conflict_reason</structfield>
- is not <literal>NULL</literal>.
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>conflicting</structfield>
+ is not <literal>true</literal>.
</para>
</listitem>
<listitem>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be90edd0e2..e685921847 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,34 +2525,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>conflict_reason</structfield> <type>text</type>
+ <structfield>conflicting</structfield> <type>bool</type>
</para>
<para>
- The reason for the logical slot's conflict with recovery. It is always
- NULL for physical slots, as well as for logical slots which are not
- invalidated. The non-NULL values indicate that the slot is marked
- as invalidated. Possible values are:
- <itemizedlist spacing="compact">
- <listitem>
- <para>
- <literal>wal_removed</literal> means that the required WAL has been
- removed.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>rows_removed</literal> means that the required rows have
- been removed.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>wal_level_insufficient</literal> means that the
- primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
- perform logical decoding.
- </para>
- </listitem>
- </itemizedlist>
+ True if this logical slot conflicted with recovery (and so is now
+ invalidated). When this column is true, check
+ <structfield>invalidation_reason</structfield> column for the conflict
+ reason.
</para></entry>
</row>
@@ -2581,6 +2560,38 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>invalidation_reason</structfield> <type>text</type>
+ </para>
+ <para>
+ The reason for the slot's invalidation. It is set for both logical and
+ physical slots. <literal>NULL</literal> if the slot is not invalidated.
+ Possible values are:
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ <literal>wal_removed</literal> means that the required WAL has been
+ removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>rows_removed</literal> means that the required rows have
+ been removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>wal_level_insufficient</literal> means that the
+ primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
+ perform logical decoding.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 04227a72d1..cd22dad959 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,9 +1023,10 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.conflict_reason,
+ L.conflicting,
L.failover,
- L.synced
+ L.synced,
+ L.invalidation_reason
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 5074c8409f..260632cfdd 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -668,7 +668,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, conflict_reason"
+ " database, invalidation_reason"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 91ca397857..4f1a17f6ce 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -2356,21 +2356,21 @@ RestoreSlotFromDisk(const char *name)
}
/*
- * Maps a conflict reason for a replication slot to
+ * Maps a invalidation reason for a replication slot to
* ReplicationSlotInvalidationCause.
*/
ReplicationSlotInvalidationCause
-GetSlotInvalidationCause(const char *conflict_reason)
+GetSlotInvalidationCause(const char *invalidation_reason)
{
ReplicationSlotInvalidationCause cause;
ReplicationSlotInvalidationCause result = RS_INVAL_NONE;
bool found PG_USED_FOR_ASSERTS_ONLY = false;
- Assert(conflict_reason);
+ Assert(invalidation_reason);
for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
{
- if (strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0)
+ if (strcmp(SlotInvalidationCauses[cause], invalidation_reason) == 0)
{
found = true;
result = cause;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index ad79e1fccd..b5a638edea 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 17
+#define PG_GET_REPLICATION_SLOTS_COLS 18
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -263,6 +263,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
bool nulls[PG_GET_REPLICATION_SLOTS_COLS];
WALAvailability walstate;
int i;
+ ReplicationSlotInvalidationCause cause;
if (!slot->in_use)
continue;
@@ -409,22 +410,32 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
- if (slot_contents.data.database == InvalidOid)
+ cause = slot_contents.data.invalidated;
+
+ if (SlotIsPhysical(&slot_contents))
nulls[i++] = true;
else
{
- ReplicationSlotInvalidationCause cause = slot_contents.data.invalidated;
-
- if (cause == RS_INVAL_NONE)
- nulls[i++] = true;
+ /*
+ * rows_removed and wal_level_insufficient are only two reasons
+ * for the logical slot's conflict with recovery.
+ */
+ if (cause == RS_INVAL_HORIZON ||
+ cause == RS_INVAL_WAL_LEVEL)
+ values[i++] = BoolGetDatum(true);
else
- values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ values[i++] = BoolGetDatum(false);
}
values[i++] = BoolGetDatum(slot_contents.data.failover);
values[i++] = BoolGetDatum(slot_contents.data.synced);
+ if (cause == RS_INVAL_NONE)
+ nulls[i++] = true;
+ else
+ values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index b5b8d11602..34a157f792 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -676,13 +676,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
"temporary IS FALSE;",
live_check ? "FALSE" :
- "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
+ "(CASE WHEN conflicting THEN FALSE "
"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
"END)");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 700f7daf7b..63fd0b4cd7 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11123,9 +11123,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 425effad21..7f25a083ee 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -273,7 +273,7 @@ extern void CheckPointReplicationSlots(bool is_shutdown);
extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
- GetSlotInvalidationCause(const char *conflict_reason);
+ GetSlotInvalidationCause(const char *invalidation_reason);
extern bool SlotExistsInStandbySlotNames(const char *slot_name);
extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 88b03048c4..2203841ca1 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,7 +168,7 @@ sub change_hot_standby_feedback_and_wait_for_xmins
}
}
-# Check conflict_reason in pg_replication_slots.
+# Check reason for conflict in pg_replication_slots.
sub check_slots_conflict_reason
{
my ($slot_prefix, $reason) = @_;
@@ -178,15 +178,15 @@ sub check_slots_conflict_reason
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$active_slot' and conflicting;));
- is($res, "$reason", "$active_slot conflict_reason is $reason");
+ is($res, "$reason", "$active_slot reason for conflict is $reason");
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$inactive_slot' and conflicting;));
- is($res, "$reason", "$inactive_slot conflict_reason is $reason");
+ is($res, "$reason", "$inactive_slot reason for conflict is $reason");
}
# Drop the slots, re-create them, change hot_standby_feedback,
@@ -293,13 +293,13 @@ $node_primary->safe_psql('testdb',
qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
);
-# Check conflict_reason is NULL for physical slot
+# Check conflicting is NULL for physical slot
$res = $node_primary->safe_psql(
'postgres', qq[
- SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+ SELECT conflicting is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
);
-is($res, 't', "Physical slot reports conflict_reason as NULL");
+is($res, 't', "Physical slot reports conflicting as NULL");
my $backup_name = 'b1';
$node_primary->backup($backup_name);
@@ -524,7 +524,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Ensure that replication slot stats are not removed after invalidation.
@@ -551,7 +551,7 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
##################################################
$node_standby->restart;
-# Verify conflict_reason is retained across a restart.
+# Verify reason for conflict is retained across a restart.
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
##################################################
@@ -560,7 +560,8 @@ check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Get the restart_lsn from an invalidated slot
my $restart_lsn = $node_standby->safe_psql('postgres',
- "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
+ "SELECT restart_lsn FROM pg_replication_slots
+ WHERE slot_name = 'vacuum_full_activeslot' AND conflicting;"
);
chomp($restart_lsn);
@@ -611,7 +612,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('row_removal_', 'rows_removed');
$handle =
@@ -647,7 +648,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
check_for_invalidation('shared_row_removal_', $logstart,
'with vacuum on pg_authid');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
$handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
@@ -696,14 +697,14 @@ ok( $node_standby->poll_query_until(
'confl_active_logicalslot not updated'
) or die "Timed out waiting confl_active_logicalslot to be updated";
-# Verify slots are reported as non conflicting in pg_replication_slots
+# Verify slots are reported as valid in pg_replication_slots
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(conflicting) from
- (select conflict_reason is not NULL as conflicting
- from pg_replication_slots WHERE slot_type = 'logical')]),
+ (select conflicting from pg_replication_slots
+ where slot_type = 'logical')]),
'f',
- 'Logical slots are reported as non conflicting');
+ 'Logical slots are reported as valid');
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 0);
@@ -739,7 +740,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('pruning_', 'rows_removed');
$handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
@@ -783,7 +784,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
-# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+# Verify reason for conflict is 'wal_level_insufficient' in pg_replication_slots
check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
$handle =
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 0ea1f3d323..f47bfd78eb 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -228,7 +228,7 @@ $standby1->safe_psql('postgres', "CHECKPOINT");
# Check if the synced slot is invalidated
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'synchronized slot has been invalidated');
@@ -274,7 +274,7 @@ $standby1->wait_for_log(qr/dropped replication slot "lsub1_slot" of dbid [0-9]+/
# flagged as 'synced'
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'logical slot is re-synced');
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 0cd2c64fca..055bec068d 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,10 +1473,11 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.conflict_reason,
+ l.conflicting,
l.failover,
- l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced)
+ l.synced,
+ l.invalidation_reason
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v11-0002-Add-XID-age-based-replication-slot-invalidation.patch (application/octet-stream)
From ea5acdd80b3a93dd8e9ae69628d237e71e9ad575 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 16 Mar 2024 03:46:28 +0000
Subject: [PATCH v11 2/4] Add XID age based replication slot invalidation
Up until now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set an XID age (age of
the slot's xmin or catalog_xmin) of say 1 or 1.5 billion, after
which the slots get invalidated.
To achieve the above, postgres uses replication slot xmin (the
oldest transaction that this slot needs the database to retain) or
catalog_xmin (the oldest transaction affecting the system catalogs
that this slot needs the database to retain), and a new GUC
max_slot_xid_age. The checkpointer then looks at all replication
slots and invalidates those whose xmin or catalog_xmin has reached
the configured age.
---
doc/src/sgml/config.sgml | 21 ++++
doc/src/sgml/system-views.sgml | 8 ++
src/backend/access/transam/xlog.c | 10 ++
src/backend/replication/slot.c | 49 +++++++-
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 108 ++++++++++++++++++
9 files changed, 210 insertions(+), 1 deletion(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 65a6e6c408..6dd54ffcb7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4544,6 +4544,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age">
+ <term><varname>max_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (which is default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index e685921847..56252b12ee 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2588,6 +2588,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>xid_aged</literal> means that the slot's
+ <literal>xmin</literal> or <literal>catalog_xmin</literal>
+ has reached the age specified by
+ <xref linkend="guc-max-slot-xid-age"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 20a5f86209..36ae2ac6a4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7147,6 +7147,11 @@ CreateCheckPoint(int flags)
if (PriorRedoPtr != InvalidXLogRecPtr)
UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7597,6 +7602,11 @@ CreateRestartPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 4f1a17f6ce..dc37586dcc 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_XID_AGE] = "xid_aged",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int max_slot_xid_age = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -158,6 +160,7 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool IsSlotXIDAged(TransactionId xmin);
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
@@ -1483,6 +1486,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_XID_AGE:
+ appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1499,6 +1505,31 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Returns true if slot's passed in xmin/catalog_xmin age is more than
+ * max_slot_xid_age.
+ */
+static bool
+IsSlotXIDAged(TransactionId xmin)
+{
+ TransactionId xid_cur;
+ TransactionId xid_limit;
+
+ if (!TransactionIdIsNormal(xmin))
+ return false;
+
+ xid_cur = ReadNextTransactionId();
+ xid_limit = xmin + max_slot_xid_age;
+
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ return true;
+
+ return false;
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1599,6 +1630,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
conflict = cause;
break;
+ case RS_INVAL_XID_AGE:
+ {
+ if (IsSlotXIDAged(s->data.xmin))
+ {
+ conflict = cause;
+ break;
+ }
+
+ if (IsSlotXIDAged(s->data.catalog_xmin))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1752,6 +1798,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 57d9de4dd9..6b5375909d 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2954,6 +2954,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"max_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &max_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 2244ee52f7..b4c928b826 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -334,6 +334,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#max_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7f25a083ee..614ba0e30b 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -227,6 +229,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
+extern PGDLLIMPORT int max_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..2f482b56e8
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,108 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Initialize primary node, setting wal-segsize to 1MB
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 1, extra => ['--wal-segsize=1']);
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+});
+$primary->start;
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb1_slot');
+]);
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby1->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+$standby1->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb1_slot';
+]) or die "Timed out waiting for slot xmin to advance";
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop standby to make the replication slot's xmin on primary to age
+$standby1->stop;
+
+my $logstart = -s $primary->logfile;
+
+# Do some work to advance xmin
+$primary->safe_psql(
+ 'postgres', q{
+do $$
+begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into tab_int values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+end$$;
+});
+
+my $invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb1_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb1_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'xid_aged';
+])
+ or die
+ "Timed out while waiting for replication slot sb1_slot to be invalidated";
+
+done_testing();
--
2.34.1
v11-0003-Track-inactive-replication-slot-information.patch (application/octet-stream)
From 3f08b8cc6346aabba5226d060823fcdf71e6b1f8 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 16 Mar 2024 03:47:15 +0000
Subject: [PATCH v11 3/4] Track inactive replication slot information
Up until now, postgres has not tracked metrics like the time at
which a slot became inactive, or the total number of times a
slot became inactive in its lifetime. This commit adds two new
metrics last_inactive_at of type timestamptz and inactive_count of
type numeric to ReplicationSlotPersistentData. Whenever a slot
becomes inactive, the current timestamp and inactive count are
persisted to disk.
These metrics are useful in the following ways:
- To improve replication slot monitoring tools. For instance, one
can build a monitoring tool that signals a) when a replication slot
has been lying inactive for a day or so using the last_inactive_at metric,
b) when a replication slot is becoming inactive too frequently
using inactive_count metric.
- To implement timeout-based inactive replication slot management
capability in postgres.
Increases SLOT_VERSION due to the two newly added metrics.
---
doc/src/sgml/system-views.sgml | 20 +++++++++++++
src/backend/catalog/system_views.sql | 4 ++-
src/backend/replication/slot.c | 43 ++++++++++++++++++++++------
src/backend/replication/slotfuncs.c | 15 +++++++++-
src/include/catalog/pg_proc.dat | 6 ++--
src/include/replication/slot.h | 6 ++++
src/test/regress/expected/rules.out | 6 ++--
7 files changed, 84 insertions(+), 16 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 56252b12ee..365c0fd52d 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2758,6 +2758,26 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
ID of role
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_at</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_count</structfield> <type>numeric</type>
+ </para>
+ <para>
+ The total number of times the slot became inactive in its lifetime.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index cd22dad959..de9f1d5506 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1026,7 +1026,9 @@ CREATE VIEW pg_replication_slots AS
L.conflicting,
L.failover,
L.synced,
- L.invalidation_reason
+ L.invalidation_reason,
+ L.last_inactive_at,
+ L.inactive_count
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index dc37586dcc..6b6e5141f7 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -130,7 +130,7 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 5 /* version for new files */
+#define SLOT_VERSION 6 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -401,6 +401,8 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
slot->data.synced = synced;
+ slot->data.last_inactive_at = 0;
+ slot->data.inactive_count = 0;
/* and then data only present in shared memory */
slot->just_dirtied = false;
@@ -627,6 +629,17 @@ retry:
if (am_walsender)
{
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->data.last_inactive_at = 0;
+ SpinLockRelease(&s->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
SlotIsLogical(s)
? errmsg("acquired logical replication slot \"%s\"",
@@ -694,16 +707,20 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
- MyReplicationSlot = NULL;
-
- /* might not have been set when we've been a plain slot */
- LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
- MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
- ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
- LWLockRelease(ProcArrayLock);
-
if (am_walsender)
{
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->data.last_inactive_at = GetCurrentTimestamp();
+ slot->data.inactive_count++;
+ SpinLockRelease(&slot->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
ereport(log_replication_commands ? LOG : DEBUG1,
is_logical
? errmsg("released logical replication slot \"%s\"",
@@ -713,6 +730,14 @@ ReplicationSlotRelease(void)
pfree(slotname);
}
+
+ MyReplicationSlot = NULL;
+
+ /* might not have been set when we've been a plain slot */
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
+ ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
+ LWLockRelease(ProcArrayLock);
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index b5a638edea..4c7a120df1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 18
+#define PG_GET_REPLICATION_SLOTS_COLS 20
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -264,6 +264,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
WALAvailability walstate;
int i;
ReplicationSlotInvalidationCause cause;
+ char buf[256];
if (!slot->in_use)
continue;
@@ -436,6 +437,18 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
else
values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ if (slot_contents.data.last_inactive_at > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.data.last_inactive_at);
+ else
+ nulls[i++] = true;
+
+ /* Convert to numeric. */
+ snprintf(buf, sizeof buf, UINT64_FORMAT, slot_contents.data.inactive_count);
+ values[i++] = DirectFunctionCall3(numeric_in,
+ CStringGetDatum(buf),
+ ObjectIdGetDatum(0),
+ Int32GetDatum(-1));
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 63fd0b4cd7..c7ab0893eb 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11123,9 +11123,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text,timestamptz,numeric}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason,last_inactive_at,inactive_count}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 614ba0e30b..780767a819 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -129,6 +129,12 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* When did this slot become inactive last time? */
+ TimestampTz last_inactive_at;
+
+ /* How many times the slot has been inactive? */
+ uint64 inactive_count;
} ReplicationSlotPersistentData;
/*
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 055bec068d..c0bdfe76d8 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1476,8 +1476,10 @@ pg_replication_slots| SELECT l.slot_name,
l.conflicting,
l.failover,
l.synced,
- l.invalidation_reason
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason)
+ l.invalidation_reason,
+ l.last_inactive_at,
+ l.inactive_count
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason, last_inactive_at, inactive_count)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v11-0004-Add-inactive_timeout-based-replication-slot-inva.patch (application/octet-stream)
From 233febfa7d67b53c1a5094ec494cb69caead5120 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 16 Mar 2024 03:53:04 +0000
Subject: [PATCH v11 4/4] Add inactive_timeout based replication slot
invalidation
Up until now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set a timeout of say 1
or 2 or 3 days, after which the inactive slots get invalidated.
To achieve this, postgres uses the replication slot metric
last_inactive_at (the time at which the slot last became inactive),
and a new GUC inactive_replication_slot_timeout. The checkpointer
then looks at all replication slots and invalidates those that have
been inactive for longer than the configured timeout.
---
doc/src/sgml/config.sgml | 18 +++++
doc/src/sgml/system-views.sgml | 7 ++
src/backend/access/transam/xlog.c | 10 +++
src/backend/replication/slot.c | 22 +++++-
src/backend/utils/misc/guc_tables.c | 12 +++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 79 +++++++++++++++++++
8 files changed, 151 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 6dd54ffcb7..4b0b60a1ac 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4565,6 +4565,24 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-inactive-replication-slot-timeout" xreflabel="inactive_replication_slot_timeout">
+ <term><varname>inactive_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>inactive_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time at the next checkpoint. If this value is specified
+ without units, it is taken as seconds. A value of zero (which is
+ default) disables the timeout mechanism. This parameter can only be
+ set in the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 365c0fd52d..c18dc5feb5 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2596,6 +2596,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<xref linkend="guc-max-slot-xid-age"/> parameter.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for the duration specified by
+ <xref linkend="guc-inactive-replication-slot-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 36ae2ac6a4..166c3ed794 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7152,6 +7152,11 @@ CreateCheckPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
@@ -7607,6 +7612,11 @@ CreateRestartPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate inactive replication slots based on timeout */
+ if (inactive_replication_slot_timeout > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 6b6e5141f7..10fe944623 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -108,10 +108,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
[RS_INVAL_XID_AGE] = "xid_aged",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -142,6 +143,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
int max_slot_xid_age = 0;
+int inactive_replication_slot_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -1514,6 +1516,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_XID_AGE:
appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by inactive_replication_slot_timeout."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1670,6 +1675,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
}
}
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (s->data.last_inactive_at > 0)
+ {
+ TimestampTz now;
+
+ Assert(s->data.persistency == RS_PERSISTENT);
+ Assert(s->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(s->data.last_inactive_at, now,
+ inactive_replication_slot_timeout * 1000))
+ conflict = cause;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1824,6 +1843,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 6b5375909d..6caf40d51e 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2964,6 +2964,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"inactive_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &inactive_replication_slot_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index b4c928b826..7f2a3e41f1 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -261,6 +261,7 @@
#recovery_prefetch = try # prefetch pages referenced in the WAL?
#wal_decode_buffer_size = 512kB # lookahead window used for prefetching
# (change requires restart)
+#inactive_replication_slot_timeout = 0 # in seconds; 0 disables
# - Archiving -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 780767a819..8378d7c913 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -55,6 +55,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* slot's xmin or catalog_xmin has reached the age */
RS_INVAL_XID_AGE,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -236,6 +238,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
extern PGDLLIMPORT int max_slot_xid_age;
+extern PGDLLIMPORT int inactive_replication_slot_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index 2f482b56e8..4c66dd4a4e 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -105,4 +105,83 @@ $primary->poll_query_until(
or die
"Timed out while waiting for replication slot sb1_slot to be invalidated";
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot('sb2_slot');
+]);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 0;
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+$standby2->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+});
+$standby2->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+# The inactive replication slot info should be null when the slot is active
+my $result = $primary->safe_psql(
+ 'postgres', qq[
+ SELECT last_inactive_at IS NULL, inactive_count = 0 AS OK
+ FROM pg_replication_slots WHERE slot_name = 'sb2_slot';
+]);
+is($result, "t|t",
+ 'check the inactive replication slot info for an active slot');
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET inactive_replication_slot_timeout TO '1s';
+]);
+$primary->reload;
+
+$logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby2->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_at IS NOT NULL AND
+ inactive_count = 1 AND slot_name = 'sb2_slot';
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+$invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb2_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb2_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb2_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for inactive replication slot sb2_slot to be invalidated";
+
done_testing();
--
2.34.1
On Fri, Mar 15, 2024 at 10:45 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Mar 13, 2024 at 9:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
BTW, doesn't the XID based parameter 'max_slot_xid_age' have similarity
with 'max_slot_wal_keep_size'? I think it will impact the rows we
remove based on xid horizons. Don't we need to consider it while
vacuum computes the xid horizons in ComputeXidHorizons(), similar to
what we do for WAL w.r.t 'max_slot_wal_keep_size'?

I'm having a hard time understanding why we'd need something up there
in ComputeXidHorizons(). Can you elaborate a bit, please?

What's proposed with max_slot_xid_age is that during checkpoint we
look at slot's xmin and catalog_xmin, and the current system txn id.
Then, if the XID age of (xmin, catalog_xmin) and current_xid crosses
max_slot_xid_age, we invalidate the slot.
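(For illustration, that check corresponds roughly to the SQL sketch below.
It assumes the patch is applied so that the max_slot_xid_age GUC exists,
and age() stands in for the wraparound-aware arithmetic the checkpointer
does internally.)

-- Approximation of the proposed check, assuming the patch's
-- max_slot_xid_age GUC is available; age() stands in for the internal
-- wraparound-aware comparison done by the checkpointer.
SELECT slot_name,
       GREATEST(age(xmin), age(catalog_xmin)) AS slot_xid_age
FROM pg_replication_slots
WHERE GREATEST(age(xmin), age(catalog_xmin)) >
      current_setting('max_slot_xid_age')::int;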
I can see that in your patch (in function
InvalidatePossiblyObsoleteSlot()). As per my understanding, we need
something similar for slot xids in ComputeXidHorizons() as we are
doing for WAL in KeepLogSeg(). In KeepLogSeg(), we compute the minimum LSN
location required by slots and then adjust it for
'max_slot_wal_keep_size'. On similar lines, currently in
ComputeXidHorizons(), we compute the minimum xid required by slots
(procArray->replication_slot_xmin and
procArray->replication_slot_catalog_xmin) but then don't adjust it for
'max_slot_xid_age'. I could be missing something in this but it is
better to keep discussing this and try to move with another parameter
'inactive_replication_slot_timeout' which according to me can be kept
at slot level instead of a GUC but OTOH we need to see the arguments
on both side and then decide which makes more sense.
--
With Regards,
Amit Kapila.
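As a point of reference, the "XID age" being compared here is the distance
between a slot's xmin/catalog_xmin and the current transaction counter. That
quantity can already be observed on a running server with a query like the
one below against the existing pg_replication_slots view; max_slot_xid_age
itself exists only in the proposed patches.

SELECT slot_name,
       age(xmin) AS xmin_age,
       age(catalog_xmin) AS catalog_xmin_age
FROM pg_replication_slots;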
On Sat, Mar 16, 2024 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
procArray->replication_slot_catalog_xmin) but then don't adjust it for
'max_slot_xid_age'. I could be missing something in this but it is
better to keep discussing this and try to move with another parameter
'inactive_replication_slot_timeout' which according to me can be kept
at slot level instead of a GUC but OTOH we need to see the arguments
on both side and then decide which makes more sense.
Hm. Are you suggesting inactive_timeout to be a slot level parameter
similar to 'failover' property added recently by
c393308b69d229b664391ac583b9e07418d411b6 and
73292404370c9900a96e2bebdc7144f7010339cf? With this approach, one can
set inactive_timeout while creating the slot either via
pg_create_physical_replication_slot() or
pg_create_logical_replication_slot() or CREATE_REPLICATION_SLOT or
ALTER_REPLICATION_SLOT command, and postgres tracks the
last_inactive_at for every slot based on which the slot gets
invalidated. If this understanding is right, I can go ahead and work
towards it.
Alternatively, we can go the route of making GUC a list of key-value
pairs of {slot_name, inactive_timeout}, but this kind of GUC for
setting slot level parameters is going to be the first of its kind, so
I'd prefer the above approach.
Thoughts?
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
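One possible shape for such a slot-level API, reusing the existing creation
functions with an extra named parameter, is sketched below. This mirrors what
the v12 patches later in this thread propose; it is proposed syntax rather
than committed behavior, and the timeout value is in seconds.

SELECT 'init' FROM pg_create_physical_replication_slot(
    slot_name := 'standby_slot',
    immediately_reserve := true,
    inactive_timeout := 86400);  -- allow at most one day of inactivity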
On Sun, Mar 17, 2024 at 2:03 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Sat, Mar 16, 2024 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
procArray->replication_slot_catalog_xmin) but then don't adjust it for
'max_slot_xid_age'. I could be missing something in this but it is
better to keep discussing this and try to move with another parameter
'inactive_replication_slot_timeout' which according to me can be kept
at slot level instead of a GUC but OTOH we need to see the arguments
on both side and then decide which makes more sense.
Hm. Are you suggesting inactive_timeout to be a slot level parameter
similar to 'failover' property added recently by
c393308b69d229b664391ac583b9e07418d411b6 and
73292404370c9900a96e2bebdc7144f7010339cf? With this approach, one can
set inactive_timeout while creating the slot either via
pg_create_physical_replication_slot() or
pg_create_logical_replication_slot() or CREATE_REPLICATION_SLOT or
ALTER_REPLICATION_SLOT command, and postgres tracks the
last_inactive_at for every slot based on which the slot gets
invalidated. If this understanding is right, I can go ahead and work
towards it.
Yeah, I have something like that in mind. You can prepare the patch
but it would be good if others involved in this thread can also share
their opinion.
Alternatively, we can go the route of making GUC a list of key-value
pairs of {slot_name, inactive_timeout}, but this kind of GUC for
setting slot level parameters is going to be the first of its kind, so
I'd prefer the above approach.
I would prefer a slot-level parameter in this case rather than a GUC.
--
With Regards,
Amit Kapila.
On Sat, Mar 16, 2024 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
What's proposed with max_slot_xid_age is that during checkpoint we
look at slot's xmin and catalog_xmin, and the current system txn id.
Then, if the XID age of (xmin, catalog_xmin) and current_xid crosses
max_slot_xid_age, we invalidate the slot.
I can see that in your patch (in function
InvalidatePossiblyObsoleteSlot()). As per my understanding, we need
something similar for slot xids in ComputeXidHorizons() as we are
doing WAL in KeepLogSeg(). In KeepLogSeg(), we compute the minimum LSN
location required by slots and then adjust it for
'max_slot_wal_keep_size'. On similar lines, currently in
ComputeXidHorizons(), we compute the minimum xid required by slots
(procArray->replication_slot_xmin and
procArray->replication_slot_catalog_xmin) but then don't adjust it for
'max_slot_xid_age'. I could be missing something in this but it is
better to keep discussing this
After invalidating slots because of max_slot_xid_age, the
procArray->replication_slot_xmin and
procArray->replication_slot_catalog_xmin are recomputed immediately in
InvalidateObsoleteReplicationSlots->ReplicationSlotsComputeRequiredXmin->ProcArraySetReplicationSlotXmin.
And, later the XID horizons in ComputeXidHorizons are computed before
the vacuum on each table via GetOldestNonRemovableTransactionId.
Aren't these enough? Do you want the XID horizons recomputed
immediately, something like the below?
	/* Invalidate replication slots based on xmin or catalog_xmin age */
	if (max_slot_xid_age > 0)
	{
		if (InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE,
											   0, InvalidOid,
											   InvalidTransactionId))
		{
			ComputeXidHorizonsResult horizons;

			/*
			 * Some slots have been invalidated; update the XID horizons
			 * as a side-effect.
			 */
			ComputeXidHorizons(&horizons);
		}
	}
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon, Mar 18, 2024 at 9:58 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Sat, Mar 16, 2024 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
What's proposed with max_slot_xid_age is that during checkpoint we
look at slot's xmin and catalog_xmin, and the current system txn id.
Then, if the XID age of (xmin, catalog_xmin) and current_xid crosses
max_slot_xid_age, we invalidate the slot.
I can see that in your patch (in function
InvalidatePossiblyObsoleteSlot()). As per my understanding, we need
something similar for slot xids in ComputeXidHorizons() as we are
doing WAL in KeepLogSeg(). In KeepLogSeg(), we compute the minimum LSN
location required by slots and then adjust it for
'max_slot_wal_keep_size'. On similar lines, currently in
ComputeXidHorizons(), we compute the minimum xid required by slots
(procArray->replication_slot_xmin and
procArray->replication_slot_catalog_xmin) but then don't adjust it for
'max_slot_xid_age'. I could be missing something in this but it is
better to keep discussing this
After invalidating slots because of max_slot_xid_age, the
procArray->replication_slot_xmin and
procArray->replication_slot_catalog_xmin are recomputed immediately in
InvalidateObsoleteReplicationSlots->ReplicationSlotsComputeRequiredXmin->ProcArraySetReplicationSlotXmin.
And, later the XID horizons in ComputeXidHorizons are computed before
the vacuum on each table via GetOldestNonRemovableTransactionId.
Aren't these enough?
IIUC, this will be delayed by one vacuum cycle rather than
happening as soon as the slot's xmin age is crossed and it can be
invalidated.
Do you want the XID horizons recomputed
immediately, something like the below?
I haven't thought of the exact logic, but we can try to mimic the
handling we have for WAL.
--
With Regards,
Amit Kapila.
Hi,
On Sat, Mar 16, 2024 at 09:29:01AM +0530, Bharath Rupireddy wrote:
I've also responded to Bertrand's comments here.
Thanks!
On Wed, Mar 6, 2024 at 3:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
5 ===

-# Check conflict_reason is NULL for physical slot
+# Check invalidation_reason is NULL for physical slot
 $res = $node_primary->safe_psql(
 	'postgres', qq[
-	SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+	SELECT invalidation_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
 );

I don't think this test is needed anymore: it does not make that much sense since
it's done after the primary database initialization and startup.

It is now turned into a test verifying 'conflicting boolean' is null
for the physical slot. Isn't that okay?
Yeah makes more sense now, thanks!
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Mon, Mar 18, 2024 at 08:50:56AM +0530, Amit Kapila wrote:
On Sun, Mar 17, 2024 at 2:03 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Sat, Mar 16, 2024 at 3:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
procArray->replication_slot_catalog_xmin) but then don't adjust it for
'max_slot_xid_age'. I could be missing something in this but it is
better to keep discussing this and try to move with another parameter
'inactive_replication_slot_timeout' which according to me can be kept
at slot level instead of a GUC but OTOH we need to see the arguments
on both side and then decide which makes more sense.
Hm. Are you suggesting inactive_timeout to be a slot level parameter
similar to 'failover' property added recently by
c393308b69d229b664391ac583b9e07418d411b6 and
73292404370c9900a96e2bebdc7144f7010339cf? With this approach, one can
set inactive_timeout while creating the slot either via
pg_create_physical_replication_slot() or
pg_create_logical_replication_slot() or CREATE_REPLICATION_SLOT or
ALTER_REPLICATION_SLOT command, and postgres tracks the
last_inactive_at for every slot based on which the slot gets
invalidated. If this understanding is right, I can go ahead and work
towards it.
Yeah, I have something like that in mind. You can prepare the patch
but it would be good if others involved in this thread can also share
their opinion.
I think it makes sense to put the inactive_timeout granularity at the slot
level (as the activity could vary a lot, say, between one slot linked to a
subscription and one linked to some plugins). As far as max_slot_xid_age is
concerned, I have the feeling that a new GUC is good enough.
Alternatively, we can go the route of making GUC a list of key-value
pairs of {slot_name, inactive_timeout}, but this kind of GUC for
setting slot level parameters is going to be the first of its kind, so
I'd prefer the above approach.
I would prefer a slot-level parameter in this case rather than a GUC.
Yeah, same here.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Sat, Mar 16, 2024 at 09:29:01AM +0530, Bharath Rupireddy wrote:
Please see the attached v11 patch set with all the above review
comments addressed.
Thanks!
Looking at 0001:
1 ===
+ True if this logical slot conflicted with recovery (and so is now
+ invalidated). When this column is true, check
Worth adding back the physical-slot mention "Always NULL for physical slots."?
2 ===
@@ -1023,9 +1023,10 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.conflict_reason,
+ L.conflicting,
L.failover,
- L.synced
+ L.synced,
+ L.invalidation_reason
What about making invalidation_reason close to conflict_reason?
3 ===
- * Maps a conflict reason for a replication slot to
+ * Maps a invalidation reason for a replication slot to
s/a invalidation/an invalidation/?
4 ===
While at it, shouldn't we also rename "conflict" to say "invalidation_cause" in
InvalidatePossiblyObsoleteSlot()?
5 ===
+ * rows_removed and wal_level_insufficient are only two reasons
s/are only two/are the only two/?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Thu, Mar 14, 2024 at 12:27:26PM +0530, Amit Kapila wrote:
On Wed, Mar 13, 2024 at 10:16 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Mar 13, 2024 at 11:13 AM shveta malik <shveta.malik@gmail.com> wrote:
Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change.
JFYI, the patch does not apply to the head. There is a conflict in
multiple files.
Thanks for looking into this. I noticed that the v8 patches needed
rebase. Before I go do anything with the patches, I'm trying to gain
consensus on the design. Following is the summary of design choices
we've discussed so far:
1) conflict_reason vs invalidation_reason.
2) When to compute the XID age?
I feel we should focus on two things (a) one is to introduce a new
column invalidation_reason, and (b) let's try to first complete
invalidation due to timeout. We can look into XID stuff if time
permits, remember, we don't have ample time left.
Agree. While it makes sense to invalidate slots for wal removal in
CreateCheckPoint() (because this is the place where wal is removed), I'm not
sure this is the right place for the 2 new cases.
Let's focus on the timeout one as proposed above (as probably the simplest one):
as this one is purely related to time and activity, what about invalidating
such slots at these points?:
- when their usage resumes
- in pg_get_replication_slots()
The idea is to invalidate the slot when one resumes activity on it or wants to
get information about it (and among other things wants to know if the slot is
valid or not).
Thoughts?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
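Under that idea, an ordinary monitoring query against the view would double as
the trigger point, since pg_replication_slots is built on top of
pg_get_replication_slots(). A minimal sketch, assuming the invalidation_reason
and last_inactive_at columns added by the 0001/0002 patches in this thread:

SELECT slot_name, active, last_inactive_at, invalidation_reason
FROM pg_replication_slots
WHERE NOT active;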
On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Thu, Mar 14, 2024 at 12:27:26PM +0530, Amit Kapila wrote:
On Wed, Mar 13, 2024 at 10:16 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Mar 13, 2024 at 11:13 AM shveta malik <shveta.malik@gmail.com> wrote:
Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change.
JFYI, the patch does not apply to the head. There is a conflict in
multiple files.
Thanks for looking into this. I noticed that the v8 patches needed
rebase. Before I go do anything with the patches, I'm trying to gain
consensus on the design. Following is the summary of design choices
we've discussed so far:
1) conflict_reason vs invalidation_reason.
2) When to compute the XID age?
I feel we should focus on two things (a) one is to introduce a new
column invalidation_reason, and (b) let's try to first complete
invalidation due to timeout. We can look into XID stuff if time
permits, remember, we don't have ample time left.
Agree. While it makes sense to invalidate slots for wal removal in
CreateCheckPoint() (because this is the place where wal is removed), I 'm not
sure this is the right place for the 2 new cases.
Let's focus on the timeout one as proposed above (as probably the simplest one):
as this one is purely related to time and activity what about to invalidate them
when?:
- their usage resume
- in pg_get_replication_slots()
The idea is to invalidate the slot when one resumes activity on it or wants to
get information about it (and among other things wants to know if the slot is
valid or not).
Trying to invalidate at those two places makes sense to me, but we
still need to cover the cases where it takes very long to resume the
slot activity and the dangling slot cases where the activity is never
resumed. How about, apart from the above two places, also trying to
invalidate in CheckPointReplicationSlots() where we are traversing all
the slots? This could prevent invalid slots from being marked as
dirty.
BTW, how will the user use 'inactive_count' to know whether a
replication slot is becoming inactive too frequently? The patch just
keeps incrementing this counter; one will never know how many times
the slot became inactive in the last 'n' minutes unless there is some
monitoring tool that keeps capturing this counter from time to time
and calculates the frequency in some way. Even if this is useful, it
is not clear to me whether we need to store 'inactive_count' in the
slot's persistent data. I understand it could be a metric required by
the user, but wouldn't it be better to track this via
pg_stat_replication_slots so that we don't need to store it in the
slot's persistent data? If this understanding is correct, I would say
let's remove 'inactive_count' as well from the main patch and discuss
it separately.
--
With Regards,
Amit Kapila.
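To illustrate that point: with only an ever-increasing counter, a monitoring
tool has to sample it periodically and diff consecutive samples to derive a
frequency. A rough sketch, assuming the patched inactive_count column and a
hypothetical slot_activity_samples table filled by an external job:

CREATE TABLE slot_activity_samples (
    sample_time     timestamptz NOT NULL DEFAULT now(),
    slot_name       name        NOT NULL,
    inactive_count  numeric     NOT NULL
);

-- run periodically (e.g. once a minute) by the monitoring job
INSERT INTO slot_activity_samples (slot_name, inactive_count)
SELECT slot_name, inactive_count FROM pg_replication_slots;

-- how many times did each slot go inactive in the last 10 minutes?
SELECT slot_name,
       max(inactive_count) - min(inactive_count) AS recent_inactivations
FROM slot_activity_samples
WHERE sample_time > now() - interval '10 minutes'
GROUP BY slot_name;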
Hi,
On Tue, Mar 19, 2024 at 10:56:25AM +0530, Amit Kapila wrote:
On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Agree. While it makes sense to invalidate slots for wal removal in
CreateCheckPoint() (because this is the place where wal is removed), I 'm not
sure this is the right place for the 2 new cases.
Let's focus on the timeout one as proposed above (as probably the simplest one):
as this one is purely related to time and activity what about to invalidate them
when?:
- their usage resume
- in pg_get_replication_slots()
The idea is to invalidate the slot when one resumes activity on it or wants to
get information about it (and among other things wants to know if the slot is
valid or not).
Trying to invalidate at those two places makes sense to me but we
still need to cover the cases where it takes very long to resume the
slot activity and the dangling slot cases where the activity is never
resumed.
I understand it's better to have the slot reflect its real status internally,
but is it a real issue if that's not the case until the activity on it is
resumed? (just asking, not saying we should not)
How about apart from the above two places, trying to
invalidate in CheckPointReplicationSlots() where we are traversing all
the slots?
I think that's a good place, but there is still a window of time (which could
also be "large" depending on the activity and the checkpoint frequency) during
which the slot is not known as invalid internally. But yeah, at least we know
that we'll mark it as invalid at some point...
BTW:
	if (am_walsender)
	{
+		if (slot->data.persistency == RS_PERSISTENT)
+		{
+			SpinLockAcquire(&slot->mutex);
+			slot->data.last_inactive_at = GetCurrentTimestamp();
+			slot->data.inactive_count++;
+			SpinLockRelease(&slot->mutex);
I'm also feeling the same concern as Shveta mentioned in [1]: that a "normal"
backend using pg_logical_slot_get_changes() or friends would not set the
last_inactive_at.
[1]: /messages/by-id/CAJpy0uD64X=2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg@mail.gmail.com
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
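The "normal backend" case here is simply a session consuming a slot through
the SQL interface instead of through a walsender, for example the query below
(the slot name is an arbitrary placeholder). pg_logical_slot_get_changes() is
existing core functionality; whether such a call should update
last_inactive_at is exactly what is being discussed.

-- consume pending changes from a logical slot in a regular backend
SELECT lsn, xid, data
FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);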
On Tue, Mar 19, 2024 at 3:11 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Tue, Mar 19, 2024 at 10:56:25AM +0530, Amit Kapila wrote:
On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Agree. While it makes sense to invalidate slots for wal removal in
CreateCheckPoint() (because this is the place where wal is removed), I 'm not
sure this is the right place for the 2 new cases.
Let's focus on the timeout one as proposed above (as probably the simplest one):
as this one is purely related to time and activity what about to invalidate them
when?:
- their usage resume
- in pg_get_replication_slots()
The idea is to invalidate the slot when one resumes activity on it or wants to
get information about it (and among other things wants to know if the slot is
valid or not).
Trying to invalidate at those two places makes sense to me but we
still need to cover the cases where it takes very long to resume the
slot activity and the dangling slot cases where the activity is never
resumed.
I understand it's better to have the slot reflecting its real status internally
but it is a real issue if that's not the case until the activity on it is resumed?
(just asking, not saying we should not)
Sorry, I didn't understand your point. Can you try to explain by example?
--
With Regards,
Amit Kapila.
Hi,
On Tue, Mar 19, 2024 at 04:20:35PM +0530, Amit Kapila wrote:
On Tue, Mar 19, 2024 at 3:11 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Tue, Mar 19, 2024 at 10:56:25AM +0530, Amit Kapila wrote:
On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Agree. While it makes sense to invalidate slots for wal removal in
CreateCheckPoint() (because this is the place where wal is removed), I 'm not
sure this is the right place for the 2 new cases.
Let's focus on the timeout one as proposed above (as probably the simplest one):
as this one is purely related to time and activity what about to invalidate them
when?:
- their usage resume
- in pg_get_replication_slots()
The idea is to invalidate the slot when one resumes activity on it or wants to
get information about it (and among other things wants to know if the slot is
valid or not).
Trying to invalidate at those two places makes sense to me but we
still need to cover the cases where it takes very long to resume the
slot activity and the dangling slot cases where the activity is never
resumed.
I understand it's better to have the slot reflecting its real status internally
but it is a real issue if that's not the case until the activity on it is resumed?
(just asking, not saying we should not)
Sorry, I didn't understand your point. Can you try to explain by example?
Sorry if that was not clear, let me try to rephrase it: what issue do you
see if the invalidation of such a slot occurs only when its usage resumes or
when pg_get_replication_slots() is triggered? I understand that this could
lead to the slot not being invalidated (maybe forever), but is that an issue
for an inactive slot?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon, Mar 18, 2024 at 3:02 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hm. Are you suggesting inactive_timeout to be a slot level parameter
similar to 'failover' property added recently by
c393308b69d229b664391ac583b9e07418d411b6 and
73292404370c9900a96e2bebdc7144f7010339cf?
Yeah, I have something like that in mind. You can prepare the patch
but it would be good if others involved in this thread can also share
their opinion.
I think it makes sense to put the inactive_timeout granularity at the slot
level (as the activity could vary a lot, say, between one slot linked to a
subscription and one linked to some plugins). As far as max_slot_xid_age is
concerned, I have the feeling that a new GUC is good enough.
Well, here I'm implementing the above idea. The attached v12 patches
mainly have the following changes:
1. inactive_timeout is now slot-level, that is, one can set it while
creating the slot either via SQL functions (see the sketch after this
list) or via replication commands or via subscription.
2. last_inactive_at and inactive_timeout are now tracked in on-disk
replication slot data structure.
3. last_inactive_at is now set even for non-walsenders whenever the
slot is released as opposed to initial versions of the patches setting
it only for walsenders.
4. slot's inactive_timeout parameter is now migrated to the new
cluster with pg_upgrade.
5. slot's inactive_timeout parameter is now synced to the standby when
failover is enabled for the slot.
6. Test cases are added to cover most of the above cases including new
invalidation mechanisms.
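Put together, the intended user-facing flow for the timeout part looks
roughly like the sketch below. It is based on the attached v12 patches
(inactive_timeout, last_inactive_at and invalidation_reason all come from
them), so treat it as proposed rather than committed syntax.

-- create a slot that may be invalidated after two days of inactivity
SELECT 'init' FROM pg_create_logical_replication_slot(
    slot_name := 'sub_slot',
    plugin := 'pgoutput',
    inactive_timeout := 172800);

-- later, see when it last went inactive and whether it got invalidated
SELECT slot_name, inactive_timeout, last_inactive_at, invalidation_reason
FROM pg_replication_slots
WHERE slot_name = 'sub_slot';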
Following are some open points:
1. Where to do inactive_timeout invalidation exactly if not the checkpointer.
2. Where to do XID age invalidation exactly if not the checkpointer.
3. How to go about recomputing XID horizons based on max_slot_xid_age.
Does the slot's horizon's need to be adjusted in ComputeXidHorizons()?
4. New invalidation mechanisms interaction with slot sync feature.
5. Review comments on 0001 from Bertrand.
Please see the attached v12 patches.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v12-0001-Track-invalidation_reason-in-pg_replication_slot.patch
From 7f3dcb43f32bfdddfed85381fc4daeab8d65fe2b Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 19 Mar 2024 17:20:28 +0000
Subject: [PATCH v12 1/7] Track invalidation_reason in pg_replication_slots
Up until now, the reason for replication slot invalidation is not
tracked in pg_replication_slots. A recent commit 007693f2a added
conflict_reason to show the reasons for slot invalidation, but
only for logical slots.
This commit adds a new column to show invalidation reasons for
both physical and logical slots. It also turns the conflict_reason
text column into a conflicting boolean column (effectively reverting
commit 007693f2a). One can now look at the new invalidation_reason
column for a logical slot's conflict with recovery.
---
doc/src/sgml/ref/pgupgrade.sgml | 4 +-
doc/src/sgml/system-views.sgml | 63 +++++++++++--------
src/backend/catalog/system_views.sql | 5 +-
src/backend/replication/logical/slotsync.c | 2 +-
src/backend/replication/slot.c | 8 +--
src/backend/replication/slotfuncs.c | 25 +++++---
src/bin/pg_upgrade/info.c | 4 +-
src/include/catalog/pg_proc.dat | 6 +-
src/include/replication/slot.h | 2 +-
.../t/035_standby_logical_decoding.pl | 39 ++++++------
.../t/040_standby_failover_slots_sync.pl | 4 +-
src/test/regress/expected/rules.out | 7 ++-
12 files changed, 97 insertions(+), 72 deletions(-)
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 58c6c2df8b..8de52bf752 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -453,8 +453,8 @@ make prefix=/usr/local/pgsql.new install
<para>
All slots on the old cluster must be usable, i.e., there are no slots
whose
- <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>conflict_reason</structfield>
- is not <literal>NULL</literal>.
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>conflicting</structfield>
+ is not <literal>true</literal>.
</para>
</listitem>
<listitem>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be90edd0e2..e685921847 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,34 +2525,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>conflict_reason</structfield> <type>text</type>
+ <structfield>conflicting</structfield> <type>bool</type>
</para>
<para>
- The reason for the logical slot's conflict with recovery. It is always
- NULL for physical slots, as well as for logical slots which are not
- invalidated. The non-NULL values indicate that the slot is marked
- as invalidated. Possible values are:
- <itemizedlist spacing="compact">
- <listitem>
- <para>
- <literal>wal_removed</literal> means that the required WAL has been
- removed.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>rows_removed</literal> means that the required rows have
- been removed.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>wal_level_insufficient</literal> means that the
- primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
- perform logical decoding.
- </para>
- </listitem>
- </itemizedlist>
+ True if this logical slot conflicted with recovery (and so is now
+ invalidated). When this column is true, check
+ <structfield>invalidation_reason</structfield> column for the conflict
+ reason.
</para></entry>
</row>
@@ -2581,6 +2560,38 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>invalidation_reason</structfield> <type>text</type>
+ </para>
+ <para>
+ The reason for the slot's invalidation. It is set for both logical and
+ physical slots. <literal>NULL</literal> if the slot is not invalidated.
+ Possible values are:
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ <literal>wal_removed</literal> means that the required WAL has been
+ removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>rows_removed</literal> means that the required rows have
+ been removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>wal_level_insufficient</literal> means that the
+ primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
+ perform logical decoding.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 04227a72d1..cd22dad959 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,9 +1023,10 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.conflict_reason,
+ L.conflicting,
L.failover,
- L.synced
+ L.synced,
+ L.invalidation_reason
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 7b180bdb5c..30480960c5 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -663,7 +663,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, conflict_reason"
+ " database, invalidation_reason"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 91ca397857..4f1a17f6ce 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -2356,21 +2356,21 @@ RestoreSlotFromDisk(const char *name)
}
/*
- * Maps a conflict reason for a replication slot to
+ * Maps a invalidation reason for a replication slot to
* ReplicationSlotInvalidationCause.
*/
ReplicationSlotInvalidationCause
-GetSlotInvalidationCause(const char *conflict_reason)
+GetSlotInvalidationCause(const char *invalidation_reason)
{
ReplicationSlotInvalidationCause cause;
ReplicationSlotInvalidationCause result = RS_INVAL_NONE;
bool found PG_USED_FOR_ASSERTS_ONLY = false;
- Assert(conflict_reason);
+ Assert(invalidation_reason);
for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
{
- if (strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0)
+ if (strcmp(SlotInvalidationCauses[cause], invalidation_reason) == 0)
{
found = true;
result = cause;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index ad79e1fccd..b5a638edea 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 17
+#define PG_GET_REPLICATION_SLOTS_COLS 18
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -263,6 +263,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
bool nulls[PG_GET_REPLICATION_SLOTS_COLS];
WALAvailability walstate;
int i;
+ ReplicationSlotInvalidationCause cause;
if (!slot->in_use)
continue;
@@ -409,22 +410,32 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
- if (slot_contents.data.database == InvalidOid)
+ cause = slot_contents.data.invalidated;
+
+ if (SlotIsPhysical(&slot_contents))
nulls[i++] = true;
else
{
- ReplicationSlotInvalidationCause cause = slot_contents.data.invalidated;
-
- if (cause == RS_INVAL_NONE)
- nulls[i++] = true;
+ /*
+ * rows_removed and wal_level_insufficient are only two reasons
+ * for the logical slot's conflict with recovery.
+ */
+ if (cause == RS_INVAL_HORIZON ||
+ cause == RS_INVAL_WAL_LEVEL)
+ values[i++] = BoolGetDatum(true);
else
- values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ values[i++] = BoolGetDatum(false);
}
values[i++] = BoolGetDatum(slot_contents.data.failover);
values[i++] = BoolGetDatum(slot_contents.data.synced);
+ if (cause == RS_INVAL_NONE)
+ nulls[i++] = true;
+ else
+ values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index b5b8d11602..34a157f792 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -676,13 +676,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
"temporary IS FALSE;",
live_check ? "FALSE" :
- "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
+ "(CASE WHEN conflicting THEN FALSE "
"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
"END)");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 177d81a891..1689009d4f 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11130,9 +11130,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 425effad21..7f25a083ee 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -273,7 +273,7 @@ extern void CheckPointReplicationSlots(bool is_shutdown);
extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
- GetSlotInvalidationCause(const char *conflict_reason);
+ GetSlotInvalidationCause(const char *invalidation_reason);
extern bool SlotExistsInStandbySlotNames(const char *slot_name);
extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 88b03048c4..2203841ca1 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,7 +168,7 @@ sub change_hot_standby_feedback_and_wait_for_xmins
}
}
-# Check conflict_reason in pg_replication_slots.
+# Check reason for conflict in pg_replication_slots.
sub check_slots_conflict_reason
{
my ($slot_prefix, $reason) = @_;
@@ -178,15 +178,15 @@ sub check_slots_conflict_reason
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$active_slot' and conflicting;));
- is($res, "$reason", "$active_slot conflict_reason is $reason");
+ is($res, "$reason", "$active_slot reason for conflict is $reason");
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$inactive_slot' and conflicting;));
- is($res, "$reason", "$inactive_slot conflict_reason is $reason");
+ is($res, "$reason", "$inactive_slot reason for conflict is $reason");
}
# Drop the slots, re-create them, change hot_standby_feedback,
@@ -293,13 +293,13 @@ $node_primary->safe_psql('testdb',
qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
);
-# Check conflict_reason is NULL for physical slot
+# Check conflicting is NULL for physical slot
$res = $node_primary->safe_psql(
'postgres', qq[
- SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+ SELECT conflicting is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
);
-is($res, 't', "Physical slot reports conflict_reason as NULL");
+is($res, 't', "Physical slot reports conflicting as NULL");
my $backup_name = 'b1';
$node_primary->backup($backup_name);
@@ -524,7 +524,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Ensure that replication slot stats are not removed after invalidation.
@@ -551,7 +551,7 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
##################################################
$node_standby->restart;
-# Verify conflict_reason is retained across a restart.
+# Verify reason for conflict is retained across a restart.
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
##################################################
@@ -560,7 +560,8 @@ check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Get the restart_lsn from an invalidated slot
my $restart_lsn = $node_standby->safe_psql('postgres',
- "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
+ "SELECT restart_lsn FROM pg_replication_slots
+ WHERE slot_name = 'vacuum_full_activeslot' AND conflicting;"
);
chomp($restart_lsn);
@@ -611,7 +612,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('row_removal_', 'rows_removed');
$handle =
@@ -647,7 +648,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
check_for_invalidation('shared_row_removal_', $logstart,
'with vacuum on pg_authid');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
$handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
@@ -696,14 +697,14 @@ ok( $node_standby->poll_query_until(
'confl_active_logicalslot not updated'
) or die "Timed out waiting confl_active_logicalslot to be updated";
-# Verify slots are reported as non conflicting in pg_replication_slots
+# Verify slots are reported as valid in pg_replication_slots
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(conflicting) from
- (select conflict_reason is not NULL as conflicting
- from pg_replication_slots WHERE slot_type = 'logical')]),
+ (select conflicting from pg_replication_slots
+ where slot_type = 'logical')]),
'f',
- 'Logical slots are reported as non conflicting');
+ 'Logical slots are reported as valid');
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 0);
@@ -739,7 +740,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('pruning_', 'rows_removed');
$handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
@@ -783,7 +784,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
-# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+# Verify reason for conflict is 'wal_level_insufficient' in pg_replication_slots
check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
$handle =
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 0ea1f3d323..f47bfd78eb 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -228,7 +228,7 @@ $standby1->safe_psql('postgres', "CHECKPOINT");
# Check if the synced slot is invalidated
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'synchronized slot has been invalidated');
@@ -274,7 +274,7 @@ $standby1->wait_for_log(qr/dropped replication slot "lsub1_slot" of dbid [0-9]+/
# flagged as 'synced'
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'logical slot is re-synced');
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 84e359f6ed..19c44c0cb7 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,10 +1473,11 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.conflict_reason,
+ l.conflicting,
l.failover,
- l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced)
+ l.synced,
+ l.invalidation_reason
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v12-0002-Track-last_inactive_at-for-replication-slots.patch
From 9e688023aa968cefea5bd5e1d7f0b763b52ee22a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 19 Mar 2024 17:20:45 +0000
Subject: [PATCH v12 2/7] Track last_inactive_at for replication slots
Up until now, postgres doesn't track the time at which a slot
became inactive. This commit adds a new metric, last_inactive_at,
of type timestamptz to ReplicationSlotPersistentData. Whenever a
slot becomes inactive, the current timestamp is persisted to disk.
This metric is useful in the following ways:
- To improve replication slot monitoring tools. For instance, one
can build a monitoring tool that signals when a replication slot
has been lying inactive for a day or so using the last_inactive_at
metric.
- To implement timeout-based inactive replication slot management
capability in postgres.
Increases SLOT_VERSION due to the newly added metric.
---
doc/src/sgml/system-views.sgml | 11 +++++++++++
src/backend/catalog/system_views.sql | 3 ++-
src/backend/replication/slot.c | 25 ++++++++++++++++++++++++-
src/backend/replication/slotfuncs.c | 7 ++++++-
src/include/catalog/pg_proc.dat | 6 +++---
src/include/replication/slot.h | 3 +++
src/test/regress/expected/rules.out | 5 +++--
7 files changed, 52 insertions(+), 8 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index e685921847..ab43032d74 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2750,6 +2750,17 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
ID of role
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_at</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index cd22dad959..2fa4272006 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1026,7 +1026,8 @@ CREATE VIEW pg_replication_slots AS
L.conflicting,
L.failover,
L.synced,
- L.invalidation_reason
+ L.invalidation_reason,
+ L.last_inactive_at
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 4f1a17f6ce..fa40c5f4f1 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -129,7 +129,7 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 5 /* version for new files */
+#define SLOT_VERSION 6 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -398,6 +398,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
slot->data.synced = synced;
+ slot->data.last_inactive_at = 0;
/* and then data only present in shared memory */
slot->just_dirtied = false;
@@ -622,6 +623,17 @@ retry:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->data.last_inactive_at = 0;
+ SpinLockRelease(&s->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
if (am_walsender)
{
ereport(log_replication_commands ? LOG : DEBUG1,
@@ -691,6 +703,17 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->data.last_inactive_at = GetCurrentTimestamp();
+ SpinLockRelease(&slot->mutex);
+
+ /* Write this slot to disk */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }
+
MyReplicationSlot = NULL;
/* might not have been set when we've been a plain slot */
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index b5a638edea..95802bf2c9 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 18
+#define PG_GET_REPLICATION_SLOTS_COLS 19
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -436,6 +436,11 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
else
values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ if (slot_contents.data.last_inactive_at > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.data.last_inactive_at);
+ else
+ nulls[i++] = true;
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 1689009d4f..318bd2abc8 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11130,9 +11130,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text,timestamptz}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason,last_inactive_at}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7f25a083ee..10f8ba67bc 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -127,6 +127,9 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* When did this slot become inactive last time? */
+ TimestampTz last_inactive_at;
} ReplicationSlotPersistentData;
/*
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 19c44c0cb7..88fbd6a53c 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1476,8 +1476,9 @@ pg_replication_slots| SELECT l.slot_name,
l.conflicting,
l.failover,
l.synced,
- l.invalidation_reason
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason)
+ l.invalidation_reason,
+ l.last_inactive_at
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason, last_inactive_at)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v12-0003-Allow-setting-inactive_timeout-for-replication-s.patch
From 6b119a89b400867a99f70ee414426c61e6929552 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 19 Mar 2024 17:31:09 +0000
Subject: [PATCH v12 3/7] Allow setting inactive_timeout for replication slots
via SQL API
---
contrib/test_decoding/expected/slot.out | 102 ++++++++++++++++++
contrib/test_decoding/sql/slot.sql | 34 ++++++
doc/src/sgml/func.sgml | 18 ++--
doc/src/sgml/system-views.sgml | 9 ++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 3 +-
src/backend/replication/logical/slotsync.c | 17 ++-
src/backend/replication/slot.c | 20 +++-
src/backend/replication/slotfuncs.c | 31 +++++-
src/backend/replication/walsender.c | 4 +-
src/bin/pg_upgrade/info.c | 6 +-
src/bin/pg_upgrade/pg_upgrade.c | 5 +-
src/bin/pg_upgrade/pg_upgrade.h | 2 +
src/include/catalog/pg_proc.dat | 22 ++--
src/include/replication/slot.h | 5 +-
.../t/040_standby_failover_slots_sync.pl | 11 +-
src/test/regress/expected/rules.out | 5 +-
17 files changed, 257 insertions(+), 39 deletions(-)
diff --git a/contrib/test_decoding/expected/slot.out b/contrib/test_decoding/expected/slot.out
index 349ab2d380..6771520afb 100644
--- a/contrib/test_decoding/expected/slot.out
+++ b/contrib/test_decoding/expected/slot.out
@@ -466,3 +466,105 @@ SELECT pg_drop_replication_slot('physical_slot');
(1 row)
+-- Test negative value for inactive_timeout option for slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', inactive_timeout := -300); -- error
+ERROR: "inactive_timeout" must not be negative
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', inactive_timeout := -600); -- error
+ERROR: "inactive_timeout" must not be negative
+-- Test inactive_timeout option for temporary slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', temporary := true, inactive_timeout := 300); -- error
+ERROR: cannot set inactive_timeout for a temporary replication slot
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', temporary := true, inactive_timeout := 600); -- error
+ERROR: cannot set inactive_timeout for a temporary replication slot
+-- Test inactive_timeout option of physical slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot1', immediately_reserve := true, inactive_timeout := 300);
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot2');
+ ?column?
+----------
+ init
+(1 row)
+
+-- Copy physical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_slot1', dst_slot_name := 'it_phy_slot3');
+ ?column?
+----------
+ copy
+(1 row)
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+ slot_name | slot_type | inactive_timeout
+--------------+-----------+------------------
+ it_phy_slot1 | physical | 300
+ it_phy_slot2 | physical | 0
+ it_phy_slot3 | physical | 300
+(3 rows)
+
+SELECT pg_drop_replication_slot('it_phy_slot1');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_phy_slot2');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_phy_slot3');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+-- Test inactive_timeout option of logical slots.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot2', plugin := 'test_decoding');
+ ?column?
+----------
+ init
+(1 row)
+
+-- Copy logical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slot1', dst_slot_name := 'it_log_slot3');
+ ?column?
+----------
+ copy
+(1 row)
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+ slot_name | slot_type | inactive_timeout
+--------------+-----------+------------------
+ it_log_slot1 | logical | 600
+ it_log_slot2 | logical | 0
+ it_log_slot3 | logical | 600
+(3 rows)
+
+SELECT pg_drop_replication_slot('it_log_slot1');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_log_slot2');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_log_slot3');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
diff --git a/contrib/test_decoding/sql/slot.sql b/contrib/test_decoding/sql/slot.sql
index 580e3ae3be..443e91da07 100644
--- a/contrib/test_decoding/sql/slot.sql
+++ b/contrib/test_decoding/sql/slot.sql
@@ -190,3 +190,37 @@ SELECT pg_drop_replication_slot('failover_true_slot');
SELECT pg_drop_replication_slot('failover_false_slot');
SELECT pg_drop_replication_slot('failover_default_slot');
SELECT pg_drop_replication_slot('physical_slot');
+
+-- Test negative value for inactive_timeout option for slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', inactive_timeout := -300); -- error
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', inactive_timeout := -600); -- error
+
+-- Test inactive_timeout option for temporary slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', temporary := true, inactive_timeout := 300); -- error
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', temporary := true, inactive_timeout := 600); -- error
+
+-- Test inactive_timeout option of physical slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot1', immediately_reserve := true, inactive_timeout := 300);
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot2');
+
+-- Copy physical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_slot1', dst_slot_name := 'it_phy_slot3');
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+
+SELECT pg_drop_replication_slot('it_phy_slot1');
+SELECT pg_drop_replication_slot('it_phy_slot2');
+SELECT pg_drop_replication_slot('it_phy_slot3');
+
+-- Test inactive_timeout option of logical slots.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot2', plugin := 'test_decoding');
+
+-- Copy logical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slot1', dst_slot_name := 'it_log_slot3');
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+
+SELECT pg_drop_replication_slot('it_log_slot1');
+SELECT pg_drop_replication_slot('it_log_slot2');
+SELECT pg_drop_replication_slot('it_log_slot3');
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5b225ccf4f..0ece7c8d3d 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28148,7 +28148,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<indexterm>
<primary>pg_create_physical_replication_slot</primary>
</indexterm>
- <function>pg_create_physical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type> <optional>, <parameter>immediately_reserve</parameter> <type>boolean</type>, <parameter>temporary</parameter> <type>boolean</type> </optional> )
+ <function>pg_create_physical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type> <optional>, <parameter>immediately_reserve</parameter> <type>boolean</type>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>inactive_timeout</parameter> <type>integer</type> </optional>)
<returnvalue>record</returnvalue>
( <parameter>slot_name</parameter> <type>name</type>,
<parameter>lsn</parameter> <type>pg_lsn</type> )
@@ -28165,9 +28165,12 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
parameter, <parameter>temporary</parameter>, when set to true, specifies that
the slot should not be permanently stored to disk and is only meant
for use by the current session. Temporary slots are also
- released upon any error. This function corresponds
- to the replication protocol command <literal>CREATE_REPLICATION_SLOT
- ... PHYSICAL</literal>.
+ released upon any error. The optional fourth
+ parameter, <parameter>inactive_timeout</parameter>, when set to a
+ non-zero value, specifies the amount of time in seconds the slot is
+ allowed to be inactive. This function corresponds to the replication
+ protocol command
+ <literal>CREATE_REPLICATION_SLOT ... PHYSICAL</literal>.
</para></entry>
</row>
@@ -28192,7 +28195,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<indexterm>
<primary>pg_create_logical_replication_slot</primary>
</indexterm>
- <function>pg_create_logical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type>, <parameter>plugin</parameter> <type>name</type> <optional>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>twophase</parameter> <type>boolean</type>, <parameter>failover</parameter> <type>boolean</type> </optional> )
+ <function>pg_create_logical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type>, <parameter>plugin</parameter> <type>name</type> <optional>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>twophase</parameter> <type>boolean</type>, <parameter>failover</parameter> <type>boolean</type>, <parameter>inactive_timeout</parameter> <type>integer</type> </optional> )
<returnvalue>record</returnvalue>
( <parameter>slot_name</parameter> <type>name</type>,
<parameter>lsn</parameter> <type>pg_lsn</type> )
@@ -28211,7 +28214,10 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<parameter>failover</parameter>, when set to true,
specifies that this slot is enabled to be synced to the
standbys so that logical replication can be resumed after
- failover. A call to this function has the same effect as
+ failover. The optional sixth parameter,
+ <parameter>inactive_timeout</parameter>, when set to a
+ non-zero value, specifies the amount of time in seconds the slot is
+ allowed to be inactive. A call to this function has the same effect as
the replication protocol command
<literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
</para></entry>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index ab43032d74..f413c819de 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2761,6 +2761,15 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
used.
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_timeout</structfield> <type>integer</type>
+ </para>
+ <para>
+ The amount of time in seconds the slot is allowed to be inactive.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index fe2bb50f46..af27616657 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -469,6 +469,7 @@ AS 'pg_logical_emit_message_bytea';
CREATE OR REPLACE FUNCTION pg_create_physical_replication_slot(
IN slot_name name, IN immediately_reserve boolean DEFAULT false,
IN temporary boolean DEFAULT false,
+ IN inactive_timeout int DEFAULT 0,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
@@ -480,6 +481,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
IN temporary boolean DEFAULT false,
IN twophase boolean DEFAULT false,
IN failover boolean DEFAULT false,
+ IN inactive_timeout int DEFAULT 0,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2fa4272006..a43048ae93 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1027,7 +1027,8 @@ CREATE VIEW pg_replication_slots AS
L.failover,
L.synced,
L.invalidation_reason,
- L.last_inactive_at
+ L.last_inactive_at,
+ L.inactive_timeout
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..c01876ceeb 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -131,6 +131,7 @@ typedef struct RemoteSlot
char *database;
bool two_phase;
bool failover;
+ int inactive_timeout;
XLogRecPtr restart_lsn;
XLogRecPtr confirmed_lsn;
TransactionId catalog_xmin;
@@ -167,7 +168,8 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
remote_slot->two_phase == slot->data.two_phase &&
remote_slot->failover == slot->data.failover &&
remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
- strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
+ strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 &&
+ remote_slot->inactive_timeout == slot->data.inactive_timeout)
return false;
/* Avoid expensive operations while holding a spinlock. */
@@ -182,6 +184,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
slot->data.confirmed_flush = remote_slot->confirmed_lsn;
slot->data.catalog_xmin = remote_slot->catalog_xmin;
slot->effective_catalog_xmin = remote_slot->catalog_xmin;
+ slot->data.inactive_timeout = remote_slot->inactive_timeout;
SpinLockRelease(&slot->mutex);
if (xmin_changed)
@@ -607,7 +610,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
- true);
+ true, 0);
/* For shorter lines. */
slot = MyReplicationSlot;
@@ -627,6 +630,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
SpinLockAcquire(&slot->mutex);
slot->effective_catalog_xmin = xmin_horizon;
slot->data.catalog_xmin = xmin_horizon;
+ slot->data.inactive_timeout = remote_slot->inactive_timeout;
SpinLockRelease(&slot->mutex);
ReplicationSlotsComputeRequiredXmin(true);
LWLockRelease(ProcArrayLock);
@@ -652,9 +656,9 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
static bool
synchronize_slots(WalReceiverConn *wrconn)
{
-#define SLOTSYNC_COLUMN_COUNT 9
+#define SLOTSYNC_COLUMN_COUNT 10
Oid slotRow[SLOTSYNC_COLUMN_COUNT] = {TEXTOID, TEXTOID, LSNOID,
- LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID};
+ LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID, INT4OID};
WalRcvExecResult *res;
TupleTableSlot *tupslot;
@@ -663,7 +667,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, invalidation_reason"
+ " database, invalidation_reason, inactive_timeout"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
@@ -743,6 +747,9 @@ synchronize_slots(WalReceiverConn *wrconn)
remote_slot->invalidated = isnull ? RS_INVAL_NONE :
GetSlotInvalidationCause(TextDatumGetCString(d));
+ remote_slot->inactive_timeout = DatumGetInt32(slot_getattr(tupslot, ++col,
+ &isnull));
+
/* Sanity check */
Assert(col == SLOTSYNC_COLUMN_COUNT);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fa40c5f4f1..071e960ec7 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -129,7 +129,7 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 6 /* version for new files */
+#define SLOT_VERSION 7 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -304,11 +304,14 @@ ReplicationSlotValidateName(const char *name, int elevel)
* failover: If enabled, allows the slot to be synced to standbys so
* that logical replication can be resumed after failover.
* synced: True if the slot is synchronized from the primary server.
+ * inactive_timeout: The amount of time in seconds the slot is allowed to be
+ * inactive.
*/
void
ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover, bool synced)
+ bool two_phase, bool failover, bool synced,
+ int inactive_timeout)
{
ReplicationSlot *slot = NULL;
int i;
@@ -345,6 +348,18 @@ ReplicationSlotCreate(const char *name, bool db_specific,
errmsg("cannot enable failover for a temporary replication slot"));
}
+ if (inactive_timeout > 0)
+ {
+ /*
+ * Do not allow users to set inactive_timeout for temporary slots,
+ * because temporary slots will not be written to the disk.
+ */
+ if (persistency == RS_TEMPORARY)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot set inactive_timeout for a temporary replication slot"));
+ }
+
/*
* If some other backend ran this code concurrently with us, we'd likely
* both allocate the same slot, and that would be bad. We'd also be at
@@ -399,6 +414,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.failover = failover;
slot->data.synced = synced;
slot->data.last_inactive_at = 0;
+ slot->data.inactive_timeout = inactive_timeout;
/* and then data only present in shared memory */
slot->just_dirtied = false;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 95802bf2c9..66c3e97c84 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -38,14 +38,15 @@
*/
static void
create_physical_replication_slot(char *name, bool immediately_reserve,
- bool temporary, XLogRecPtr restart_lsn)
+ bool temporary, int inactive_timeout,
+ XLogRecPtr restart_lsn)
{
Assert(!MyReplicationSlot);
/* acquire replication slot, this will check for conflicting names */
ReplicationSlotCreate(name, false,
temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
- false, false);
+ false, false, inactive_timeout);
if (immediately_reserve)
{
@@ -71,6 +72,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
Name name = PG_GETARG_NAME(0);
bool immediately_reserve = PG_GETARG_BOOL(1);
bool temporary = PG_GETARG_BOOL(2);
+ int inactive_timeout = PG_GETARG_INT32(3);
Datum values[2];
bool nulls[2];
TupleDesc tupdesc;
@@ -84,9 +86,15 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
CheckSlotRequirements();
+ if (inactive_timeout < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"inactive_timeout\" must not be negative")));
+
create_physical_replication_slot(NameStr(*name),
immediately_reserve,
temporary,
+ inactive_timeout,
InvalidXLogRecPtr);
values[0] = NameGetDatum(&MyReplicationSlot->data.name);
@@ -120,7 +128,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
static void
create_logical_replication_slot(char *name, char *plugin,
bool temporary, bool two_phase,
- bool failover,
+ bool failover, int inactive_timeout,
XLogRecPtr restart_lsn,
bool find_startpoint)
{
@@ -138,7 +146,7 @@ create_logical_replication_slot(char *name, char *plugin,
*/
ReplicationSlotCreate(name, true,
temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
- failover, false);
+ failover, false, inactive_timeout);
/*
* Create logical decoding context to find start point or, if we don't
@@ -177,6 +185,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
bool temporary = PG_GETARG_BOOL(2);
bool two_phase = PG_GETARG_BOOL(3);
bool failover = PG_GETARG_BOOL(4);
+ int inactive_timeout = PG_GETARG_INT32(5);
Datum result;
TupleDesc tupdesc;
HeapTuple tuple;
@@ -190,11 +199,17 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
CheckLogicalDecodingRequirements();
+ if (inactive_timeout < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"inactive_timeout\" must not be negative")));
+
create_logical_replication_slot(NameStr(*name),
NameStr(*plugin),
temporary,
two_phase,
failover,
+ inactive_timeout,
InvalidXLogRecPtr,
true);
@@ -239,7 +254,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 19
+#define PG_GET_REPLICATION_SLOTS_COLS 20
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -441,6 +456,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
else
nulls[i++] = true;
+ values[i++] = Int32GetDatum(slot_contents.data.inactive_timeout);
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
@@ -720,6 +737,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
XLogRecPtr src_restart_lsn;
bool src_islogical;
bool temporary;
+ int inactive_timeout;
char *plugin;
Datum values[2];
bool nulls[2];
@@ -776,6 +794,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
src_restart_lsn = first_slot_contents.data.restart_lsn;
temporary = (first_slot_contents.data.persistency == RS_TEMPORARY);
plugin = logical_slot ? NameStr(first_slot_contents.data.plugin) : NULL;
+ inactive_timeout = first_slot_contents.data.inactive_timeout;
/* Check type of replication slot */
if (src_islogical != logical_slot)
@@ -823,6 +842,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
temporary,
false,
false,
+ inactive_timeout,
src_restart_lsn,
false);
}
@@ -830,6 +850,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
create_physical_replication_slot(NameStr(*dst_name),
true,
temporary,
+ inactive_timeout,
src_restart_lsn);
/*
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bc40c454de..5315c08650 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1221,7 +1221,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false);
+ false, false, false, 0);
if (reserve_wal)
{
@@ -1252,7 +1252,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false);
+ two_phase, failover, false, 0);
/*
* Do options check early so that we can bail before calling the
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 34a157f792..6817e9be67 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -676,7 +676,8 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid, "
+ "inactive_timeout "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
@@ -696,6 +697,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
int i_failover;
int i_caught_up;
int i_invalid;
+ int i_inactive_timeout;
slotinfos = (LogicalSlotInfo *) pg_malloc(sizeof(LogicalSlotInfo) * num_slots);
@@ -705,6 +707,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
i_failover = PQfnumber(res, "failover");
i_caught_up = PQfnumber(res, "caught_up");
i_invalid = PQfnumber(res, "invalid");
+ i_inactive_timeout = PQfnumber(res, "inactive_timeout");
for (int slotnum = 0; slotnum < num_slots; slotnum++)
{
@@ -716,6 +719,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
curr->failover = (strcmp(PQgetvalue(res, slotnum, i_failover), "t") == 0);
curr->caught_up = (strcmp(PQgetvalue(res, slotnum, i_caught_up), "t") == 0);
curr->invalid = (strcmp(PQgetvalue(res, slotnum, i_invalid), "t") == 0);
+ curr->inactive_timeout = atoi(PQgetvalue(res, slotnum, i_inactive_timeout));
}
}
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index f6143b6bc4..2656056103 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -931,9 +931,10 @@ create_logical_replication_slots(void)
appendPQExpBuffer(query, ", ");
appendStringLiteralConn(query, slot_info->plugin, conn);
- appendPQExpBuffer(query, ", false, %s, %s);",
+ appendPQExpBuffer(query, ", false, %s, %s, %d);",
slot_info->two_phase ? "true" : "false",
- slot_info->failover ? "true" : "false");
+ slot_info->failover ? "true" : "false",
+ slot_info->inactive_timeout);
PQclear(executeQueryOrDie(conn, "%s", query->data));
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 92bcb693fb..eb86d000b1 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -162,6 +162,8 @@ typedef struct
bool invalid; /* if true, the slot is unusable */
bool failover; /* is the slot designated to be synced to the
* physical standby? */
+ int inactive_timeout; /* The amount of time in seconds the slot
+ * is allowed to be inactive. */
} LogicalSlotInfo;
typedef struct
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 318bd2abc8..61c8d6b267 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11102,10 +11102,10 @@
# replication slots
{ oid => '3779', descr => 'create a physical replication slot',
proname => 'pg_create_physical_replication_slot', provolatile => 'v',
- proparallel => 'u', prorettype => 'record', proargtypes => 'name bool bool',
- proallargtypes => '{name,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,o,o}',
- proargnames => '{slot_name,immediately_reserve,temporary,slot_name,lsn}',
+ proparallel => 'u', prorettype => 'record', proargtypes => 'name bool bool int4',
+ proallargtypes => '{name,bool,bool,int4,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,o,o}',
+ proargnames => '{slot_name,immediately_reserve,temporary,inactive_timeout,slot_name,lsn}',
prosrc => 'pg_create_physical_replication_slot' },
{ oid => '4220',
descr => 'copy a physical replication slot, changing temporality',
@@ -11130,17 +11130,17 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text,timestamptz}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason,last_inactive_at}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text,timestamptz,int4}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason,last_inactive_at,inactive_timeout}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u', prorettype => 'record',
- proargtypes => 'name name bool bool bool',
- proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,i,i,o,o}',
- proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
+ proargtypes => 'name name bool bool bool int4',
+ proallargtypes => '{name,name,bool,bool,bool,int4,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,i,i,o,o}',
+ proargnames => '{slot_name,plugin,temporary,twophase,failover,inactive_timeout,slot_name,lsn}',
prosrc => 'pg_create_logical_replication_slot' },
{ oid => '4222',
descr => 'copy a logical replication slot, changing temporality and plugin',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 10f8ba67bc..3f57166b61 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -130,6 +130,9 @@ typedef struct ReplicationSlotPersistentData
/* When did this slot become inactive last time? */
TimestampTz last_inactive_at;
+
+ /* The amount of time in seconds the slot is allowed to be inactive. */
+ int inactive_timeout;
} ReplicationSlotPersistentData;
/*
@@ -239,7 +242,7 @@ extern void ReplicationSlotsShmemInit(void);
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
bool two_phase, bool failover,
- bool synced);
+ bool synced, int inactive_timeout);
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index f47bfd78eb..e4e244effb 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -153,7 +153,7 @@ $primary->append_conf('postgresql.conf', "log_min_messages = 'debug2'");
$primary->reload;
$primary->psql('postgres',
- q{SELECT pg_create_logical_replication_slot('lsub2_slot', 'test_decoding', false, false, true);}
+ q{SELECT pg_create_logical_replication_slot('lsub2_slot', 'test_decoding', false, false, true, 3600);}
);
$primary->psql('postgres',
@@ -190,6 +190,15 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Confirm that the synced slot on the standby has got inactive_timeout from the
+# primary.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT inactive_timeout FROM pg_replication_slots WHERE slot_name = 'lsub2_slot' AND synced AND NOT temporary;}
+ ),
+ "3600",
+ 'synced logical slot has got inactive_timeout on standby');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 88fbd6a53c..1c683ceaca 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1477,8 +1477,9 @@ pg_replication_slots| SELECT l.slot_name,
l.failover,
l.synced,
l.invalidation_reason,
- l.last_inactive_at
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason, last_inactive_at)
+ l.last_inactive_at,
+ l.inactive_timeout
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason, last_inactive_at, inactive_timeout)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
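For quick reference, here is a minimal SQL-level sketch of how the inactive_timeout
option added by the patch above is expected to be used once applied; the slot names
and timeout values are illustrative only:

-- physical slot that may stay inactive for at most 300 seconds
SELECT 'init' FROM pg_create_physical_replication_slot('demo_phy_slot',
    immediately_reserve := true, inactive_timeout := 300);

-- logical slot with a 600-second inactive timeout
SELECT 'init' FROM pg_create_logical_replication_slot('demo_log_slot',
    'test_decoding', inactive_timeout := 600);

-- the configured timeout is exposed in pg_replication_slots
SELECT slot_name, slot_type, inactive_timeout
FROM pg_replication_slots ORDER BY 1;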
v12-0004-Allow-setting-inactive_timeout-in-the-replicatio.patchapplication/octet-stream; name=v12-0004-Allow-setting-inactive_timeout-in-the-replicatio.patchDownload
From 5a4f06c766177f6f3e94a80b6a9458173ea5c762 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 19 Mar 2024 17:32:46 +0000
Subject: [PATCH v12 4/7] Allow setting inactive_timeout in the replication
command
---
doc/src/sgml/protocol.sgml | 20 ++++++
src/backend/commands/subscriptioncmds.c | 6 +-
.../libpqwalreceiver/libpqwalreceiver.c | 61 ++++++++++++++++---
src/backend/replication/logical/tablesync.c | 1 +
src/backend/replication/slot.c | 30 ++++++++-
src/backend/replication/walreceiver.c | 2 +-
src/backend/replication/walsender.c | 38 +++++++++---
src/include/replication/slot.h | 3 +-
src/include/replication/walreceiver.h | 11 ++--
src/test/recovery/t/001_stream_rep.pl | 50 +++++++++++++++
10 files changed, 195 insertions(+), 27 deletions(-)
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index a5cb19357f..2ffa1b470a 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2068,6 +2068,16 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>INACTIVE_TIMEOUT [ <replaceable class="parameter">integer</replaceable> ]</literal></term>
+ <listitem>
+ <para>
+ If set to a non-zero value, specifies the amount of time in seconds
+ the slot is allowed to be inactive. The default is zero.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
<para>
@@ -2168,6 +2178,16 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>INACTIVE_TIMEOUT [ <replaceable class="parameter">integer</replaceable> ]</literal></term>
+ <listitem>
+ <para>
+ If set to a non-zero value, specifies the amount of time in seconds
+ the slot is allowed to be inactive. The default is zero.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</listitem>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 5a47fa984d..4562de49c4 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -827,7 +827,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
twophase_enabled = true;
walrcv_create_slot(wrconn, opts.slot_name, false, twophase_enabled,
- opts.failover, CRS_NOEXPORT_SNAPSHOT, NULL);
+ opts.failover, 0, CRS_NOEXPORT_SNAPSHOT, NULL);
if (twophase_enabled)
UpdateTwoPhaseState(subid, LOGICALREP_TWOPHASE_STATE_ENABLED);
@@ -849,7 +849,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
else if (opts.slot_name &&
(opts.failover || walrcv_server_version(wrconn) >= 170000))
{
- walrcv_alter_slot(wrconn, opts.slot_name, opts.failover);
+ walrcv_alter_slot(wrconn, opts.slot_name, &opts.failover, NULL);
}
}
PG_FINALLY();
@@ -1541,7 +1541,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
PG_TRY();
{
- walrcv_alter_slot(wrconn, sub->slotname, opts.failover);
+ walrcv_alter_slot(wrconn, sub->slotname, &opts.failover, NULL);
}
PG_FINALLY();
{
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 761bf0f677..126250a076 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -77,10 +77,11 @@ static char *libpqrcv_create_slot(WalReceiverConn *conn,
bool temporary,
bool two_phase,
bool failover,
+ int inactive_timeout,
CRSSnapshotAction snapshot_action,
XLogRecPtr *lsn);
static void libpqrcv_alter_slot(WalReceiverConn *conn, const char *slotname,
- bool failover);
+ bool *failover, int *inactive_timeout);
static pid_t libpqrcv_get_backend_pid(WalReceiverConn *conn);
static WalRcvExecResult *libpqrcv_exec(WalReceiverConn *conn,
const char *query,
@@ -1008,7 +1009,8 @@ libpqrcv_send(WalReceiverConn *conn, const char *buffer, int nbytes)
*/
static char *
libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname,
- bool temporary, bool two_phase, bool failover,
+ bool temporary, bool two_phase,
+ bool failover, int inactive_timeout,
CRSSnapshotAction snapshot_action, XLogRecPtr *lsn)
{
PGresult *res;
@@ -1048,6 +1050,15 @@ libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname,
appendStringInfoChar(&cmd, ' ');
}
+ if (inactive_timeout > 0)
+ {
+ appendStringInfo(&cmd, "INACTIVE_TIMEOUT %d", inactive_timeout);
+ if (use_new_options_syntax)
+ appendStringInfoString(&cmd, ", ");
+ else
+ appendStringInfoChar(&cmd, ' ');
+ }
+
if (use_new_options_syntax)
{
switch (snapshot_action)
@@ -1084,10 +1095,24 @@ libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname,
}
else
{
+ appendStringInfoString(&cmd, " PHYSICAL ");
if (use_new_options_syntax)
- appendStringInfoString(&cmd, " PHYSICAL (RESERVE_WAL)");
- else
- appendStringInfoString(&cmd, " PHYSICAL RESERVE_WAL");
+ appendStringInfoChar(&cmd, '(');
+
+ appendStringInfoString(&cmd, "RESERVE_WAL");
+
+ if (inactive_timeout > 0)
+ {
+ if (use_new_options_syntax)
+ appendStringInfoString(&cmd, ", ");
+ else
+ appendStringInfoChar(&cmd, ' ');
+
+ appendStringInfo(&cmd, "INACTIVE_TIMEOUT %d", inactive_timeout);
+ }
+
+ if (use_new_options_syntax)
+ appendStringInfoChar(&cmd, ')');
}
res = libpqrcv_PQexec(conn->streamConn, cmd.data);
@@ -1121,15 +1146,33 @@ libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname,
*/
static void
libpqrcv_alter_slot(WalReceiverConn *conn, const char *slotname,
- bool failover)
+ bool *failover, int *inactive_timeout)
{
StringInfoData cmd;
PGresult *res;
+ bool specified_prev_opt = false;
initStringInfo(&cmd);
- appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s ( FAILOVER %s )",
- quote_identifier(slotname),
- failover ? "true" : "false");
+ appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s (",
+ quote_identifier(slotname));
+
+ if (failover != NULL)
+ {
+ appendStringInfo(&cmd, "FAILOVER %s",
+ *failover ? "true" : "false");
+ specified_prev_opt = true;
+ }
+
+ if (inactive_timeout != NULL)
+ {
+ if (specified_prev_opt)
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfo(&cmd, "INACTIVE_TIMEOUT %d", *inactive_timeout);
+ specified_prev_opt = true;
+ }
+
+ appendStringInfoChar(&cmd, ')');
res = libpqrcv_PQexec(conn->streamConn, cmd.data);
pfree(cmd.data);
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 1061d5b61b..59f8e5fbaa 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1431,6 +1431,7 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
walrcv_create_slot(LogRepWorkerWalRcvConn,
slotname, false /* permanent */ , false /* two_phase */ ,
MySubscription->failover,
+ 0,
CRS_USE_SNAPSHOT, origin_startpos);
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 071e960ec7..35a186f4bc 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -816,8 +816,10 @@ ReplicationSlotDrop(const char *name, bool nowait)
* Change the definition of the slot identified by the specified name.
*/
void
-ReplicationSlotAlter(const char *name, bool failover)
+ReplicationSlotAlter(const char *name, bool failover, int inactive_timeout)
{
+ bool lock_acquired;
+
Assert(MyReplicationSlot == NULL);
ReplicationSlotAcquire(name, false);
@@ -860,10 +862,36 @@ ReplicationSlotAlter(const char *name, bool failover)
errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot enable failover for a temporary replication slot"));
+ /*
+ * Do not allow users to set inactive_timeout for temporary slots because
+ * temporary slots will not be written to the disk.
+ */
+ if (inactive_timeout > 0 && MyReplicationSlot->data.persistency == RS_TEMPORARY)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot set inactive_timeout for a temporary replication slot"));
+
+ lock_acquired = false;
if (MyReplicationSlot->data.failover != failover)
{
SpinLockAcquire(&MyReplicationSlot->mutex);
+ lock_acquired = true;
MyReplicationSlot->data.failover = failover;
+ }
+
+ if (MyReplicationSlot->data.inactive_timeout != inactive_timeout)
+ {
+ if (!lock_acquired)
+ {
+ SpinLockAcquire(&MyReplicationSlot->mutex);
+ lock_acquired = true;
+ }
+
+ MyReplicationSlot->data.inactive_timeout = inactive_timeout;
+ }
+
+ if (lock_acquired)
+ {
SpinLockRelease(&MyReplicationSlot->mutex);
ReplicationSlotMarkDirty();
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index acda5f68d9..ac2ebb0c69 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -389,7 +389,7 @@ WalReceiverMain(char *startup_data, size_t startup_data_len)
"pg_walreceiver_%lld",
(long long int) walrcv_get_backend_pid(wrconn));
- walrcv_create_slot(wrconn, slotname, true, false, false, 0, NULL);
+ walrcv_create_slot(wrconn, slotname, true, false, false, 0, 0, NULL);
SpinLockAcquire(&walrcv->mutex);
strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 5315c08650..0420274247 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1123,13 +1123,15 @@ static void
parseCreateReplSlotOptions(CreateReplicationSlotCmd *cmd,
bool *reserve_wal,
CRSSnapshotAction *snapshot_action,
- bool *two_phase, bool *failover)
+ bool *two_phase, bool *failover,
+ int *inactive_timeout)
{
ListCell *lc;
bool snapshot_action_given = false;
bool reserve_wal_given = false;
bool two_phase_given = false;
bool failover_given = false;
+ bool inactive_timeout_given = false;
/* Parse options */
foreach(lc, cmd->options)
@@ -1188,6 +1190,15 @@ parseCreateReplSlotOptions(CreateReplicationSlotCmd *cmd,
failover_given = true;
*failover = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "inactive_timeout") == 0)
+ {
+ if (inactive_timeout_given)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
+ inactive_timeout_given = true;
+ *inactive_timeout = defGetInt32(defel);
+ }
else
elog(ERROR, "unrecognized option: %s", defel->defname);
}
@@ -1205,6 +1216,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
bool reserve_wal = false;
bool two_phase = false;
bool failover = false;
+ int inactive_timeout = 0;
CRSSnapshotAction snapshot_action = CRS_EXPORT_SNAPSHOT;
DestReceiver *dest;
TupOutputState *tstate;
@@ -1215,13 +1227,13 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
Assert(!MyReplicationSlot);
parseCreateReplSlotOptions(cmd, &reserve_wal, &snapshot_action, &two_phase,
- &failover);
+ &failover, &inactive_timeout);
if (cmd->kind == REPLICATION_KIND_PHYSICAL)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false, 0);
+ false, false, false, inactive_timeout);
if (reserve_wal)
{
@@ -1252,7 +1264,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false, 0);
+ two_phase, failover, false, inactive_timeout);
/*
* Do options check early so that we can bail before calling the
@@ -1411,9 +1423,11 @@ DropReplicationSlot(DropReplicationSlotCmd *cmd)
* Process extra options given to ALTER_REPLICATION_SLOT.
*/
static void
-ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover)
+ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover,
+ int *inactive_timeout)
{
bool failover_given = false;
+ bool inactive_timeout_given = false;
/* Parse options */
foreach_ptr(DefElem, defel, cmd->options)
@@ -1427,6 +1441,15 @@ ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover)
failover_given = true;
*failover = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "inactive_timeout") == 0)
+ {
+ if (inactive_timeout_given)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
+ inactive_timeout_given = true;
+ *inactive_timeout = defGetInt32(defel);
+ }
else
elog(ERROR, "unrecognized option: %s", defel->defname);
}
@@ -1439,9 +1462,10 @@ static void
AlterReplicationSlot(AlterReplicationSlotCmd *cmd)
{
bool failover = false;
+ int inactive_timeout = 0;
- ParseAlterReplSlotOptions(cmd, &failover);
- ReplicationSlotAlter(cmd->slotname, failover);
+ ParseAlterReplSlotOptions(cmd, &failover, &inactive_timeout);
+ ReplicationSlotAlter(cmd->slotname, failover, inactive_timeout);
}
/*
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 3f57166b61..8966188acb 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -246,7 +246,8 @@ extern void ReplicationSlotCreate(const char *name, bool db_specific,
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
-extern void ReplicationSlotAlter(const char *name, bool failover);
+extern void ReplicationSlotAlter(const char *name, bool failover,
+ int inactive_timeout);
extern void ReplicationSlotAcquire(const char *name, bool nowait);
extern void ReplicationSlotRelease(void);
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 12f71fa99b..038812fd24 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -366,6 +366,7 @@ typedef char *(*walrcv_create_slot_fn) (WalReceiverConn *conn,
bool temporary,
bool two_phase,
bool failover,
+ int inactive_timeout,
CRSSnapshotAction snapshot_action,
XLogRecPtr *lsn);
@@ -377,7 +378,7 @@ typedef char *(*walrcv_create_slot_fn) (WalReceiverConn *conn,
*/
typedef void (*walrcv_alter_slot_fn) (WalReceiverConn *conn,
const char *slotname,
- bool failover);
+ bool *failover, int *inactive_timeout);
/*
* walrcv_get_backend_pid_fn
@@ -453,10 +454,10 @@ extern PGDLLIMPORT WalReceiverFunctionsType *WalReceiverFunctions;
WalReceiverFunctions->walrcv_receive(conn, buffer, wait_fd)
#define walrcv_send(conn, buffer, nbytes) \
WalReceiverFunctions->walrcv_send(conn, buffer, nbytes)
-#define walrcv_create_slot(conn, slotname, temporary, two_phase, failover, snapshot_action, lsn) \
- WalReceiverFunctions->walrcv_create_slot(conn, slotname, temporary, two_phase, failover, snapshot_action, lsn)
-#define walrcv_alter_slot(conn, slotname, failover) \
- WalReceiverFunctions->walrcv_alter_slot(conn, slotname, failover)
+#define walrcv_create_slot(conn, slotname, temporary, two_phase, failover, inactive_timeout, snapshot_action, lsn) \
+ WalReceiverFunctions->walrcv_create_slot(conn, slotname, temporary, two_phase, failover, inactive_timeout, snapshot_action, lsn)
+#define walrcv_alter_slot(conn, slotname, failover, inactive_timeout) \
+ WalReceiverFunctions->walrcv_alter_slot(conn, slotname, failover, inactive_timeout)
#define walrcv_get_backend_pid(conn) \
WalReceiverFunctions->walrcv_get_backend_pid(conn)
#define walrcv_exec(conn, exec, nRetTypes, retTypes) \
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 5311ade509..db00b6aa24 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -604,4 +604,54 @@ ok( pump_until(
'base backup cleanly canceled');
$sigchld_bb->finish();
+# Drop any existing slots on the primary, for the follow-up tests.
+$node_primary->safe_psql('postgres',
+ "SELECT pg_drop_replication_slot(slot_name) FROM pg_replication_slots;");
+
+# Test setting inactive_timeout option via replication commands.
+$node_primary->append_conf(
+ 'postgresql.conf', qq(
+wal_level = logical
+));
+$node_primary->restart;
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_phy_slot1 PHYSICAL (RESERVE_WAL, INACTIVE_TIMEOUT 100);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_phy_slot2 PHYSICAL (RESERVE_WAL);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "ALTER_REPLICATION_SLOT it_phy_slot2 (INACTIVE_TIMEOUT 200);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_log_slot1 LOGICAL pgoutput (TWO_PHASE, INACTIVE_TIMEOUT 300);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_log_slot2 LOGICAL pgoutput;",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "ALTER_REPLICATION_SLOT it_log_slot2 (INACTIVE_TIMEOUT 400);",
+ extra_params => [ '-d', $connstr_db ]);
+
+my $slot_info_expected = 'it_log_slot1|logical|300
+it_log_slot2|logical|400
+it_phy_slot1|physical|100
+it_phy_slot2|physical|0';
+
+my $slot_info = $node_primary->safe_psql('postgres',
+ qq[SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;]);
+is($slot_info, $slot_info_expected, "replication slots with inactive_timeout on primary exist");
+
done_testing();
--
2.34.1
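Similarly, a minimal sketch of the replication-command syntax added by the patch
above, run over a replication connection (for example
psql "dbname=postgres replication=database"); slot names and values are
illustrative only:

-- create slots with an inactive timeout directly via the walsender grammar
CREATE_REPLICATION_SLOT demo_phy_slot PHYSICAL (RESERVE_WAL, INACTIVE_TIMEOUT 300);
CREATE_REPLICATION_SLOT demo_log_slot LOGICAL pgoutput (INACTIVE_TIMEOUT 300);

-- adjust the timeout on an existing slot
ALTER_REPLICATION_SLOT demo_phy_slot (INACTIVE_TIMEOUT 600);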
v12-0005-Add-inactive_timeout-option-to-subscriptions.patchapplication/octet-stream; name=v12-0005-Add-inactive_timeout-option-to-subscriptions.patchDownload
From c2cf2955adeb3de4460ab6718395db23fd868db3 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 19 Mar 2024 17:34:58 +0000
Subject: [PATCH v12 5/7] Add inactive_timeout option to subscriptions
---
doc/src/sgml/catalogs.sgml | 11 ++
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 12 ++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 3 +-
src/backend/commands/subscriptioncmds.c | 89 ++++++++++-
src/backend/replication/logical/tablesync.c | 2 +-
src/bin/pg_dump/pg_dump.c | 22 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/pg_upgrade/t/003_logical_slots.pl | 6 +-
src/bin/psql/describe.c | 7 +-
src/bin/psql/tab-complete.c | 14 +-
src/include/catalog/pg_subscription.h | 9 ++
src/test/regress/expected/subscription.out | 154 ++++++++++----------
src/test/regress/sql/subscription.sql | 1 +
15 files changed, 240 insertions(+), 97 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2f091ad09d..e64e8cef7d 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8025,6 +8025,17 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subinactivetimeout</structfield> <type>int4</type>
+ </para>
+ <para>
+ When set to a non-zero value, specifies the amount of time in seconds
+ the associated replication slots (i.e. the main slot and the table
+ sync slots) in the upstream database are allowed to be inactive.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 413ce68ce2..d02d6232de 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -227,8 +227,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<link linkend="sql-createsubscription-params-with-disable-on-error"><literal>disable_on_error</literal></link>,
<link linkend="sql-createsubscription-params-with-password-required"><literal>password_required</literal></link>,
<link linkend="sql-createsubscription-params-with-run-as-owner"><literal>run_as_owner</literal></link>,
- <link linkend="sql-createsubscription-params-with-origin"><literal>origin</literal></link>, and
- <link linkend="sql-createsubscription-params-with-failover"><literal>failover</literal></link>.
+ <link linkend="sql-createsubscription-params-with-origin"><literal>origin</literal></link>,
+ <link linkend="sql-createsubscription-params-with-failover"><literal>failover</literal></link>, and
+ <link linkend="sql-createsubscription-params-with-inactive-timeout"><literal>inactive_timeout</literal></link>.
Only a superuser can set <literal>password_required = false</literal>.
</para>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 15794731bb..7be4610921 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -414,6 +414,18 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
+
+ <varlistentry id="sql-createsubscription-params-with-inactive-timeout">
+ <term><literal>inactive_timeout</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ When set to a non-zero value, specifies the amount of time in seconds
+ the associated replication slots (i.e. the main slot and the table
+ sync slots) in the upstream database are allowed to be inactive.
+ The default is <literal>0</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist></para>
</listitem>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 9efc9159f2..f874146e72 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -72,6 +72,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->passwordrequired = subform->subpasswordrequired;
sub->runasowner = subform->subrunasowner;
sub->failover = subform->subfailover;
+ sub->inactivetimeout = subform->subinactivetimeout;
/* Get conninfo */
datum = SysCacheGetAttrNotNull(SUBSCRIPTIONOID,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a43048ae93..6005315ce3 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1363,7 +1363,8 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
subbinary, substream, subtwophasestate, subdisableonerr,
subpasswordrequired, subrunasowner, subfailover,
- subslotname, subsynccommit, subpublications, suborigin)
+ subinactivetimeout, subslotname, subsynccommit,
+ subpublications, suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 4562de49c4..3a75ef1aac 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -70,8 +70,9 @@
#define SUBOPT_PASSWORD_REQUIRED 0x00000800
#define SUBOPT_RUN_AS_OWNER 0x00001000
#define SUBOPT_FAILOVER 0x00002000
-#define SUBOPT_LSN 0x00004000
-#define SUBOPT_ORIGIN 0x00008000
+#define SUBOPT_INACTIVE_TIMEOUT 0x00004000
+#define SUBOPT_LSN 0x00008000
+#define SUBOPT_ORIGIN 0x00010000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -97,6 +98,7 @@ typedef struct SubOpts
bool passwordrequired;
bool runasowner;
bool failover;
+ int inactivetimeout;
char *origin;
XLogRecPtr lsn;
} SubOpts;
@@ -159,6 +161,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->runasowner = false;
if (IsSet(supported_opts, SUBOPT_FAILOVER))
opts->failover = false;
+ if (IsSet(supported_opts, SUBOPT_INACTIVE_TIMEOUT))
+ opts->inactivetimeout = 0;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
@@ -316,6 +320,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_FAILOVER;
opts->failover = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_INACTIVE_TIMEOUT) &&
+ strcmp(defel->defname, "inactive_timeout") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_INACTIVE_TIMEOUT))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_INACTIVE_TIMEOUT;
+ opts->inactivetimeout = defGetInt32(defel);
+ }
else if (IsSet(supported_opts, SUBOPT_ORIGIN) &&
strcmp(defel->defname, "origin") == 0)
{
@@ -453,6 +466,17 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
"slot_name = NONE", "create_slot = false")));
}
}
+
+ if (opts->inactivetimeout > 0 &&
+ IsSet(opts->specified_opts, SUBOPT_INACTIVE_TIMEOUT) &&
+ !opts->create_slot)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ /*- translator: %s is string of the form "option = value" */
+ errmsg("subscription with inactive_timeout = %d must also set %s",
+ opts->inactivetimeout, "create_slot = true")));
+ }
}
/*
@@ -610,7 +634,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
SUBOPT_DISABLE_ON_ERR | SUBOPT_PASSWORD_REQUIRED |
- SUBOPT_RUN_AS_OWNER | SUBOPT_FAILOVER | SUBOPT_ORIGIN);
+ SUBOPT_RUN_AS_OWNER | SUBOPT_FAILOVER |
+ SUBOPT_INACTIVE_TIMEOUT | SUBOPT_ORIGIN);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -717,6 +742,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_subpasswordrequired - 1] = BoolGetDatum(opts.passwordrequired);
values[Anum_pg_subscription_subrunasowner - 1] = BoolGetDatum(opts.runasowner);
values[Anum_pg_subscription_subfailover - 1] = BoolGetDatum(opts.failover);
+ values[Anum_pg_subscription_subinactivetimeout - 1] = Int32GetDatum(opts.inactivetimeout);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -827,7 +853,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
twophase_enabled = true;
walrcv_create_slot(wrconn, opts.slot_name, false, twophase_enabled,
- opts.failover, 0, CRS_NOEXPORT_SNAPSHOT, NULL);
+ opts.failover, opts.inactivetimeout,
+ CRS_NOEXPORT_SNAPSHOT, NULL);
if (twophase_enabled)
UpdateTwoPhaseState(subid, LOGICALREP_TWOPHASE_STATE_ENABLED);
@@ -851,6 +878,16 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
{
walrcv_alter_slot(wrconn, opts.slot_name, &opts.failover, NULL);
}
+
+ /*
+ * Do the same for the inactive_timeout property as for the failover
+ * property above.
+ */
+ else if (opts.slot_name &&
+ (opts.inactivetimeout > 0 || walrcv_server_version(wrconn) >= 170000))
+ {
+ walrcv_alter_slot(wrconn, opts.slot_name, NULL, &opts.inactivetimeout);
+ }
}
PG_FINALLY();
{
@@ -1168,7 +1205,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
SUBOPT_PASSWORD_REQUIRED |
SUBOPT_RUN_AS_OWNER | SUBOPT_FAILOVER |
- SUBOPT_ORIGIN);
+ SUBOPT_INACTIVE_TIMEOUT | SUBOPT_ORIGIN);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1272,6 +1309,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_subfailover - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_INACTIVE_TIMEOUT))
+ {
+ if (!sub->slotname)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for a subscription that does not have a slot name",
+ "inactive_timeout")));
+
+ values[Anum_pg_subscription_subinactivetimeout - 1] =
+ Int32GetDatum(opts.inactivetimeout);
+ replaces[Anum_pg_subscription_subinactivetimeout - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -1550,6 +1600,35 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
PG_END_TRY();
}
+ if (replaces[Anum_pg_subscription_subinactivetimeout - 1])
+ {
+ bool must_use_password;
+ char *err;
+ WalReceiverConn *wrconn;
+
+ /* Load the library providing us libpq calls. */
+ load_file("libpqwalreceiver", false);
+
+ /* Try to connect to the publisher. */
+ must_use_password = sub->passwordrequired && !sub->ownersuperuser;
+ wrconn = walrcv_connect(sub->conninfo, true, true, must_use_password,
+ sub->name, &err);
+ if (!wrconn)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONNECTION_FAILURE),
+ errmsg("could not connect to the publisher: %s", err)));
+
+ PG_TRY();
+ {
+ walrcv_alter_slot(wrconn, sub->slotname, NULL, &opts.inactivetimeout);
+ }
+ PG_FINALLY();
+ {
+ walrcv_disconnect(wrconn);
+ }
+ PG_END_TRY();
+ }
+
table_close(rel, RowExclusiveLock);
ObjectAddressSet(myself, SubscriptionRelationId, subid);
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 59f8e5fbaa..c660f1e65e 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1431,7 +1431,7 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
walrcv_create_slot(LogRepWorkerWalRcvConn,
slotname, false /* permanent */ , false /* two_phase */ ,
MySubscription->failover,
- 0,
+ MySubscription->inactivetimeout,
CRS_USE_SNAPSHOT, origin_startpos);
/*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index a5149ca823..12b462c9a7 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4653,6 +4653,7 @@ getSubscriptions(Archive *fout)
int i_suboriginremotelsn;
int i_subenabled;
int i_subfailover;
+ int i_subinactivetimeout;
int i,
ntups;
@@ -4719,11 +4720,13 @@ getSubscriptions(Archive *fout)
if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn,\n"
" s.subenabled,\n"
- " s.subfailover\n");
+ " s.subfailover,\n"
+ " s.subinactivetimeout\n");
else
appendPQExpBufferStr(query, " NULL AS suboriginremotelsn,\n"
" false AS subenabled,\n"
- " false AS subfailover\n");
+ " false AS subfailover,\n"
+ " 0 AS subinactivetimeout\n");
appendPQExpBufferStr(query,
"FROM pg_subscription s\n");
@@ -4763,6 +4766,7 @@ getSubscriptions(Archive *fout)
i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
i_subenabled = PQfnumber(res, "subenabled");
i_subfailover = PQfnumber(res, "subfailover");
+ i_subinactivetimeout = PQfnumber(res, "subinactivetimeout");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4809,6 +4813,8 @@ getSubscriptions(Archive *fout)
pg_strdup(PQgetvalue(res, i, i_subenabled));
subinfo[i].subfailover =
pg_strdup(PQgetvalue(res, i, i_subfailover));
+ subinfo[i].subinactivetimeout =
+ atoi(PQgetvalue(res, i, i_subinactivetimeout));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -5090,6 +5096,18 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s SET(failover = true);\n", qsubname);
}
+ if (subinfo->subinactivetimeout > 0)
+ {
+ /*
+ * Preserve subscription's inactive_timeout option to be able to
+ * use it after the upgrade.
+ */
+ appendPQExpBufferStr(query,
+ "\n-- For binary upgrade, must preserve the subscriber's inactive_timeout option.\n");
+ appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s SET(inactive_timeout = %d);\n",
+ qsubname, subinfo->subinactivetimeout);
+ }
+
if (strcmp(subinfo->subenabled, "t") == 0)
{
/*
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 9bc93520b4..bfaedcd7e2 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -667,6 +667,7 @@ typedef struct _SubscriptionInfo
char *suborigin;
char *suboriginremotelsn;
char *subfailover;
+ int subinactivetimeout;
} SubscriptionInfo;
/*
diff --git a/src/bin/pg_upgrade/t/003_logical_slots.pl b/src/bin/pg_upgrade/t/003_logical_slots.pl
index 83d71c3084..8aa34d66cc 100644
--- a/src/bin/pg_upgrade/t/003_logical_slots.pl
+++ b/src/bin/pg_upgrade/t/003_logical_slots.pl
@@ -172,7 +172,7 @@ $sub->start;
$sub->safe_psql(
'postgres', qq[
CREATE TABLE tbl (a int);
- CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (two_phase = 'true', failover = 'true')
+ CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (two_phase = 'true', failover = 'true', inactive_timeout = 3600)
]);
$sub->wait_for_subscription_sync($oldpub, 'regress_sub');
@@ -192,8 +192,8 @@ command_ok([@pg_upgrade_cmd], 'run of pg_upgrade of old cluster');
# Check that the slot 'regress_sub' has migrated to the new cluster
$newpub->start;
my $result = $newpub->safe_psql('postgres',
- "SELECT slot_name, two_phase, failover FROM pg_replication_slots");
-is($result, qq(regress_sub|t|t), 'check the slot exists on new cluster');
+ "SELECT slot_name, two_phase, failover, inactive_timeout = 3600 FROM pg_replication_slots");
+is($result, qq(regress_sub|t|t|t), 'check the slot exists on new cluster');
# Update the connection
my $new_connstr = $newpub->connstr . ' dbname=postgres';
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 6433497bcd..73fcfa421d 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6581,7 +6581,7 @@ describeSubscriptions(const char *pattern, bool verbose)
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
false, false, false, false, false, false, false, false, false, false,
- false};
+ false, false};
if (pset.sversion < 100000)
{
@@ -6650,6 +6650,11 @@ describeSubscriptions(const char *pattern, bool verbose)
", subfailover AS \"%s\"\n",
gettext_noop("Failover"));
+ if (pset.sversion >= 170000)
+ appendPQExpBuffer(&buf,
+ ", subinactivetimeout AS \"%s\"\n",
+ gettext_noop("Inactive timeout"));
+
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
", subconninfo AS \"%s\"\n",
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 56d723de8a..bf7349bae1 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1946,9 +1946,10 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "failover", "origin",
- "password_required", "run_as_owner", "slot_name",
- "streaming", "synchronous_commit");
+ COMPLETE_WITH("binary", "disable_on_error", "failover",
+ "inactive_timeout", "origin", "password_required",
+ "run_as_owner", "slot_name", "streaming",
+ "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
COMPLETE_WITH("lsn");
@@ -3344,9 +3345,10 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "failover", "origin",
- "password_required", "run_as_owner", "slot_name",
- "streaming", "synchronous_commit", "two_phase");
+ "disable_on_error", "enabled", "failover",
+ "inactive_timeout", "origin", "password_required",
+ "run_as_owner", "slot_name", "streaming",
+ "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 0aa14ec4a2..1113cdf690 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -98,6 +98,11 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
* slots) in the upstream database are enabled
* to be synchronized to the standbys. */
+ int32 subinactivetimeout; /* Associated replication slots (i.e. the
+ * main slot and the table sync slots) in
+ * the upstream database are allowed to be
+ * inactive for this amount of time. */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -151,6 +156,10 @@ typedef struct Subscription
* (i.e. the main slot and the table sync
* slots) in the upstream database are enabled
* to be synchronized to the standbys. */
+ int32 inactivetimeout; /* Associated replication slots (i.e. the
+ * main slot and the table sync slots) in
+ * the upstream database are allowed to be
+ * inactive for this amount of time. */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 1eee6b17b8..83c96c764a 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -101,6 +101,8 @@ CREATE SUBSCRIPTION regress_testsub2 CONNECTION 'dbname=regress_doesnotexist' PU
ERROR: subscription with slot_name = NONE must also set create_slot = false
CREATE SUBSCRIPTION regress_testsub2 CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (slot_name = NONE, create_slot = false);
ERROR: subscription with slot_name = NONE must also set enabled = false
+CREATE SUBSCRIPTION regress_testsub2 CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (create_slot = false, inactive_timeout = 3600);
+ERROR: subscription with inactive_timeout = 3600 must also set create_slot = true
-- ok - with slot_name = NONE
CREATE SUBSCRIPTION regress_testsub3 CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (slot_name = NONE, connect = false);
WARNING: subscription was created, but is not connected
@@ -118,18 +120,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -147,10 +149,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -159,10 +161,10 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = 'newname');
ALTER SUBSCRIPTION regress_testsub SET (password_required = false);
ALTER SUBSCRIPTION regress_testsub SET (run_as_owner = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | t | f | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | t | f | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (password_required = true);
@@ -178,10 +180,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -190,10 +192,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -225,10 +227,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | f | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | f | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -257,19 +259,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -281,27 +283,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -316,10 +318,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -334,10 +336,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -373,10 +375,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -385,10 +387,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -398,10 +400,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -414,18 +416,18 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql
index 1b2a23ba7b..ba901800f8 100644
--- a/src/test/regress/sql/subscription.sql
+++ b/src/test/regress/sql/subscription.sql
@@ -60,6 +60,7 @@ CREATE SUBSCRIPTION regress_testsub2 CONNECTION 'dbname=regress_doesnotexist' PU
CREATE SUBSCRIPTION regress_testsub2 CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (slot_name = NONE);
CREATE SUBSCRIPTION regress_testsub2 CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (slot_name = NONE, enabled = false);
CREATE SUBSCRIPTION regress_testsub2 CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (slot_name = NONE, create_slot = false);
+CREATE SUBSCRIPTION regress_testsub2 CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (create_slot = false, inactive_timeout = 3600);
-- ok - with slot_name = NONE
CREATE SUBSCRIPTION regress_testsub3 CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (slot_name = NONE, connect = false);
--
2.34.1
v12-0006-Add-inactive_timeout-based-replication-slot-inva.patch (application/octet-stream)
From 05f22dba1351b32422832b648af3af1c37d9edcf Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 19 Mar 2024 18:38:18 +0000
Subject: [PATCH v12 6/7] Add inactive_timeout based replication slot
invalidation
Up until now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via the
max_slot_wal_keep_size GUC) that would be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and the storage they have allocated vary
greatly in production, making it difficult to pin down a
one-size-fits-all value. It is often easier for developers to set
a timeout of, say, 1, 2 or 3 days at the slot level, after which
the inactive slots get invalidated.
To achieve this, postgres uses the replication slot metric
last_inactive_at (the time at which the slot became inactive) and
a new slot-level parameter inactive_timeout. The checkpointer then
scans all replication slots and invalidates those that have been
inactive for longer than the configured timeout.
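As a rough usage sketch (the slot name and the one-hour value here
are only examples), a physical slot with a timeout can be created
and later inspected like this:
    SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot',
                                                inactive_timeout := 3600);
    -- after invalidation by the checkpointer, invalidation_reason
    -- shows 'inactive_timeout':
    SELECT slot_name, last_inactive_at, invalidation_reason
    FROM pg_replication_slots WHERE slot_name = 'sb1_slot';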
---
doc/src/sgml/func.sgml | 12 +-
doc/src/sgml/ref/create_subscription.sgml | 4 +-
doc/src/sgml/system-views.sgml | 10 +-
src/backend/access/transam/xlog.c | 8 +
src/backend/replication/slot.c | 22 ++-
src/include/replication/slot.h | 2 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 168 ++++++++++++++++++++
8 files changed, 217 insertions(+), 10 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 0ece7c8d3d..da316f345d 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28168,8 +28168,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
released upon any error. The optional fourth
parameter, <parameter>inactive_timeout</parameter>, when set to a
non-zero value, specifies the amount of time in seconds the slot is
- allowed to be inactive. This function corresponds to the replication
- protocol command
+ allowed to be inactive before getting invalidated.
+ This function corresponds to the replication protocol command
<literal>CREATE_REPLICATION_SLOT ... PHYSICAL</literal>.
</para></entry>
</row>
@@ -28214,12 +28214,12 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<parameter>failover</parameter>, when set to true,
specifies that this slot is enabled to be synced to the
standbys so that logical replication can be resumed after
- failover. The optional sixth parameter,
+ failover. The optional sixth parameter,
<parameter>inactive_timeout</parameter>, when set to a
non-zero value, specifies the amount of time in seconds the slot is
- allowed to be inactive. A call to this function has the same effect as
- the replication protocol command
- <literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
+ allowed to be inactive before getting invalidated.
+ A call to this function has the same effect as the replication protocol
+ command <literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
</para></entry>
</row>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 7be4610921..472592c750 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -421,8 +421,8 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
When set to a non-zero value, specifies the amount of time in seconds
the associated replication slots (i.e. the main slot and the table
- sync slots) in the upstream database are allowed to be inactive.
- The default is <literal>0</literal>.
+ sync slots) in the upstream database are allowed to be inactive before
+ getting invalidated. The default is <literal>0</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index f413c819de..9821c6f77a 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2588,6 +2588,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for the duration specified by the slot's
+ <literal>inactive_timeout</literal> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
@@ -2767,7 +2774,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<structfield>inactive_timeout</structfield> <type>integer</type>
</para>
<para>
- The amount of time in seconds the slot is allowed to be inactive.
+ The amount of time in seconds the slot is allowed to be inactive before
+ getting invalidated.
</para></entry>
</row>
</tbody>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 20a5f86209..ea4ece22de 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7168,6 +7168,10 @@ CreateCheckPoint(int flags)
RemoveOldXlogFiles(_logSegNo, RedoRecPtr, recptr,
checkPoint.ThisTimeLineID);
+ /* Invalidate inactive replication slots based on timeout */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Make more log segments if needed. (Do this after recycling old log
* segments, since that may supply some of the needed files.)
@@ -7635,6 +7639,10 @@ CreateRestartPoint(int flags)
RemoveOldXlogFiles(_logSegNo, RedoRecPtr, endptr, replayTLI);
+ /* Invalidate inactive replication slots based on timeout */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Make more log segments if needed. (Do this after recycling old log
* segments, since that may supply some of the needed files.)
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 35a186f4bc..26121e939d 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -1550,6 +1551,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by slot's inactive_timeout parameter."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1666,6 +1670,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
conflict = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (s->data.last_inactive_at > 0 &&
+ s->data.inactive_timeout > 0)
+ {
+ TimestampTz now;
+
+ Assert(s->data.persistency == RS_PERSISTENT);
+ Assert(s->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(s->data.last_inactive_at, now,
+ s->data.inactive_timeout * 1000))
+ conflict = cause;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1819,6 +1838,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 8966188acb..2efb9490c1 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..d046e1d5d7
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,168 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot due to inactive_timeout
+#
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoints during the test, otherwise the test can become unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+$standby1->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+});
+
+# Set the timeout so that the slot, once inactive, gets invalidated after
+# the timeout.
+my $inactive_timeout = 5;
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot', inactive_timeout := $inactive_timeout);
+]);
+
+$standby1->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# The inactive replication slot info should be null when the slot is active
+my $result = $primary->safe_psql(
+ 'postgres', qq[
+ SELECT last_inactive_at IS NULL, inactive_timeout = $inactive_timeout
+ FROM pg_replication_slots WHERE slot_name = 'sb1_slot';
+]);
+is($result, "t|t",
+ 'check the inactive replication slot info for an active slot');
+
+my $logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby1->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_at IS NOT NULL
+ AND slot_name = 'sb1_slot'
+ AND inactive_timeout = $inactive_timeout;
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+my $invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb1_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb1_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for inactive replication slot sb1_slot to be invalidated";
+
+# Testcase end: Invalidate streaming standby's slot due to inactive_timeout
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to inactive_timeout
+my $publisher = $primary;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$subscriber->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', inactive_timeout = $inactive_timeout)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+$result = $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the inactive replication slot info to be updated
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_at IS NOT NULL
+ AND slot_name = 'lsub1_slot'
+ AND inactive_timeout = $inactive_timeout;
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+$invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $publisher->safe_psql('postgres', "CHECKPOINT");
+ if ($publisher->log_contains(
+ 'invalidating obsolete replication slot "lsub1_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot lsub1_slot invalidation has been logged');
+
+# Testcase end: Invalidate logical subscriber's slot due to inactive_timeout
+# =============================================================================
+
+done_testing();
--
2.34.1
v12-0007-Add-XID-age-based-replication-slot-invalidation.patch (application/octet-stream)
From 53960949d0d49560d9af7e14fc183e8bbf1a1b28 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 19 Mar 2024 18:38:52 +0000
Subject: [PATCH v12 7/7] Add XID age based replication slot invalidation
Up until now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via the
max_slot_wal_keep_size GUC) that would be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and the storage they have allocated vary
greatly in production, making it difficult to pin down a
one-size-fits-all value. It is often easier for developers to set
an XID age (age of the slot's xmin or catalog_xmin) of say 1 or
1.5 billion, after which the slots get invalidated.
To achieve this, postgres uses the replication slot's xmin (the
oldest transaction that the slot needs the database to retain) or
catalog_xmin (the oldest transaction affecting the system catalogs
that the slot needs the database to retain), together with a new
GUC max_slot_xid_age. The checkpointer then scans all replication
slots and invalidates those whose xmin or catalog_xmin has reached
the configured age.
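As a rough usage sketch (the threshold here is only an example),
the GUC can be set and reloaded without a restart, and invalidated
slots can then be spotted via invalidation_reason:
    ALTER SYSTEM SET max_slot_xid_age = 1500000000;
    SELECT pg_reload_conf();
    -- slots invalidated by the checkpointer report the new reason:
    SELECT slot_name, xmin, catalog_xmin, invalidation_reason
    FROM pg_replication_slots WHERE invalidation_reason = 'xid_aged';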
---
doc/src/sgml/config.sgml | 21 +++++
doc/src/sgml/system-views.sgml | 8 ++
src/backend/access/transam/xlog.c | 10 ++
src/backend/replication/slot.c | 49 +++++++++-
src/backend/utils/misc/guc_tables.c | 10 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 92 +++++++++++++++++++
8 files changed, 193 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 65a6e6c408..6dd54ffcb7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4544,6 +4544,27 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age">
+ <term><varname>max_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (the default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9821c6f77a..4400274b04 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2595,6 +2595,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<literal>inactive_timeout</literal> parameter.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>xid_aged</literal> means that the slot's
+ <literal>xmin</literal> or <literal>catalog_xmin</literal>
+ has reached the age specified by
+ <xref linkend="guc-max-slot-xid-age"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index ea4ece22de..3e61617abc 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7172,6 +7172,11 @@ CreateCheckPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Make more log segments if needed. (Do this after recycling old log
* segments, since that may supply some of the needed files.)
@@ -7643,6 +7648,11 @@ CreateRestartPoint(int flags)
InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT, 0,
InvalidOid, InvalidTransactionId);
+ /* Invalidate replication slots based on xmin or catalog_xmin age */
+ if (max_slot_xid_age > 0)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+
/*
* Make more log segments if needed. (Do this after recycling old log
* segments, since that may supply some of the needed files.)
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 26121e939d..8999eb23c9 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -108,10 +108,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
[RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
+ [RS_INVAL_XID_AGE] = "xid_aged",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
+#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int max_slot_xid_age = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -159,6 +161,7 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool IsSlotXIDAged(TransactionId xmin);
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
@@ -1554,6 +1557,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_INACTIVE_TIMEOUT:
appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by slot's inactive_timeout parameter."));
break;
+ case RS_INVAL_XID_AGE:
+ appendStringInfoString(&err_detail, _("The replication slot's xmin or catalog_xmin reached the age specified by max_slot_xid_age."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1570,6 +1576,31 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Returns true if the given xmin/catalog_xmin of a slot has aged beyond
+ * max_slot_xid_age.
+ */
+static bool
+IsSlotXIDAged(TransactionId xmin)
+{
+ TransactionId xid_cur;
+ TransactionId xid_limit;
+
+ if (!TransactionIdIsNormal(xmin))
+ return false;
+
+ xid_cur = ReadNextTransactionId();
+ xid_limit = xmin + max_slot_xid_age;
+
+ if (xid_limit < FirstNormalTransactionId)
+ xid_limit += FirstNormalTransactionId;
+
+ if (TransactionIdFollowsOrEquals(xid_cur, xid_limit))
+ return true;
+
+ return false;
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1685,6 +1716,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
conflict = cause;
}
break;
+ case RS_INVAL_XID_AGE:
+ {
+ if (IsSlotXIDAged(s->data.xmin))
+ {
+ conflict = cause;
+ break;
+ }
+
+ if (IsSlotXIDAged(s->data.catalog_xmin))
+ {
+ conflict = cause;
+ break;
+ }
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1839,6 +1885,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 57d9de4dd9..056533a059 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2943,6 +2943,16 @@ struct config_int ConfigureNamesInt[] =
check_max_slot_wal_keep_size, NULL, NULL
},
+ {
+ {"max_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &max_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"wal_sender_timeout", PGC_USERSET, REPLICATION_SENDING,
gettext_noop("Sets the maximum time to wait for WAL replication."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 2244ee52f7..b4c928b826 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -334,6 +334,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#max_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 2efb9490c1..385c96aa09 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -55,6 +55,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* inactive slot timeout has occurred */
RS_INVAL_INACTIVE_TIMEOUT,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -235,6 +237,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
+extern PGDLLIMPORT int max_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index d046e1d5d7..6eb2dcd5b5 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -165,4 +165,96 @@ ok($invalidated, 'check that slot lsub1_slot invalidation has been logged');
# Testcase end: Invalidate logical subscriber's slot due to inactive_timeout
# =============================================================================
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot due to max_slot_xid_age
+#
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby2->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb2_slot');
+]);
+
+$standby2->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb2_slot';
+]) or die "Timed out waiting for slot xmin to advance";
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET max_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop the standby to let the replication slot's xmin on the primary age
+$standby2->stop;
+
+$logstart = -s $primary->logfile;
+
+# Do some work to advance xmin
+$primary->safe_psql(
+ 'postgres', q{
+do $$
+begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into tab_int values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+end$$;
+});
+
+$invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb2_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb2_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb2_slot' AND
+ invalidation_reason = 'xid_aged';
+])
+ or die
+ "Timed out while waiting for replication slot sb2_slot to be invalidated";
+
+# Testcase end: Invalidate streaming standby's slot due to max_slot_xid_age
+# =============================================================================
+
done_testing();
--
2.34.1
On Tue, Mar 19, 2024 at 6:12 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Tue, Mar 19, 2024 at 04:20:35PM +0530, Amit Kapila wrote:
On Tue, Mar 19, 2024 at 3:11 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Tue, Mar 19, 2024 at 10:56:25AM +0530, Amit Kapila wrote:
On Mon, Mar 18, 2024 at 8:19 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Agree. While it makes sense to invalidate slots for wal removal in
CreateCheckPoint() (because this is the place where wal is removed), I'm not
sure this is the right place for the 2 new cases.
Let's focus on the timeout one as proposed above (as probably the simplest one):
as this one is purely related to time and activity, what about invalidating them
when?:
- their usage resumes
- in pg_get_replication_slots()
The idea is to invalidate the slot when one resumes activity on it or wants to
get information about it (and among other things wants to know if the slot is
valid or not).
Trying to invalidate at those two places makes sense to me but we
still need to cover the cases where it takes very long to resume the
slot activity and the dangling slot cases where the activity is never
resumed.
I understand it's better to have the slot reflecting its real status internally
but is it a real issue if that's not the case until the activity on it is resumed?
(just asking, not saying we should not)
Sorry, I didn't understand your point. Can you try to explain by example?
Sorry if that was not clear, let me try to rephrase it first: what issue do you
see if the invalidation of such a slot occurs only when its usage resumes or
when pg_get_replication_slots() is triggered? I understand that this could lead
to the slot not being invalidated (maybe forever) but is that an issue for an
inactive slot?
It has the risk of preventing WAL and row removal. I think this is the
primary reason we are at the first place planning to have such a
parameter. So, we should have some way to invalidate it even when the
walsender/backend process doesn't use it again.
--
With Regards,
Amit Kapila.
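To make the risk concrete: a dangling slot keeps holding back the xmin horizons until something invalidates it, and with only the existing view columns, spotting such slots is a manual check along these lines (a monitoring sketch, not part of any patch in this thread):

SELECT slot_name, active,
       age(xmin) AS xmin_age,
       age(catalog_xmin) AS catalog_xmin_age
FROM pg_replication_slots
WHERE xmin IS NOT NULL OR catalog_xmin IS NOT NULL
ORDER BY greatest(age(xmin), age(catalog_xmin)) DESC;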
On Wed, Mar 20, 2024 at 12:49 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Following are some open points:
1. Where to do inactive_timeout invalidation exactly if not the checkpointer.
I have suggested to do it at the time of CheckpointReplicationSlots()
and Bertrand suggested to do it whenever we resume using the slot. I
think we should follow both the suggestions.
2. Where to do XID age invalidation exactly if not the checkpointer.
3. How to go about recomputing XID horizons based on max_slot_xid_age.
Does the slot's horizon's need to be adjusted in ComputeXidHorizons()?
I suggest postponing the patch for xid based invalidation for a later
discussion.
4. New invalidation mechanisms interaction with slot sync feature.
Yeah, this is important. My initial thoughts are that synced slots
shouldn't be invalidated on the standby due to timeout.
5. Review comments on 0001 from Bertrand.
Please see the attached v12 patches.
Thanks for quickly updating the patches.
--
With Regards,
Amit Kapila.
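For reference, the XID-age mechanism as posted is driven by the checkpointer (see the CreateCheckPoint/CreateRestartPoint hunks earlier in the thread), and max_slot_xid_age is a sighup GUC, so exercising it by hand looks roughly like what the TAP test above does; the age value here is only illustrative:

ALTER SYSTEM SET max_slot_xid_age = 1500000000;
SELECT pg_reload_conf();
-- the next checkpoint invalidates slots whose xmin/catalog_xmin has aged past the limit
CHECKPOINT;
SELECT slot_name, invalidation_reason FROM pg_replication_slots;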
On Mon, Mar 18, 2024 at 3:42 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hi,
Looking at 0001:
Thanks for reviewing.
1 ===
+ True if this logical slot conflicted with recovery (and so is now
+ invalidated). When this column is true, check
Worth to add back the physical slot mention "Always NULL for physical slots."?
Will change.
2 ===
@@ -1023,9 +1023,10 @@ CREATE VIEW pg_replication_slots AS
 L.wal_status,
 L.safe_wal_size,
 L.two_phase,
- L.conflict_reason,
+ L.conflicting,
 L.failover,
- L.synced
+ L.synced,
+ L.invalidation_reason
What about making invalidation_reason close to conflict_reason?
Not required I think. One can pick the required columns in the SELECT
clause anyways.
3 ===
- * Maps a conflict reason for a replication slot to
+ * Maps a invalidation reason for a replication slot to
s/a invalidation/an invalidation/?
Will change.
4 ===
While at it, shouldn't we also rename "conflict" to say "invalidation_cause" in
InvalidatePossiblyObsoleteSlot()?
That's in line with our understanding about conflict vs invalidation,
and keeps the function generic. Will change.
5 ===
+ * rows_removed and wal_level_insufficient are only two reasons
s/are only two/are the only two/?
Will change.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Wed, Mar 20, 2024 at 08:58:05AM +0530, Amit Kapila wrote:
On Wed, Mar 20, 2024 at 12:49 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Following are some open points:
1. Where to do inactive_timeout invalidation exactly if not the checkpointer.
I have suggested to do it at the time of CheckpointReplicationSlots()
and Bertrand suggested to do it whenever we resume using the slot. I
think we should follow both the suggestions.
Agree. I also think that pg_get_replication_slots() would be a good place, so
that queries would return the right invalidation status.
4. New invalidation mechanisms interaction with slot sync feature.
Yeah, this is important. My initial thoughts are that synced slots
shouldn't be invalidated on the standby due to timeout.
+1
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
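With the 0001 column changes, that check reduces to a straightforward query; a sketch of what monitoring could rely on once the invalidation status is kept up to date:

SELECT slot_name, slot_type, active, conflicting, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason IS NOT NULL;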
Hi,
On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote:
On Mon, Mar 18, 2024 at 3:02 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hm. Are you suggesting inactive_timeout to be a slot level parameter
similar to the 'failover' property added recently by
c393308b69d229b664391ac583b9e07418d411b6 and
73292404370c9900a96e2bebdc7144f7010339cf?
Yeah, I have something like that in mind. You can prepare the patch
but it would be good if others involved in this thread can also share
their opinion.
I think it makes sense to put the inactive_timeout granularity at the slot
level (as the activity could vary a lot say between one slot linked to a
subscription and one linked to some plugins). As far as max_slot_xid_age I've the
feeling that a new GUC is good enough.
Well, here I'm implementing the above idea.
Thanks!
The attached v12 patches
majorly have the following changes:
2. last_inactive_at and inactive_timeout are now tracked in on-disk
replication slot data structure.
Should last_inactive_at be tracked on disk? Say the engine is down for a period
of time > inactive_timeout, then the slot will be invalidated after the engine
restarts (if there is no activity before we invalidate the slot). Should the time the
engine is down be counted as "inactive" time? I've the feeling it should not, and
that we should only take into account inactive time while the engine is up.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
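For comparison, the manual check that the inactive_timeout mechanism is meant to automate would, with last_inactive_at exposed as in 0002, look roughly like this (the one-day threshold is only an example):

SELECT slot_name, last_inactive_at
FROM pg_replication_slots
WHERE NOT active
  AND last_inactive_at < now() - interval '1 day';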
Hi,
On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote:
On Mon, Mar 18, 2024 at 3:02 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hm. Are you suggesting inactive_timeout to be a slot level parameter
similar to the 'failover' property added recently by
c393308b69d229b664391ac583b9e07418d411b6 and
73292404370c9900a96e2bebdc7144f7010339cf?
Yeah, I have something like that in mind. You can prepare the patch
but it would be good if others involved in this thread can also share
their opinion.
I think it makes sense to put the inactive_timeout granularity at the slot
level (as the activity could vary a lot say between one slot linked to a
subscription and one linked to some plugins). As far as max_slot_xid_age I've the
feeling that a new GUC is good enough.
Well, here I'm implementing the above idea. The attached v12 patches
majorly have the following changes:
Regarding v12-0004: "Allow setting inactive_timeout in the replication command",
shouldn't we also add a new SQL API, say pg_alter_replication_slot(), that would
allow changing the timeout property?
That would allow users to alter this property without the need to make a
replication connection.
But the issue is that it would make it inconsistent with the new inactivetimeout
in the subscription that is added in "v12-0005". But do we need to display
subinactivetimeout in pg_subscription (and even allow it at subscription creation
/ alter) after all? (I've the feeling there is less such a need as compared to
subfailover, subtwophasestate for example).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
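A sketch of what such an API could look like; the function name, signature, and behaviour here are purely a suggestion at this point, and nothing in the attached patches implements it:

SELECT pg_alter_replication_slot(slot_name := 'lsub1_slot', inactive_timeout := 86400);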
On Wed, Mar 20, 2024 at 1:04 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Wed, Mar 20, 2024 at 08:58:05AM +0530, Amit Kapila wrote:
On Wed, Mar 20, 2024 at 12:49 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Following are some open points:
1. Where to do inactive_timeout invalidation exactly if not the checkpointer.
I have suggested to do it at the time of CheckpointReplicationSlots()
and Bertrand suggested to do it whenever we resume using the slot. I
think we should follow both the suggestions.
Agree. I also think that pg_get_replication_slots() would be a good place, so
that queries would return the right invalidation status.
I've addressed review comments and attaching the v13 patches with the
following changes:
1. Invalidate replication slot due to inactive_timeout:
1.1 In CheckpointReplicationSlots() to help with automatic invalidation.
1.2 In pg_get_replication_slots to help readers see the latest slot information.
1.3 In ReplicationSlotAcquire for walsenders as typically walsenders
are the ones that use slots for longer durations for streaming
standbys and logical subscribers.
1.4 In ReplicationSlotAcquire when called from
pg_logical_slot_get_changes_guts to help with logical decoding clients
to disallow decoding from invalidated slots.
1.5 In ReplicationSlotAcquire when called from
pg_replication_slot_advance to help with disallowing advancing
invalidated slots.
2. Have a new input parameter bool check_for_invalidation for
ReplicationSlotAcquire(). When true, check for the inactive_timeout
invalidation, if invalidated, error out.
3. Have a new function to just do inactive_timeout invalidation.
4. Do not update last_inactive_at for failover slots on standby to not
invalidate failover slots on the standby.
5. In ReplicationSlotAcquire(), invalidate the slot before making it active.
6. Make last_inactive_at a shared-memory parameter as opposed to an
on-disk parameter to help not count the server downtime for inactive
time.
7. Let the failover slot on standby and pg_upgraded slots get
inactive_timeout parameter from the primary and old cluster
respectively.
Please see the attached v13 patches.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
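With the attached v13-0003 applied, the per-slot timeout is supplied at creation time through the SQL functions, for example (slot names and values here are only illustrative; see the contrib/test_decoding changes below for the full set of cases):

SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'sb_slot', immediately_reserve := true, inactive_timeout := 300);
SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'sub_slot', plugin := 'test_decoding', inactive_timeout := 600);
SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;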
Attachments:
v13-0001-Track-invalidation_reason-in-pg_replication_slot.patchapplication/octet-stream; name=v13-0001-Track-invalidation_reason-in-pg_replication_slot.patchDownload
From 2750c2fd767579cab9a4295d9aca7ef0dd42b9b3 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 20 Mar 2024 22:37:15 +0000
Subject: [PATCH v13 1/6] Track invalidation_reason in pg_replication_slots
Up until now, the reason for replication slot invalidation has not
been tracked in pg_replication_slots. A recent commit 007693f2a added
conflict_reason to show the reasons for slot invalidation, but
only for logical slots.
This commit adds a new column to show invalidation reasons for
both physical and logical slots. It also turns the conflict_reason
text column into a conflicting boolean column (effectively reverting
commit 007693f2a). One can now look at the new invalidation_reason
column for the reason a logical slot conflicts with recovery.
---
doc/src/sgml/ref/pgupgrade.sgml | 4 +-
doc/src/sgml/system-views.sgml | 63 +++++++++++--------
src/backend/catalog/system_views.sql | 5 +-
src/backend/replication/logical/slotsync.c | 2 +-
src/backend/replication/slot.c | 49 +++++++--------
src/backend/replication/slotfuncs.c | 25 +++++---
src/bin/pg_upgrade/info.c | 4 +-
src/include/catalog/pg_proc.dat | 6 +-
src/include/replication/slot.h | 2 +-
.../t/035_standby_logical_decoding.pl | 39 ++++++------
.../t/040_standby_failover_slots_sync.pl | 4 +-
src/test/regress/expected/rules.out | 7 ++-
12 files changed, 116 insertions(+), 94 deletions(-)
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 58c6c2df8b..8de52bf752 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -453,8 +453,8 @@ make prefix=/usr/local/pgsql.new install
<para>
All slots on the old cluster must be usable, i.e., there are no slots
whose
- <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>conflict_reason</structfield>
- is not <literal>NULL</literal>.
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>conflicting</structfield>
+ is not <literal>true</literal>.
</para>
</listitem>
<listitem>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be90edd0e2..95355743ca 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,34 +2525,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>conflict_reason</structfield> <type>text</type>
+ <structfield>conflicting</structfield> <type>bool</type>
</para>
<para>
- The reason for the logical slot's conflict with recovery. It is always
- NULL for physical slots, as well as for logical slots which are not
- invalidated. The non-NULL values indicate that the slot is marked
- as invalidated. Possible values are:
- <itemizedlist spacing="compact">
- <listitem>
- <para>
- <literal>wal_removed</literal> means that the required WAL has been
- removed.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>rows_removed</literal> means that the required rows have
- been removed.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>wal_level_insufficient</literal> means that the
- primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
- perform logical decoding.
- </para>
- </listitem>
- </itemizedlist>
+ True if this logical slot conflicted with recovery (and so is now
+ invalidated). When this column is true, check
+ <structfield>invalidation_reason</structfield> column for the conflict
+ reason. Always NULL for physical slots.
</para></entry>
</row>
@@ -2581,6 +2560,38 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>invalidation_reason</structfield> <type>text</type>
+ </para>
+ <para>
+ The reason for the slot's invalidation. It is set for both logical and
+ physical slots. <literal>NULL</literal> if the slot is not invalidated.
+ Possible values are:
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ <literal>wal_removed</literal> means that the required WAL has been
+ removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>rows_removed</literal> means that the required rows have
+ been removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>wal_level_insufficient</literal> means that the
+ primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
+ perform logical decoding.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 04227a72d1..cd22dad959 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,9 +1023,10 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.conflict_reason,
+ L.conflicting,
L.failover,
- L.synced
+ L.synced,
+ L.invalidation_reason
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 7b180bdb5c..30480960c5 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -663,7 +663,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, conflict_reason"
+ " database, invalidation_reason"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 91ca397857..cdf0c450c5 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -1525,14 +1525,14 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
XLogRecPtr initial_effective_xmin = InvalidXLogRecPtr;
XLogRecPtr initial_catalog_effective_xmin = InvalidXLogRecPtr;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
- ReplicationSlotInvalidationCause conflict_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
for (;;)
{
XLogRecPtr restart_lsn;
NameData slotname;
int active_pid = 0;
- ReplicationSlotInvalidationCause conflict = RS_INVAL_NONE;
+ ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1554,17 +1554,14 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
restart_lsn = s->data.restart_lsn;
- /*
- * If the slot is already invalid or is a non conflicting slot, we
- * don't need to do anything.
- */
+ /* we do nothing if the slot is already invalid */
if (s->data.invalidated == RS_INVAL_NONE)
{
/*
* The slot's mutex will be released soon, and it is possible that
* those values change since the process holding the slot has been
* terminated (if any), so record them here to ensure that we
- * would report the correct conflict cause.
+ * would report the correct invalidation cause.
*/
if (!terminated)
{
@@ -1578,7 +1575,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_REMOVED:
if (initial_restart_lsn != InvalidXLogRecPtr &&
initial_restart_lsn < oldestLSN)
- conflict = cause;
+ invalidation_cause = cause;
break;
case RS_INVAL_HORIZON:
if (!SlotIsLogical(s))
@@ -1589,15 +1586,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (TransactionIdIsValid(initial_effective_xmin) &&
TransactionIdPrecedesOrEquals(initial_effective_xmin,
snapshotConflictHorizon))
- conflict = cause;
+ invalidation_cause = cause;
else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
snapshotConflictHorizon))
- conflict = cause;
+ invalidation_cause = cause;
break;
case RS_INVAL_WAL_LEVEL:
if (SlotIsLogical(s))
- conflict = cause;
+ invalidation_cause = cause;
break;
case RS_INVAL_NONE:
pg_unreachable();
@@ -1605,14 +1602,14 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
}
/*
- * The conflict cause recorded previously should not change while the
- * process owning the slot (if any) has been terminated.
+ * The invalidation cause recorded previously should not change while
+ * the process owning the slot (if any) has been terminated.
*/
- Assert(!(conflict_prev != RS_INVAL_NONE && terminated &&
- conflict_prev != conflict));
+ Assert(!(invalidation_cause_prev != RS_INVAL_NONE && terminated &&
+ invalidation_cause_prev != invalidation_cause));
- /* if there's no conflict, we're done */
- if (conflict == RS_INVAL_NONE)
+ /* if there's no invalidation, we're done */
+ if (invalidation_cause == RS_INVAL_NONE)
{
SpinLockRelease(&s->mutex);
if (released_lock)
@@ -1632,13 +1629,13 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
- s->data.invalidated = conflict;
+ s->data.invalidated = invalidation_cause;
/*
* XXX: We should consider not overwriting restart_lsn and instead
* just rely on .invalidated.
*/
- if (conflict == RS_INVAL_WAL_REMOVED)
+ if (invalidation_cause == RS_INVAL_WAL_REMOVED)
s->data.restart_lsn = InvalidXLogRecPtr;
/* Let caller know */
@@ -1681,7 +1678,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
*/
if (last_signaled_pid != active_pid)
{
- ReportSlotInvalidation(conflict, true, active_pid,
+ ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon);
@@ -1694,7 +1691,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
last_signaled_pid = active_pid;
terminated = true;
- conflict_prev = conflict;
+ invalidation_cause_prev = invalidation_cause;
}
/* Wait until the slot is released. */
@@ -1727,7 +1724,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReplicationSlotSave();
ReplicationSlotRelease();
- ReportSlotInvalidation(conflict, false, active_pid,
+ ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon);
@@ -2356,21 +2353,21 @@ RestoreSlotFromDisk(const char *name)
}
/*
- * Maps a conflict reason for a replication slot to
+ * Maps an invalidation reason for a replication slot to
* ReplicationSlotInvalidationCause.
*/
ReplicationSlotInvalidationCause
-GetSlotInvalidationCause(const char *conflict_reason)
+GetSlotInvalidationCause(const char *invalidation_reason)
{
ReplicationSlotInvalidationCause cause;
ReplicationSlotInvalidationCause result = RS_INVAL_NONE;
bool found PG_USED_FOR_ASSERTS_ONLY = false;
- Assert(conflict_reason);
+ Assert(invalidation_reason);
for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
{
- if (strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0)
+ if (strcmp(SlotInvalidationCauses[cause], invalidation_reason) == 0)
{
found = true;
result = cause;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index ad79e1fccd..dfaac999f1 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 17
+#define PG_GET_REPLICATION_SLOTS_COLS 18
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -263,6 +263,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
bool nulls[PG_GET_REPLICATION_SLOTS_COLS];
WALAvailability walstate;
int i;
+ ReplicationSlotInvalidationCause cause;
if (!slot->in_use)
continue;
@@ -409,22 +410,32 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
- if (slot_contents.data.database == InvalidOid)
+ cause = slot_contents.data.invalidated;
+
+ if (SlotIsPhysical(&slot_contents))
nulls[i++] = true;
else
{
- ReplicationSlotInvalidationCause cause = slot_contents.data.invalidated;
-
- if (cause == RS_INVAL_NONE)
- nulls[i++] = true;
+ /*
+ * rows_removed and wal_level_insufficient are the only two
+ * reasons for the logical slot's conflict with recovery.
+ */
+ if (cause == RS_INVAL_HORIZON ||
+ cause == RS_INVAL_WAL_LEVEL)
+ values[i++] = BoolGetDatum(true);
else
- values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ values[i++] = BoolGetDatum(false);
}
values[i++] = BoolGetDatum(slot_contents.data.failover);
values[i++] = BoolGetDatum(slot_contents.data.synced);
+ if (cause == RS_INVAL_NONE)
+ nulls[i++] = true;
+ else
+ values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index b5b8d11602..34a157f792 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -676,13 +676,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
"temporary IS FALSE;",
live_check ? "FALSE" :
- "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
+ "(CASE WHEN conflicting THEN FALSE "
"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
"END)");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 042f66f714..cf116bc548 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11133,9 +11133,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 425effad21..7f25a083ee 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -273,7 +273,7 @@ extern void CheckPointReplicationSlots(bool is_shutdown);
extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
- GetSlotInvalidationCause(const char *conflict_reason);
+ GetSlotInvalidationCause(const char *invalidation_reason);
extern bool SlotExistsInStandbySlotNames(const char *slot_name);
extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 88b03048c4..2203841ca1 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,7 +168,7 @@ sub change_hot_standby_feedback_and_wait_for_xmins
}
}
-# Check conflict_reason in pg_replication_slots.
+# Check reason for conflict in pg_replication_slots.
sub check_slots_conflict_reason
{
my ($slot_prefix, $reason) = @_;
@@ -178,15 +178,15 @@ sub check_slots_conflict_reason
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$active_slot' and conflicting;));
- is($res, "$reason", "$active_slot conflict_reason is $reason");
+ is($res, "$reason", "$active_slot reason for conflict is $reason");
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$inactive_slot' and conflicting;));
- is($res, "$reason", "$inactive_slot conflict_reason is $reason");
+ is($res, "$reason", "$inactive_slot reason for conflict is $reason");
}
# Drop the slots, re-create them, change hot_standby_feedback,
@@ -293,13 +293,13 @@ $node_primary->safe_psql('testdb',
qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
);
-# Check conflict_reason is NULL for physical slot
+# Check conflicting is NULL for physical slot
$res = $node_primary->safe_psql(
'postgres', qq[
- SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+ SELECT conflicting is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
);
-is($res, 't', "Physical slot reports conflict_reason as NULL");
+is($res, 't', "Physical slot reports conflicting as NULL");
my $backup_name = 'b1';
$node_primary->backup($backup_name);
@@ -524,7 +524,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Ensure that replication slot stats are not removed after invalidation.
@@ -551,7 +551,7 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
##################################################
$node_standby->restart;
-# Verify conflict_reason is retained across a restart.
+# Verify reason for conflict is retained across a restart.
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
##################################################
@@ -560,7 +560,8 @@ check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Get the restart_lsn from an invalidated slot
my $restart_lsn = $node_standby->safe_psql('postgres',
- "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
+ "SELECT restart_lsn FROM pg_replication_slots
+ WHERE slot_name = 'vacuum_full_activeslot' AND conflicting;"
);
chomp($restart_lsn);
@@ -611,7 +612,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('row_removal_', 'rows_removed');
$handle =
@@ -647,7 +648,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
check_for_invalidation('shared_row_removal_', $logstart,
'with vacuum on pg_authid');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
$handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
@@ -696,14 +697,14 @@ ok( $node_standby->poll_query_until(
'confl_active_logicalslot not updated'
) or die "Timed out waiting confl_active_logicalslot to be updated";
-# Verify slots are reported as non conflicting in pg_replication_slots
+# Verify slots are reported as valid in pg_replication_slots
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(conflicting) from
- (select conflict_reason is not NULL as conflicting
- from pg_replication_slots WHERE slot_type = 'logical')]),
+ (select conflicting from pg_replication_slots
+ where slot_type = 'logical')]),
'f',
- 'Logical slots are reported as non conflicting');
+ 'Logical slots are reported as valid');
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 0);
@@ -739,7 +740,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('pruning_', 'rows_removed');
$handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
@@ -783,7 +784,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
-# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+# Verify reason for conflict is 'wal_level_insufficient' in pg_replication_slots
check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
$handle =
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 0ea1f3d323..f47bfd78eb 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -228,7 +228,7 @@ $standby1->safe_psql('postgres', "CHECKPOINT");
# Check if the synced slot is invalidated
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'synchronized slot has been invalidated');
@@ -274,7 +274,7 @@ $standby1->wait_for_log(qr/dropped replication slot "lsub1_slot" of dbid [0-9]+/
# flagged as 'synced'
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'logical slot is re-synced');
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 84e359f6ed..19c44c0cb7 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,10 +1473,11 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.conflict_reason,
+ l.conflicting,
l.failover,
- l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced)
+ l.synced,
+ l.invalidation_reason
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v13-0002-Track-last_inactive_at-for-replication-slots-in-.patchapplication/octet-stream; name=v13-0002-Track-last_inactive_at-for-replication-slots-in-.patchDownload
From b1e31a64e36ae27017813d7f1f602a0067c7cc39 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 20 Mar 2024 22:37:40 +0000
Subject: [PATCH v13 2/6] Track last_inactive_at for replication slots in
shared memory
---
src/backend/catalog/system_views.sql | 3 ++-
src/backend/replication/slot.c | 16 ++++++++++++++++
src/backend/replication/slotfuncs.c | 7 ++++++-
src/include/catalog/pg_proc.dat | 6 +++---
src/include/replication/slot.h | 3 +++
src/test/regress/expected/rules.out | 5 +++--
6 files changed, 33 insertions(+), 7 deletions(-)
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index cd22dad959..2fa4272006 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1026,7 +1026,8 @@ CREATE VIEW pg_replication_slots AS
L.conflicting,
L.failover,
L.synced,
- L.invalidation_reason
+ L.invalidation_reason,
+ L.last_inactive_at
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index cdf0c450c5..146f0fbf84 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -409,6 +409,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->candidate_restart_valid = InvalidXLogRecPtr;
slot->candidate_restart_lsn = InvalidXLogRecPtr;
slot->last_saved_confirmed_flush = InvalidXLogRecPtr;
+ slot->last_inactive_at = 0;
/*
* Create the slot on disk. We haven't actually marked the slot allocated
@@ -622,6 +623,13 @@ retry:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->last_inactive_at = 0;
+ SpinLockRelease(&s->mutex);
+ }
+
if (am_walsender)
{
ereport(log_replication_commands ? LOG : DEBUG1,
@@ -691,6 +699,13 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->last_inactive_at = GetCurrentTimestamp();
+ SpinLockRelease(&slot->mutex);
+ }
+
MyReplicationSlot = NULL;
/* might not have been set when we've been a plain slot */
@@ -2341,6 +2356,7 @@ RestoreSlotFromDisk(const char *name)
slot->in_use = true;
slot->active_pid = 0;
+ slot->last_inactive_at = 0;
restored = true;
break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index dfaac999f1..2c33cc0c16 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 18
+#define PG_GET_REPLICATION_SLOTS_COLS 19
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -436,6 +436,11 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
else
values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ if (slot_contents.last_inactive_at > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.last_inactive_at);
+ else
+ nulls[i++] = true;
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index cf116bc548..d89a223a60 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11133,9 +11133,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text,timestamptz}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason,last_inactive_at}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7f25a083ee..b4bb7f5e99 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -201,6 +201,9 @@ typedef struct ReplicationSlot
* forcibly flushed or not.
*/
XLogRecPtr last_saved_confirmed_flush;
+
+ /* When did this slot become inactive last time? */
+ TimestampTz last_inactive_at;
} ReplicationSlot;
#define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid)
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 19c44c0cb7..88fbd6a53c 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1476,8 +1476,9 @@ pg_replication_slots| SELECT l.slot_name,
l.conflicting,
l.failover,
l.synced,
- l.invalidation_reason
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason)
+ l.invalidation_reason,
+ l.last_inactive_at
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason, last_inactive_at)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v13-0003-Allow-setting-inactive_timeout-for-replication-s.patchapplication/octet-stream; name=v13-0003-Allow-setting-inactive_timeout-for-replication-s.patchDownload
From 9339d8203cafc03f7590c6770e7966321b8e3ea5 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 20 Mar 2024 22:37:54 +0000
Subject: [PATCH v13 3/6] Allow setting inactive_timeout for replication slots
via SQL API
---
contrib/test_decoding/expected/slot.out | 102 ++++++++++++++++++
contrib/test_decoding/sql/slot.sql | 34 ++++++
doc/src/sgml/func.sgml | 18 ++--
doc/src/sgml/system-views.sgml | 9 ++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 3 +-
src/backend/replication/logical/slotsync.c | 17 ++-
src/backend/replication/slot.c | 20 +++-
src/backend/replication/slotfuncs.c | 31 +++++-
src/backend/replication/walsender.c | 4 +-
src/bin/pg_upgrade/info.c | 6 +-
src/bin/pg_upgrade/pg_upgrade.c | 5 +-
src/bin/pg_upgrade/pg_upgrade.h | 2 +
src/include/catalog/pg_proc.dat | 22 ++--
src/include/replication/slot.h | 5 +-
.../t/040_standby_failover_slots_sync.pl | 11 +-
src/test/regress/expected/rules.out | 5 +-
17 files changed, 257 insertions(+), 39 deletions(-)
diff --git a/contrib/test_decoding/expected/slot.out b/contrib/test_decoding/expected/slot.out
index 349ab2d380..6771520afb 100644
--- a/contrib/test_decoding/expected/slot.out
+++ b/contrib/test_decoding/expected/slot.out
@@ -466,3 +466,105 @@ SELECT pg_drop_replication_slot('physical_slot');
(1 row)
+-- Test negative value for inactive_timeout option for slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', inactive_timeout := -300); -- error
+ERROR: "inactive_timeout" must not be negative
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', inactive_timeout := -600); -- error
+ERROR: "inactive_timeout" must not be negative
+-- Test inactive_timeout option for temporary slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', temporary := true, inactive_timeout := 300); -- error
+ERROR: cannot set inactive_timeout for a temporary replication slot
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', temporary := true, inactive_timeout := 600); -- error
+ERROR: cannot set inactive_timeout for a temporary replication slot
+-- Test inactive_timeout option of physical slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot1', immediately_reserve := true, inactive_timeout := 300);
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot2');
+ ?column?
+----------
+ init
+(1 row)
+
+-- Copy physical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_slot1', dst_slot_name := 'it_phy_slot3');
+ ?column?
+----------
+ copy
+(1 row)
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+ slot_name | slot_type | inactive_timeout
+--------------+-----------+------------------
+ it_phy_slot1 | physical | 300
+ it_phy_slot2 | physical | 0
+ it_phy_slot3 | physical | 300
+(3 rows)
+
+SELECT pg_drop_replication_slot('it_phy_slot1');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_phy_slot2');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_phy_slot3');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+-- Test inactive_timeout option of logical slots.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot2', plugin := 'test_decoding');
+ ?column?
+----------
+ init
+(1 row)
+
+-- Copy logical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slot1', dst_slot_name := 'it_log_slot3');
+ ?column?
+----------
+ copy
+(1 row)
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+ slot_name | slot_type | inactive_timeout
+--------------+-----------+------------------
+ it_log_slot1 | logical | 600
+ it_log_slot2 | logical | 0
+ it_log_slot3 | logical | 600
+(3 rows)
+
+SELECT pg_drop_replication_slot('it_log_slot1');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_log_slot2');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_log_slot3');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
diff --git a/contrib/test_decoding/sql/slot.sql b/contrib/test_decoding/sql/slot.sql
index 580e3ae3be..443e91da07 100644
--- a/contrib/test_decoding/sql/slot.sql
+++ b/contrib/test_decoding/sql/slot.sql
@@ -190,3 +190,37 @@ SELECT pg_drop_replication_slot('failover_true_slot');
SELECT pg_drop_replication_slot('failover_false_slot');
SELECT pg_drop_replication_slot('failover_default_slot');
SELECT pg_drop_replication_slot('physical_slot');
+
+-- Test negative value for inactive_timeout option for slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', inactive_timeout := -300); -- error
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', inactive_timeout := -600); -- error
+
+-- Test inactive_timeout option for temporary slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', temporary := true, inactive_timeout := 300); -- error
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', temporary := true, inactive_timeout := 600); -- error
+
+-- Test inactive_timeout option of physical slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot1', immediately_reserve := true, inactive_timeout := 300);
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot2');
+
+-- Copy physical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_slot1', dst_slot_name := 'it_phy_slot3');
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+
+SELECT pg_drop_replication_slot('it_phy_slot1');
+SELECT pg_drop_replication_slot('it_phy_slot2');
+SELECT pg_drop_replication_slot('it_phy_slot3');
+
+-- Test inactive_timeout option of logical slots.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot2', plugin := 'test_decoding');
+
+-- Copy logical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slot1', dst_slot_name := 'it_log_slot3');
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+
+SELECT pg_drop_replication_slot('it_log_slot1');
+SELECT pg_drop_replication_slot('it_log_slot2');
+SELECT pg_drop_replication_slot('it_log_slot3');
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 030ea8affd..9467ff86b3 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28163,7 +28163,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<indexterm>
<primary>pg_create_physical_replication_slot</primary>
</indexterm>
- <function>pg_create_physical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type> <optional>, <parameter>immediately_reserve</parameter> <type>boolean</type>, <parameter>temporary</parameter> <type>boolean</type> </optional> )
+ <function>pg_create_physical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type> <optional>, <parameter>immediately_reserve</parameter> <type>boolean</type>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>inactive_timeout</parameter> <type>integer</type> </optional>)
<returnvalue>record</returnvalue>
( <parameter>slot_name</parameter> <type>name</type>,
<parameter>lsn</parameter> <type>pg_lsn</type> )
@@ -28180,9 +28180,12 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
parameter, <parameter>temporary</parameter>, when set to true, specifies that
the slot should not be permanently stored to disk and is only meant
for use by the current session. Temporary slots are also
- released upon any error. This function corresponds
- to the replication protocol command <literal>CREATE_REPLICATION_SLOT
- ... PHYSICAL</literal>.
+ released upon any error. The optional fourth
+ parameter, <parameter>inactive_timeout</parameter>, when set to a
+ non-zero value, specifies the amount of time in seconds the slot is
+ allowed to be inactive. This function corresponds to the replication
+ protocol command
+ <literal>CREATE_REPLICATION_SLOT ... PHYSICAL</literal>.
</para></entry>
</row>
@@ -28207,7 +28210,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<indexterm>
<primary>pg_create_logical_replication_slot</primary>
</indexterm>
- <function>pg_create_logical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type>, <parameter>plugin</parameter> <type>name</type> <optional>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>twophase</parameter> <type>boolean</type>, <parameter>failover</parameter> <type>boolean</type> </optional> )
+ <function>pg_create_logical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type>, <parameter>plugin</parameter> <type>name</type> <optional>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>twophase</parameter> <type>boolean</type>, <parameter>failover</parameter> <type>boolean</type>, <parameter>inactive_timeout</parameter> <type>integer</type> </optional> )
<returnvalue>record</returnvalue>
( <parameter>slot_name</parameter> <type>name</type>,
<parameter>lsn</parameter> <type>pg_lsn</type> )
@@ -28226,7 +28229,10 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<parameter>failover</parameter>, when set to true,
specifies that this slot is enabled to be synced to the
standbys so that logical replication can be resumed after
- failover. A call to this function has the same effect as
+ failover. The optional sixth parameter,
+ <parameter>inactive_timeout</parameter>, when set to a
+ non-zero value, specifies the amount of time in seconds the slot is
+ allowed to be inactive. A call to this function has the same effect as
the replication protocol command
<literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
</para></entry>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 95355743ca..ec60c43038 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2750,6 +2750,15 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
ID of role
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_timeout</structfield> <type>integer</type>
+ </para>
+ <para>
+ The amount of time in seconds the slot is allowed to be inactive.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index fe2bb50f46..af27616657 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -469,6 +469,7 @@ AS 'pg_logical_emit_message_bytea';
CREATE OR REPLACE FUNCTION pg_create_physical_replication_slot(
IN slot_name name, IN immediately_reserve boolean DEFAULT false,
IN temporary boolean DEFAULT false,
+ IN inactive_timeout int DEFAULT 0,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
@@ -480,6 +481,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
IN temporary boolean DEFAULT false,
IN twophase boolean DEFAULT false,
IN failover boolean DEFAULT false,
+ IN inactive_timeout int DEFAULT 0,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2fa4272006..a43048ae93 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1027,7 +1027,8 @@ CREATE VIEW pg_replication_slots AS
L.failover,
L.synced,
L.invalidation_reason,
- L.last_inactive_at
+ L.last_inactive_at,
+ L.inactive_timeout
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..c01876ceeb 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -131,6 +131,7 @@ typedef struct RemoteSlot
char *database;
bool two_phase;
bool failover;
+ int inactive_timeout;
XLogRecPtr restart_lsn;
XLogRecPtr confirmed_lsn;
TransactionId catalog_xmin;
@@ -167,7 +168,8 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
remote_slot->two_phase == slot->data.two_phase &&
remote_slot->failover == slot->data.failover &&
remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
- strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
+ strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 &&
+ remote_slot->inactive_timeout == slot->data.inactive_timeout)
return false;
/* Avoid expensive operations while holding a spinlock. */
@@ -182,6 +184,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
slot->data.confirmed_flush = remote_slot->confirmed_lsn;
slot->data.catalog_xmin = remote_slot->catalog_xmin;
slot->effective_catalog_xmin = remote_slot->catalog_xmin;
+ slot->data.inactive_timeout = remote_slot->inactive_timeout;
SpinLockRelease(&slot->mutex);
if (xmin_changed)
@@ -607,7 +610,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
- true);
+ true, 0);
/* For shorter lines. */
slot = MyReplicationSlot;
@@ -627,6 +630,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
SpinLockAcquire(&slot->mutex);
slot->effective_catalog_xmin = xmin_horizon;
slot->data.catalog_xmin = xmin_horizon;
+ slot->data.inactive_timeout = remote_slot->inactive_timeout;
SpinLockRelease(&slot->mutex);
ReplicationSlotsComputeRequiredXmin(true);
LWLockRelease(ProcArrayLock);
@@ -652,9 +656,9 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
static bool
synchronize_slots(WalReceiverConn *wrconn)
{
-#define SLOTSYNC_COLUMN_COUNT 9
+#define SLOTSYNC_COLUMN_COUNT 10
Oid slotRow[SLOTSYNC_COLUMN_COUNT] = {TEXTOID, TEXTOID, LSNOID,
- LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID};
+ LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID, INT4OID};
WalRcvExecResult *res;
TupleTableSlot *tupslot;
@@ -663,7 +667,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, invalidation_reason"
+ " database, invalidation_reason, inactive_timeout"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
@@ -743,6 +747,9 @@ synchronize_slots(WalReceiverConn *wrconn)
remote_slot->invalidated = isnull ? RS_INVAL_NONE :
GetSlotInvalidationCause(TextDatumGetCString(d));
+ remote_slot->inactive_timeout = DatumGetInt32(slot_getattr(tupslot, ++col,
+ &isnull));
+
/* Sanity check */
Assert(col == SLOTSYNC_COLUMN_COUNT);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 146f0fbf84..195771920f 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -129,7 +129,7 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 5 /* version for new files */
+#define SLOT_VERSION 6 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -304,11 +304,14 @@ ReplicationSlotValidateName(const char *name, int elevel)
* failover: If enabled, allows the slot to be synced to standbys so
* that logical replication can be resumed after failover.
* synced: True if the slot is synchronized from the primary server.
+ * inactive_timeout: The amount of time in seconds the slot is allowed to be
+ * inactive.
*/
void
ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover, bool synced)
+ bool two_phase, bool failover, bool synced,
+ int inactive_timeout)
{
ReplicationSlot *slot = NULL;
int i;
@@ -345,6 +348,18 @@ ReplicationSlotCreate(const char *name, bool db_specific,
errmsg("cannot enable failover for a temporary replication slot"));
}
+ if (inactive_timeout > 0)
+ {
+ /*
+ * Do not allow users to set inactive_timeout for temporary slots,
+ * because temporary slots will not be saved to the disk.
+ */
+ if (persistency == RS_TEMPORARY)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot set inactive_timeout for a temporary replication slot"));
+ }
+
/*
* If some other backend ran this code concurrently with us, we'd likely
* both allocate the same slot, and that would be bad. We'd also be at
@@ -398,6 +413,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
slot->data.synced = synced;
+ slot->data.inactive_timeout = inactive_timeout;
/* and then data only present in shared memory */
slot->just_dirtied = false;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 2c33cc0c16..55ff73cc78 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -38,14 +38,15 @@
*/
static void
create_physical_replication_slot(char *name, bool immediately_reserve,
- bool temporary, XLogRecPtr restart_lsn)
+ bool temporary, int inactive_timeout,
+ XLogRecPtr restart_lsn)
{
Assert(!MyReplicationSlot);
/* acquire replication slot, this will check for conflicting names */
ReplicationSlotCreate(name, false,
temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
- false, false);
+ false, false, inactive_timeout);
if (immediately_reserve)
{
@@ -71,6 +72,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
Name name = PG_GETARG_NAME(0);
bool immediately_reserve = PG_GETARG_BOOL(1);
bool temporary = PG_GETARG_BOOL(2);
+ int inactive_timeout = PG_GETARG_INT32(3);
Datum values[2];
bool nulls[2];
TupleDesc tupdesc;
@@ -84,9 +86,15 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
CheckSlotRequirements();
+ if (inactive_timeout < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"inactive_timeout\" must not be negative")));
+
create_physical_replication_slot(NameStr(*name),
immediately_reserve,
temporary,
+ inactive_timeout,
InvalidXLogRecPtr);
values[0] = NameGetDatum(&MyReplicationSlot->data.name);
@@ -120,7 +128,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
static void
create_logical_replication_slot(char *name, char *plugin,
bool temporary, bool two_phase,
- bool failover,
+ bool failover, int inactive_timeout,
XLogRecPtr restart_lsn,
bool find_startpoint)
{
@@ -138,7 +146,7 @@ create_logical_replication_slot(char *name, char *plugin,
*/
ReplicationSlotCreate(name, true,
temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
- failover, false);
+ failover, false, inactive_timeout);
/*
* Create logical decoding context to find start point or, if we don't
@@ -177,6 +185,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
bool temporary = PG_GETARG_BOOL(2);
bool two_phase = PG_GETARG_BOOL(3);
bool failover = PG_GETARG_BOOL(4);
+ int inactive_timeout = PG_GETARG_INT32(5);
Datum result;
TupleDesc tupdesc;
HeapTuple tuple;
@@ -190,11 +199,17 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
CheckLogicalDecodingRequirements();
+ if (inactive_timeout < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"inactive_timeout\" must not be negative")));
+
create_logical_replication_slot(NameStr(*name),
NameStr(*plugin),
temporary,
two_phase,
failover,
+ inactive_timeout,
InvalidXLogRecPtr,
true);
@@ -239,7 +254,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 19
+#define PG_GET_REPLICATION_SLOTS_COLS 20
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -441,6 +456,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
else
nulls[i++] = true;
+ values[i++] = Int32GetDatum(slot_contents.data.inactive_timeout);
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
@@ -720,6 +737,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
XLogRecPtr src_restart_lsn;
bool src_islogical;
bool temporary;
+ int inactive_timeout;
char *plugin;
Datum values[2];
bool nulls[2];
@@ -776,6 +794,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
src_restart_lsn = first_slot_contents.data.restart_lsn;
temporary = (first_slot_contents.data.persistency == RS_TEMPORARY);
plugin = logical_slot ? NameStr(first_slot_contents.data.plugin) : NULL;
+ inactive_timeout = first_slot_contents.data.inactive_timeout;
/* Check type of replication slot */
if (src_islogical != logical_slot)
@@ -823,6 +842,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
temporary,
false,
false,
+ inactive_timeout,
src_restart_lsn,
false);
}
@@ -830,6 +850,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
create_physical_replication_slot(NameStr(*dst_name),
true,
temporary,
+ inactive_timeout,
src_restart_lsn);
/*
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bc40c454de..5315c08650 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1221,7 +1221,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false);
+ false, false, false, 0);
if (reserve_wal)
{
@@ -1252,7 +1252,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false);
+ two_phase, failover, false, 0);
/*
* Do options check early so that we can bail before calling the
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 34a157f792..6817e9be67 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -676,7 +676,8 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid, "
+ "inactive_timeout "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
@@ -696,6 +697,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
int i_failover;
int i_caught_up;
int i_invalid;
+ int i_inactive_timeout;
slotinfos = (LogicalSlotInfo *) pg_malloc(sizeof(LogicalSlotInfo) * num_slots);
@@ -705,6 +707,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
i_failover = PQfnumber(res, "failover");
i_caught_up = PQfnumber(res, "caught_up");
i_invalid = PQfnumber(res, "invalid");
+ i_inactive_timeout = PQfnumber(res, "inactive_timeout");
for (int slotnum = 0; slotnum < num_slots; slotnum++)
{
@@ -716,6 +719,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
curr->failover = (strcmp(PQgetvalue(res, slotnum, i_failover), "t") == 0);
curr->caught_up = (strcmp(PQgetvalue(res, slotnum, i_caught_up), "t") == 0);
curr->invalid = (strcmp(PQgetvalue(res, slotnum, i_invalid), "t") == 0);
+ curr->inactive_timeout = atoi(PQgetvalue(res, slotnum, i_inactive_timeout));
}
}
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index f6143b6bc4..2656056103 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -931,9 +931,10 @@ create_logical_replication_slots(void)
appendPQExpBuffer(query, ", ");
appendStringLiteralConn(query, slot_info->plugin, conn);
- appendPQExpBuffer(query, ", false, %s, %s);",
+ appendPQExpBuffer(query, ", false, %s, %s, %d);",
slot_info->two_phase ? "true" : "false",
- slot_info->failover ? "true" : "false");
+ slot_info->failover ? "true" : "false",
+ slot_info->inactive_timeout);
PQclear(executeQueryOrDie(conn, "%s", query->data));
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 92bcb693fb..eb86d000b1 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -162,6 +162,8 @@ typedef struct
bool invalid; /* if true, the slot is unusable */
bool failover; /* is the slot designated to be synced to the
* physical standby? */
+ int inactive_timeout; /* The amount of time in seconds the slot
+ * is allowed to be inactive. */
} LogicalSlotInfo;
typedef struct
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d89a223a60..80c281c8a5 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11105,10 +11105,10 @@
# replication slots
{ oid => '3779', descr => 'create a physical replication slot',
proname => 'pg_create_physical_replication_slot', provolatile => 'v',
- proparallel => 'u', prorettype => 'record', proargtypes => 'name bool bool',
- proallargtypes => '{name,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,o,o}',
- proargnames => '{slot_name,immediately_reserve,temporary,slot_name,lsn}',
+ proparallel => 'u', prorettype => 'record', proargtypes => 'name bool bool int4',
+ proallargtypes => '{name,bool,bool,int4,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,o,o}',
+ proargnames => '{slot_name,immediately_reserve,temporary,inactive_timeout,slot_name,lsn}',
prosrc => 'pg_create_physical_replication_slot' },
{ oid => '4220',
descr => 'copy a physical replication slot, changing temporality',
@@ -11133,17 +11133,17 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text,timestamptz}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason,last_inactive_at}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,bool,bool,text,timestamptz,int4}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,failover,synced,invalidation_reason,last_inactive_at,inactive_timeout}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u', prorettype => 'record',
- proargtypes => 'name name bool bool bool',
- proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,i,i,o,o}',
- proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
+ proargtypes => 'name name bool bool bool int4',
+ proallargtypes => '{name,name,bool,bool,bool,int4,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,i,i,o,o}',
+ proargnames => '{slot_name,plugin,temporary,twophase,failover,inactive_timeout,slot_name,lsn}',
prosrc => 'pg_create_logical_replication_slot' },
{ oid => '4222',
descr => 'copy a logical replication slot, changing temporality and plugin',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index b4bb7f5e99..ff62542b03 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -127,6 +127,9 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* The amount of time in seconds the slot is allowed to be inactive. */
+ int inactive_timeout;
} ReplicationSlotPersistentData;
/*
@@ -239,7 +242,7 @@ extern void ReplicationSlotsShmemInit(void);
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
bool two_phase, bool failover,
- bool synced);
+ bool synced, int inactive_timeout);
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index f47bfd78eb..e4e244effb 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -153,7 +153,7 @@ $primary->append_conf('postgresql.conf', "log_min_messages = 'debug2'");
$primary->reload;
$primary->psql('postgres',
- q{SELECT pg_create_logical_replication_slot('lsub2_slot', 'test_decoding', false, false, true);}
+ q{SELECT pg_create_logical_replication_slot('lsub2_slot', 'test_decoding', false, false, true, 3600);}
);
$primary->psql('postgres',
@@ -190,6 +190,15 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Confirm that the synced slot on the standby has got inactive_timeout from the
+# primary.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT inactive_timeout FROM pg_replication_slots WHERE slot_name = 'lsub2_slot' AND synced AND NOT temporary;}
+ ),
+ "3600",
+ 'synced logical slot has got inactive_timeout on standby');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 88fbd6a53c..1c683ceaca 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1477,8 +1477,9 @@ pg_replication_slots| SELECT l.slot_name,
l.failover,
l.synced,
l.invalidation_reason,
- l.last_inactive_at
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason, last_inactive_at)
+ l.last_inactive_at,
+ l.inactive_timeout
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, failover, synced, invalidation_reason, last_inactive_at, inactive_timeout)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
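
To make the SQL-level interface added by the preceding patch concrete, here is a
minimal usage sketch. The slot name 'demo_phy_slot' and the 600-second value are
purely illustrative; the parameter names are the ones defined above:

    -- 'demo_phy_slot' and 600 are illustrative values, not part of the patch
    -- create a physical slot that may remain inactive for up to 10 minutes
    SELECT 'init' FROM pg_create_physical_replication_slot(
        slot_name := 'demo_phy_slot',
        immediately_reserve := true,
        inactive_timeout := 600);

    -- the configured timeout is then visible in the system view
    SELECT slot_name, slot_type, inactive_timeout
    FROM pg_replication_slots WHERE slot_name = 'demo_phy_slot';
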
v13-0004-Allow-setting-inactive_timeout-in-the-replicatio.patch (application/octet-stream)
From faee83481f2e04d5ac5a62657d2eadecb7205247 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 20 Mar 2024 22:38:24 +0000
Subject: [PATCH v13 4/6] Allow setting inactive_timeout in the replication
command
---
doc/src/sgml/protocol.sgml | 20 ++++++
src/backend/commands/subscriptioncmds.c | 6 +-
.../libpqwalreceiver/libpqwalreceiver.c | 61 ++++++++++++++++---
src/backend/replication/logical/tablesync.c | 1 +
src/backend/replication/slot.c | 30 ++++++++-
src/backend/replication/walreceiver.c | 2 +-
src/backend/replication/walsender.c | 38 +++++++++---
src/include/replication/slot.h | 3 +-
src/include/replication/walreceiver.h | 11 ++--
src/test/recovery/t/001_stream_rep.pl | 50 +++++++++++++++
10 files changed, 195 insertions(+), 27 deletions(-)
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index a5cb19357f..2ffa1b470a 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2068,6 +2068,16 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>INACTIVE_TIMEOUT [ <replaceable class="parameter">integer</replaceable> ]</literal></term>
+ <listitem>
+ <para>
+ If set to a non-zero value, specifies the amount of time in seconds
+ the slot is allowed to be inactive. The default is zero.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
<para>
@@ -2168,6 +2178,16 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>INACTIVE_TIMEOUT [ <replaceable class="parameter">integer</replaceable> ]</literal></term>
+ <listitem>
+ <para>
+ If set to a non-zero value, specifies the amount of time in seconds
+ the slot is allowed to be inactive. The default is zero.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</listitem>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 5a47fa984d..4562de49c4 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -827,7 +827,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
twophase_enabled = true;
walrcv_create_slot(wrconn, opts.slot_name, false, twophase_enabled,
- opts.failover, CRS_NOEXPORT_SNAPSHOT, NULL);
+ opts.failover, 0, CRS_NOEXPORT_SNAPSHOT, NULL);
if (twophase_enabled)
UpdateTwoPhaseState(subid, LOGICALREP_TWOPHASE_STATE_ENABLED);
@@ -849,7 +849,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
else if (opts.slot_name &&
(opts.failover || walrcv_server_version(wrconn) >= 170000))
{
- walrcv_alter_slot(wrconn, opts.slot_name, opts.failover);
+ walrcv_alter_slot(wrconn, opts.slot_name, &opts.failover, NULL);
}
}
PG_FINALLY();
@@ -1541,7 +1541,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
PG_TRY();
{
- walrcv_alter_slot(wrconn, sub->slotname, opts.failover);
+ walrcv_alter_slot(wrconn, sub->slotname, &opts.failover, NULL);
}
PG_FINALLY();
{
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 761bf0f677..126250a076 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -77,10 +77,11 @@ static char *libpqrcv_create_slot(WalReceiverConn *conn,
bool temporary,
bool two_phase,
bool failover,
+ int inactive_timeout,
CRSSnapshotAction snapshot_action,
XLogRecPtr *lsn);
static void libpqrcv_alter_slot(WalReceiverConn *conn, const char *slotname,
- bool failover);
+ bool *failover, int *inactive_timeout);
static pid_t libpqrcv_get_backend_pid(WalReceiverConn *conn);
static WalRcvExecResult *libpqrcv_exec(WalReceiverConn *conn,
const char *query,
@@ -1008,7 +1009,8 @@ libpqrcv_send(WalReceiverConn *conn, const char *buffer, int nbytes)
*/
static char *
libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname,
- bool temporary, bool two_phase, bool failover,
+ bool temporary, bool two_phase,
+ bool failover, int inactive_timeout,
CRSSnapshotAction snapshot_action, XLogRecPtr *lsn)
{
PGresult *res;
@@ -1048,6 +1050,15 @@ libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname,
appendStringInfoChar(&cmd, ' ');
}
+ if (inactive_timeout > 0)
+ {
+ appendStringInfo(&cmd, "INACTIVE_TIMEOUT %d", inactive_timeout);
+ if (use_new_options_syntax)
+ appendStringInfoString(&cmd, ", ");
+ else
+ appendStringInfoChar(&cmd, ' ');
+ }
+
if (use_new_options_syntax)
{
switch (snapshot_action)
@@ -1084,10 +1095,24 @@ libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname,
}
else
{
+ appendStringInfoString(&cmd, " PHYSICAL ");
if (use_new_options_syntax)
- appendStringInfoString(&cmd, " PHYSICAL (RESERVE_WAL)");
- else
- appendStringInfoString(&cmd, " PHYSICAL RESERVE_WAL");
+ appendStringInfoChar(&cmd, '(');
+
+ appendStringInfoString(&cmd, "RESERVE_WAL");
+
+ if (inactive_timeout > 0)
+ {
+ if (use_new_options_syntax)
+ appendStringInfoString(&cmd, ", ");
+ else
+ appendStringInfoChar(&cmd, ' ');
+
+ appendStringInfo(&cmd, "INACTIVE_TIMEOUT %d", inactive_timeout);
+ }
+
+ if (use_new_options_syntax)
+ appendStringInfoChar(&cmd, ')');
}
res = libpqrcv_PQexec(conn->streamConn, cmd.data);
@@ -1121,15 +1146,33 @@ libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname,
*/
static void
libpqrcv_alter_slot(WalReceiverConn *conn, const char *slotname,
- bool failover)
+ bool *failover, int *inactive_timeout)
{
StringInfoData cmd;
PGresult *res;
+ bool specified_prev_opt = false;
initStringInfo(&cmd);
- appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s ( FAILOVER %s )",
- quote_identifier(slotname),
- failover ? "true" : "false");
+ appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s (",
+ quote_identifier(slotname));
+
+ if (failover != NULL)
+ {
+ appendStringInfo(&cmd, "FAILOVER %s",
+ *failover ? "true" : "false");
+ specified_prev_opt = true;
+ }
+
+ if (inactive_timeout != NULL)
+ {
+ if (specified_prev_opt)
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfo(&cmd, "INACTIVE_TIMEOUT %d", *inactive_timeout);
+ specified_prev_opt = true;
+ }
+
+ appendStringInfoChar(&cmd, ')');
res = libpqrcv_PQexec(conn->streamConn, cmd.data);
pfree(cmd.data);
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 1061d5b61b..59f8e5fbaa 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1431,6 +1431,7 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
walrcv_create_slot(LogRepWorkerWalRcvConn,
slotname, false /* permanent */ , false /* two_phase */ ,
MySubscription->failover,
+ 0,
CRS_USE_SNAPSHOT, origin_startpos);
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 195771920f..aba5e981d7 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -808,8 +808,10 @@ ReplicationSlotDrop(const char *name, bool nowait)
* Change the definition of the slot identified by the specified name.
*/
void
-ReplicationSlotAlter(const char *name, bool failover)
+ReplicationSlotAlter(const char *name, bool failover, int inactive_timeout)
{
+ bool lock_acquired;
+
Assert(MyReplicationSlot == NULL);
ReplicationSlotAcquire(name, false);
@@ -852,10 +854,36 @@ ReplicationSlotAlter(const char *name, bool failover)
errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot enable failover for a temporary replication slot"));
+ /*
+ * Do not allow users to set inactive_timeout for temporary slots because
+ * temporary slots will not be saved to the disk.
+ */
+ if (inactive_timeout > 0 && MyReplicationSlot->data.persistency == RS_TEMPORARY)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot set inactive_timeout for a temporary replication slot"));
+
+ lock_acquired = false;
if (MyReplicationSlot->data.failover != failover)
{
SpinLockAcquire(&MyReplicationSlot->mutex);
+ lock_acquired = true;
MyReplicationSlot->data.failover = failover;
+ }
+
+ if (MyReplicationSlot->data.inactive_timeout != inactive_timeout)
+ {
+ if (!lock_acquired)
+ {
+ SpinLockAcquire(&MyReplicationSlot->mutex);
+ lock_acquired = true;
+ }
+
+ MyReplicationSlot->data.inactive_timeout = inactive_timeout;
+ }
+
+ if (lock_acquired)
+ {
SpinLockRelease(&MyReplicationSlot->mutex);
ReplicationSlotMarkDirty();
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index acda5f68d9..ac2ebb0c69 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -389,7 +389,7 @@ WalReceiverMain(char *startup_data, size_t startup_data_len)
"pg_walreceiver_%lld",
(long long int) walrcv_get_backend_pid(wrconn));
- walrcv_create_slot(wrconn, slotname, true, false, false, 0, NULL);
+ walrcv_create_slot(wrconn, slotname, true, false, false, 0, 0, NULL);
SpinLockAcquire(&walrcv->mutex);
strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 5315c08650..0420274247 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1123,13 +1123,15 @@ static void
parseCreateReplSlotOptions(CreateReplicationSlotCmd *cmd,
bool *reserve_wal,
CRSSnapshotAction *snapshot_action,
- bool *two_phase, bool *failover)
+ bool *two_phase, bool *failover,
+ int *inactive_timeout)
{
ListCell *lc;
bool snapshot_action_given = false;
bool reserve_wal_given = false;
bool two_phase_given = false;
bool failover_given = false;
+ bool inactive_timeout_given = false;
/* Parse options */
foreach(lc, cmd->options)
@@ -1188,6 +1190,15 @@ parseCreateReplSlotOptions(CreateReplicationSlotCmd *cmd,
failover_given = true;
*failover = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "inactive_timeout") == 0)
+ {
+ if (inactive_timeout_given)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
+ inactive_timeout_given = true;
+ *inactive_timeout = defGetInt32(defel);
+ }
else
elog(ERROR, "unrecognized option: %s", defel->defname);
}
@@ -1205,6 +1216,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
bool reserve_wal = false;
bool two_phase = false;
bool failover = false;
+ int inactive_timeout = 0;
CRSSnapshotAction snapshot_action = CRS_EXPORT_SNAPSHOT;
DestReceiver *dest;
TupOutputState *tstate;
@@ -1215,13 +1227,13 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
Assert(!MyReplicationSlot);
parseCreateReplSlotOptions(cmd, &reserve_wal, &snapshot_action, &two_phase,
- &failover);
+ &failover, &inactive_timeout);
if (cmd->kind == REPLICATION_KIND_PHYSICAL)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false, 0);
+ false, false, false, inactive_timeout);
if (reserve_wal)
{
@@ -1252,7 +1264,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false, 0);
+ two_phase, failover, false, inactive_timeout);
/*
* Do options check early so that we can bail before calling the
@@ -1411,9 +1423,11 @@ DropReplicationSlot(DropReplicationSlotCmd *cmd)
* Process extra options given to ALTER_REPLICATION_SLOT.
*/
static void
-ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover)
+ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover,
+ int *inactive_timeout)
{
bool failover_given = false;
+ bool inactive_timeout_given = false;
/* Parse options */
foreach_ptr(DefElem, defel, cmd->options)
@@ -1427,6 +1441,15 @@ ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover)
failover_given = true;
*failover = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "inactive_timeout") == 0)
+ {
+ if (inactive_timeout_given)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
+ inactive_timeout_given = true;
+ *inactive_timeout = defGetInt32(defel);
+ }
else
elog(ERROR, "unrecognized option: %s", defel->defname);
}
@@ -1439,9 +1462,10 @@ static void
AlterReplicationSlot(AlterReplicationSlotCmd *cmd)
{
bool failover = false;
+ int inactive_timeout = 0;
- ParseAlterReplSlotOptions(cmd, &failover);
- ReplicationSlotAlter(cmd->slotname, failover);
+ ParseAlterReplSlotOptions(cmd, &failover, &inactive_timeout);
+ ReplicationSlotAlter(cmd->slotname, failover, inactive_timeout);
}
/*
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index ff62542b03..77def17386 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -246,7 +246,8 @@ extern void ReplicationSlotCreate(const char *name, bool db_specific,
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
-extern void ReplicationSlotAlter(const char *name, bool failover);
+extern void ReplicationSlotAlter(const char *name, bool failover,
+ int inactive_timeout);
extern void ReplicationSlotAcquire(const char *name, bool nowait);
extern void ReplicationSlotRelease(void);
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 12f71fa99b..038812fd24 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -366,6 +366,7 @@ typedef char *(*walrcv_create_slot_fn) (WalReceiverConn *conn,
bool temporary,
bool two_phase,
bool failover,
+ int inactive_timeout,
CRSSnapshotAction snapshot_action,
XLogRecPtr *lsn);
@@ -377,7 +378,7 @@ typedef char *(*walrcv_create_slot_fn) (WalReceiverConn *conn,
*/
typedef void (*walrcv_alter_slot_fn) (WalReceiverConn *conn,
const char *slotname,
- bool failover);
+ bool *failover, int *inactive_timeout);
/*
* walrcv_get_backend_pid_fn
@@ -453,10 +454,10 @@ extern PGDLLIMPORT WalReceiverFunctionsType *WalReceiverFunctions;
WalReceiverFunctions->walrcv_receive(conn, buffer, wait_fd)
#define walrcv_send(conn, buffer, nbytes) \
WalReceiverFunctions->walrcv_send(conn, buffer, nbytes)
-#define walrcv_create_slot(conn, slotname, temporary, two_phase, failover, snapshot_action, lsn) \
- WalReceiverFunctions->walrcv_create_slot(conn, slotname, temporary, two_phase, failover, snapshot_action, lsn)
-#define walrcv_alter_slot(conn, slotname, failover) \
- WalReceiverFunctions->walrcv_alter_slot(conn, slotname, failover)
+#define walrcv_create_slot(conn, slotname, temporary, two_phase, failover, inactive_timeout, snapshot_action, lsn) \
+ WalReceiverFunctions->walrcv_create_slot(conn, slotname, temporary, two_phase, failover, inactive_timeout, snapshot_action, lsn)
+#define walrcv_alter_slot(conn, slotname, failover, inactive_timeout) \
+ WalReceiverFunctions->walrcv_alter_slot(conn, slotname, failover, inactive_timeout)
#define walrcv_get_backend_pid(conn) \
WalReceiverFunctions->walrcv_get_backend_pid(conn)
#define walrcv_exec(conn, exec, nRetTypes, retTypes) \
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 5311ade509..db00b6aa24 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -604,4 +604,54 @@ ok( pump_until(
'base backup cleanly canceled');
$sigchld_bb->finish();
+# Drop any existing slots on the primary, for the follow-up tests.
+$node_primary->safe_psql('postgres',
+ "SELECT pg_drop_replication_slot(slot_name) FROM pg_replication_slots;");
+
+# Test setting inactive_timeout option via replication commands.
+$node_primary->append_conf(
+ 'postgresql.conf', qq(
+wal_level = logical
+));
+$node_primary->restart;
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_phy_slot1 PHYSICAL (RESERVE_WAL, INACTIVE_TIMEOUT 100);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_phy_slot2 PHYSICAL (RESERVE_WAL);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "ALTER_REPLICATION_SLOT it_phy_slot2 (INACTIVE_TIMEOUT 200);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_log_slot1 LOGICAL pgoutput (TWO_PHASE, INACTIVE_TIMEOUT 300);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_log_slot2 LOGICAL pgoutput;",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "ALTER_REPLICATION_SLOT it_log_slot2 (INACTIVE_TIMEOUT 400);",
+ extra_params => [ '-d', $connstr_db ]);
+
+my $slot_info_expected = 'it_log_slot1|logical|300
+it_log_slot2|logical|400
+it_phy_slot1|physical|100
+it_phy_slot2|physical|0';
+
+my $slot_info = $node_primary->safe_psql('postgres',
+ qq[SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;]);
+is($slot_info, $slot_info_expected, "replication slots with inactive_timeout on primary exist");
+
done_testing();
--
2.34.1
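
As a quick illustration of the replication-command syntax added in v13-0004
(the slot names here are only examples), the following can be run over a
replication connection, e.g. psql "dbname=postgres replication=database":

    -- demo_phy_slot/demo_log_slot and the timeout values are illustrative
    CREATE_REPLICATION_SLOT demo_phy_slot PHYSICAL (RESERVE_WAL, INACTIVE_TIMEOUT 300);
    ALTER_REPLICATION_SLOT demo_phy_slot (INACTIVE_TIMEOUT 600);
    CREATE_REPLICATION_SLOT demo_log_slot LOGICAL pgoutput (TWO_PHASE, INACTIVE_TIMEOUT 300);
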
v13-0005-Add-inactive_timeout-option-to-subscriptions.patch (application/octet-stream)
From 4c547f888a329d759cd2b5df0d6a08ed39344912 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 20 Mar 2024 22:38:44 +0000
Subject: [PATCH v13 5/6] Add inactive_timeout option to subscriptions
---
doc/src/sgml/catalogs.sgml | 11 ++
doc/src/sgml/ref/alter_subscription.sgml | 5 +-
doc/src/sgml/ref/create_subscription.sgml | 12 ++
src/backend/catalog/pg_subscription.c | 1 +
src/backend/catalog/system_views.sql | 3 +-
src/backend/commands/subscriptioncmds.c | 83 +++++++++--
src/backend/replication/logical/tablesync.c | 2 +-
src/bin/pg_dump/pg_dump.c | 22 ++-
src/bin/pg_dump/pg_dump.h | 1 +
src/bin/pg_upgrade/t/003_logical_slots.pl | 6 +-
src/bin/psql/describe.c | 7 +-
src/bin/psql/tab-complete.c | 14 +-
src/include/catalog/pg_subscription.h | 9 ++
src/test/regress/expected/subscription.out | 152 ++++++++++----------
src/test/subscription/t/001_rep_changes.pl | 63 ++++++++
15 files changed, 288 insertions(+), 103 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index b7980eb499..4126e2d3cd 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8028,6 +8028,17 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>subinactivetimeout</structfield> <type>int4</type>
+ </para>
+ <para>
+ When set to a non-zero value, specifies the amount of time in seconds
+ the associated replication slots (i.e. the main slot and the table
+ sync slots) in the upstream database are allowed to be inactive.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>subconninfo</structfield> <type>text</type>
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 413ce68ce2..d02d6232de 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -227,8 +227,9 @@ ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> RENAME TO <
<link linkend="sql-createsubscription-params-with-disable-on-error"><literal>disable_on_error</literal></link>,
<link linkend="sql-createsubscription-params-with-password-required"><literal>password_required</literal></link>,
<link linkend="sql-createsubscription-params-with-run-as-owner"><literal>run_as_owner</literal></link>,
- <link linkend="sql-createsubscription-params-with-origin"><literal>origin</literal></link>, and
- <link linkend="sql-createsubscription-params-with-failover"><literal>failover</literal></link>.
+ <link linkend="sql-createsubscription-params-with-origin"><literal>origin</literal></link>,
+ <link linkend="sql-createsubscription-params-with-failover"><literal>failover</literal></link>, and
+ <link linkend="sql-createsubscription-params-with-inactive-timeout"><literal>inactive_timeout</literal></link>.
Only a superuser can set <literal>password_required = false</literal>.
</para>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 15794731bb..7be4610921 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -414,6 +414,18 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
</para>
</listitem>
</varlistentry>
+
+ <varlistentry id="sql-createsubscription-params-with-inactive-timeout">
+ <term><literal>inactive_timeout</literal> (<type>int4</type>)</term>
+ <listitem>
+ <para>
+ When set to a non-zero value, specifies the amount of time in seconds
+ the associated replication slots (i.e. the main slot and the table
+ sync slots) in the upstream database are allowed to be inactive.
+ The default is <literal>0</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist></para>
</listitem>
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 9efc9159f2..f874146e72 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -72,6 +72,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->passwordrequired = subform->subpasswordrequired;
sub->runasowner = subform->subrunasowner;
sub->failover = subform->subfailover;
+ sub->inactivetimeout = subform->subinactivetimeout;
/* Get conninfo */
datum = SysCacheGetAttrNotNull(SUBSCRIPTIONOID,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index a43048ae93..6005315ce3 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1363,7 +1363,8 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
subbinary, substream, subtwophasestate, subdisableonerr,
subpasswordrequired, subrunasowner, subfailover,
- subslotname, subsynccommit, subpublications, suborigin)
+ subinactivetimeout, subslotname, subsynccommit,
+ subpublications, suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 4562de49c4..c39f765af8 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -70,8 +70,9 @@
#define SUBOPT_PASSWORD_REQUIRED 0x00000800
#define SUBOPT_RUN_AS_OWNER 0x00001000
#define SUBOPT_FAILOVER 0x00002000
-#define SUBOPT_LSN 0x00004000
-#define SUBOPT_ORIGIN 0x00008000
+#define SUBOPT_INACTIVE_TIMEOUT 0x00004000
+#define SUBOPT_LSN 0x00008000
+#define SUBOPT_ORIGIN 0x00010000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -97,6 +98,7 @@ typedef struct SubOpts
bool passwordrequired;
bool runasowner;
bool failover;
+ int inactivetimeout;
char *origin;
XLogRecPtr lsn;
} SubOpts;
@@ -159,6 +161,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->runasowner = false;
if (IsSet(supported_opts, SUBOPT_FAILOVER))
opts->failover = false;
+ if (IsSet(supported_opts, SUBOPT_INACTIVE_TIMEOUT))
+ opts->inactivetimeout = 0;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
@@ -316,6 +320,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_FAILOVER;
opts->failover = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_INACTIVE_TIMEOUT) &&
+ strcmp(defel->defname, "inactive_timeout") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_INACTIVE_TIMEOUT))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_INACTIVE_TIMEOUT;
+ opts->inactivetimeout = defGetInt32(defel);
+ }
else if (IsSet(supported_opts, SUBOPT_ORIGIN) &&
strcmp(defel->defname, "origin") == 0)
{
@@ -610,7 +623,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
SUBOPT_DISABLE_ON_ERR | SUBOPT_PASSWORD_REQUIRED |
- SUBOPT_RUN_AS_OWNER | SUBOPT_FAILOVER | SUBOPT_ORIGIN);
+ SUBOPT_RUN_AS_OWNER | SUBOPT_FAILOVER |
+ SUBOPT_INACTIVE_TIMEOUT | SUBOPT_ORIGIN);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -717,6 +731,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_subpasswordrequired - 1] = BoolGetDatum(opts.passwordrequired);
values[Anum_pg_subscription_subrunasowner - 1] = BoolGetDatum(opts.runasowner);
values[Anum_pg_subscription_subfailover - 1] = BoolGetDatum(opts.failover);
+ values[Anum_pg_subscription_subinactivetimeout - 1] = Int32GetDatum(opts.inactivetimeout);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -827,7 +842,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
twophase_enabled = true;
walrcv_create_slot(wrconn, opts.slot_name, false, twophase_enabled,
- opts.failover, 0, CRS_NOEXPORT_SNAPSHOT, NULL);
+ opts.failover, opts.inactivetimeout,
+ CRS_NOEXPORT_SNAPSHOT, NULL);
if (twophase_enabled)
UpdateTwoPhaseState(subid, LOGICALREP_TWOPHASE_STATE_ENABLED);
@@ -840,16 +856,19 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
/*
* If the slot_name is specified without the create_slot option,
* it is possible that the user intends to use an existing slot on
- * the publisher, so here we alter the failover property of the
- * slot to match the failover value in subscription.
+ * the publisher, so here we alter the failover and
+ * inactive_timeout properties of the slot to match the failover
+ * and inactive_timeout values in subscription.
*
- * We do not need to change the failover to false if the server
- * does not support failover (e.g. pre-PG17).
+ * We do not need to change the failover to false and
+ * inactive_timeout to zero if the server does not support them
+ * (e.g. pre-PG17).
*/
else if (opts.slot_name &&
- (opts.failover || walrcv_server_version(wrconn) >= 170000))
+ (opts.failover || opts.inactivetimeout > 0 ||
+ walrcv_server_version(wrconn) >= 170000))
{
- walrcv_alter_slot(wrconn, opts.slot_name, &opts.failover, NULL);
+ walrcv_alter_slot(wrconn, opts.slot_name, &opts.failover, &opts.inactivetimeout);
}
}
PG_FINALLY();
@@ -1168,7 +1187,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
SUBOPT_PASSWORD_REQUIRED |
SUBOPT_RUN_AS_OWNER | SUBOPT_FAILOVER |
- SUBOPT_ORIGIN);
+ SUBOPT_INACTIVE_TIMEOUT | SUBOPT_ORIGIN);
parse_subscription_options(pstate, stmt->options,
supported_opts, &opts);
@@ -1272,6 +1291,19 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_subfailover - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_INACTIVE_TIMEOUT))
+ {
+ if (!sub->slotname)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for a subscription that does not have a slot name",
+ "inactive_timeout")));
+
+ values[Anum_pg_subscription_subinactivetimeout - 1] =
+ Int32GetDatum(opts.inactivetimeout);
+ replaces[Anum_pg_subscription_subinactivetimeout - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_ORIGIN))
{
values[Anum_pg_subscription_suborigin - 1] =
@@ -1550,6 +1582,35 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
PG_END_TRY();
}
+ if (replaces[Anum_pg_subscription_subinactivetimeout - 1])
+ {
+ bool must_use_password;
+ char *err;
+ WalReceiverConn *wrconn;
+
+ /* Load the library providing us libpq calls. */
+ load_file("libpqwalreceiver", false);
+
+ /* Try to connect to the publisher. */
+ must_use_password = sub->passwordrequired && !sub->ownersuperuser;
+ wrconn = walrcv_connect(sub->conninfo, true, true, must_use_password,
+ sub->name, &err);
+ if (!wrconn)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONNECTION_FAILURE),
+ errmsg("could not connect to the publisher: %s", err)));
+
+ PG_TRY();
+ {
+ walrcv_alter_slot(wrconn, sub->slotname, NULL, &opts.inactivetimeout);
+ }
+ PG_FINALLY();
+ {
+ walrcv_disconnect(wrconn);
+ }
+ PG_END_TRY();
+ }
+
table_close(rel, RowExclusiveLock);
ObjectAddressSet(myself, SubscriptionRelationId, subid);
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 59f8e5fbaa..c660f1e65e 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1431,7 +1431,7 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
walrcv_create_slot(LogRepWorkerWalRcvConn,
slotname, false /* permanent */ , false /* two_phase */ ,
MySubscription->failover,
- 0,
+ MySubscription->inactivetimeout,
CRS_USE_SNAPSHOT, origin_startpos);
/*
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index d275b31605..a31d5a213f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -4673,6 +4673,7 @@ getSubscriptions(Archive *fout)
int i_suboriginremotelsn;
int i_subenabled;
int i_subfailover;
+ int i_subinactivetimeout;
int i,
ntups;
@@ -4739,11 +4740,13 @@ getSubscriptions(Archive *fout)
if (dopt->binary_upgrade && fout->remoteVersion >= 170000)
appendPQExpBufferStr(query, " o.remote_lsn AS suboriginremotelsn,\n"
" s.subenabled,\n"
- " s.subfailover\n");
+ " s.subfailover,\n"
+ " s.subinactivetimeout\n");
else
appendPQExpBufferStr(query, " NULL AS suboriginremotelsn,\n"
" false AS subenabled,\n"
- " false AS subfailover\n");
+ " false AS subfailover,\n"
+ " 0 AS subinactivetimeout\n");
appendPQExpBufferStr(query,
"FROM pg_subscription s\n");
@@ -4783,6 +4786,7 @@ getSubscriptions(Archive *fout)
i_suboriginremotelsn = PQfnumber(res, "suboriginremotelsn");
i_subenabled = PQfnumber(res, "subenabled");
i_subfailover = PQfnumber(res, "subfailover");
+ i_subinactivetimeout = PQfnumber(res, "subinactivetimeout");
subinfo = pg_malloc(ntups * sizeof(SubscriptionInfo));
@@ -4829,6 +4833,8 @@ getSubscriptions(Archive *fout)
pg_strdup(PQgetvalue(res, i, i_subenabled));
subinfo[i].subfailover =
pg_strdup(PQgetvalue(res, i, i_subfailover));
+ subinfo[i].subinactivetimeout =
+ atoi(PQgetvalue(res, i, i_subinactivetimeout));
/* Decide whether we want to dump it */
selectDumpableObject(&(subinfo[i].dobj), fout);
@@ -5110,6 +5116,18 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s SET(failover = true);\n", qsubname);
}
+ if (subinfo->subinactivetimeout > 0)
+ {
+ /*
+ * Preserve subscription's inactive_timeout option to be able to
+ * use it after the upgrade.
+ */
+ appendPQExpBufferStr(query,
+ "\n-- For binary upgrade, must preserve the subscriber's inactive_timeout option.\n");
+ appendPQExpBuffer(query, "ALTER SUBSCRIPTION %s SET(inactive_timeout = %d);\n",
+ qsubname, subinfo->subinactivetimeout);
+ }
+
if (strcmp(subinfo->subenabled, "t") == 0)
{
/*
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 9bc93520b4..bfaedcd7e2 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -667,6 +667,7 @@ typedef struct _SubscriptionInfo
char *suborigin;
char *suboriginremotelsn;
char *subfailover;
+ int subinactivetimeout;
} SubscriptionInfo;
/*
diff --git a/src/bin/pg_upgrade/t/003_logical_slots.pl b/src/bin/pg_upgrade/t/003_logical_slots.pl
index 83d71c3084..8aa34d66cc 100644
--- a/src/bin/pg_upgrade/t/003_logical_slots.pl
+++ b/src/bin/pg_upgrade/t/003_logical_slots.pl
@@ -172,7 +172,7 @@ $sub->start;
$sub->safe_psql(
'postgres', qq[
CREATE TABLE tbl (a int);
- CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (two_phase = 'true', failover = 'true')
+ CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (two_phase = 'true', failover = 'true', inactive_timeout = 3600)
]);
$sub->wait_for_subscription_sync($oldpub, 'regress_sub');
@@ -192,8 +192,8 @@ command_ok([@pg_upgrade_cmd], 'run of pg_upgrade of old cluster');
# Check that the slot 'regress_sub' has migrated to the new cluster
$newpub->start;
my $result = $newpub->safe_psql('postgres',
- "SELECT slot_name, two_phase, failover FROM pg_replication_slots");
-is($result, qq(regress_sub|t|t), 'check the slot exists on new cluster');
+ "SELECT slot_name, two_phase, failover, inactive_timeout = 3600 FROM pg_replication_slots");
+is($result, qq(regress_sub|t|t|t), 'check the slot exists on new cluster');
# Update the connection
my $new_connstr = $newpub->connstr . ' dbname=postgres';
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 6433497bcd..73fcfa421d 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6581,7 +6581,7 @@ describeSubscriptions(const char *pattern, bool verbose)
printQueryOpt myopt = pset.popt;
static const bool translate_columns[] = {false, false, false, false,
false, false, false, false, false, false, false, false, false, false,
- false};
+ false, false};
if (pset.sversion < 100000)
{
@@ -6650,6 +6650,11 @@ describeSubscriptions(const char *pattern, bool verbose)
", subfailover AS \"%s\"\n",
gettext_noop("Failover"));
+ if (pset.sversion >= 170000)
+ appendPQExpBuffer(&buf,
+ ", subinactivetimeout AS \"%s\"\n",
+ gettext_noop("Inactive timeout"));
+
appendPQExpBuffer(&buf,
", subsynccommit AS \"%s\"\n"
", subconninfo AS \"%s\"\n",
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 56d723de8a..bf7349bae1 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1946,9 +1946,10 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH("(", "PUBLICATION");
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
- COMPLETE_WITH("binary", "disable_on_error", "failover", "origin",
- "password_required", "run_as_owner", "slot_name",
- "streaming", "synchronous_commit");
+ COMPLETE_WITH("binary", "disable_on_error", "failover",
+ "inactive_timeout", "origin", "password_required",
+ "run_as_owner", "slot_name", "streaming",
+ "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
COMPLETE_WITH("lsn");
@@ -3344,9 +3345,10 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
- "disable_on_error", "enabled", "failover", "origin",
- "password_required", "run_as_owner", "slot_name",
- "streaming", "synchronous_commit", "two_phase");
+ "disable_on_error", "enabled", "failover",
+ "inactive_timeout", "origin", "password_required",
+ "run_as_owner", "slot_name", "streaming",
+ "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 0aa14ec4a2..1113cdf690 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -98,6 +98,11 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
* slots) in the upstream database are enabled
* to be synchronized to the standbys. */
+ int32 subinactivetimeout; /* Associated replication slots (i.e. the
+ * main slot and the table sync slots) in
+ * the upstream database are allowed to be
+ * inactive for this long, in seconds. */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -151,6 +156,10 @@ typedef struct Subscription
* (i.e. the main slot and the table sync
* slots) in the upstream database are enabled
* to be synchronized to the standbys. */
+ int32 inactivetimeout; /* Associated replication slots (i.e. the
+ * main slot and the table sync slots) in
+ * the upstream database are allowed to be
+ * inactive for this long, in seconds. */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out
index 1eee6b17b8..5d5dc1561d 100644
--- a/src/test/regress/expected/subscription.out
+++ b/src/test/regress/expected/subscription.out
@@ -118,18 +118,18 @@ CREATE SUBSCRIPTION regress_testsub4 CONNECTION 'dbname=regress_doesnotexist' PU
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | none | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub4 SET (origin = any);
\dRs+ regress_testsub4
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
-------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub4 | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub3;
@@ -147,10 +147,10 @@ ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar';
ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET PUBLICATION testpub2, testpub3 WITH (refresh = false);
@@ -159,10 +159,10 @@ ALTER SUBSCRIPTION regress_testsub SET (slot_name = 'newname');
ALTER SUBSCRIPTION regress_testsub SET (password_required = false);
ALTER SUBSCRIPTION regress_testsub SET (run_as_owner = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | t | f | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | f | t | f | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (password_required = true);
@@ -178,10 +178,10 @@ ERROR: unrecognized subscription parameter: "create_slot"
-- ok
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/12345');
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist2 | 0/12345
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist2 | 0/12345
(1 row)
-- ok - with lsn = NONE
@@ -190,10 +190,10 @@ ALTER SUBSCRIPTION regress_testsub SKIP (lsn = NONE);
ALTER SUBSCRIPTION regress_testsub SKIP (lsn = '0/0');
ERROR: invalid WAL location (LSN): 0/0
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+------------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+------------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist2 | 0/0
(1 row)
BEGIN;
@@ -225,10 +225,10 @@ ALTER SUBSCRIPTION regress_testsub_foo SET (synchronous_commit = foobar);
ERROR: invalid value for parameter "synchronous_commit": "foobar"
HINT: Available values: local, remote_write, remote_apply, on, off.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
----------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+------------------------------+----------
- regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | f | local | dbname=regress_doesnotexist2 | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+---------------------+---------------------------+---------+---------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+------------------------------+----------
+ regress_testsub_foo | regress_subscription_user | f | {testpub2,testpub3} | f | off | d | f | any | t | f | f | 0 | local | dbname=regress_doesnotexist2 | 0/0
(1 row)
-- rename back to keep the rest simple
@@ -257,19 +257,19 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | t | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (binary = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -281,27 +281,27 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = parallel);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (streaming = false);
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication already exists
@@ -316,10 +316,10 @@ ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refr
ALTER SUBSCRIPTION regress_testsub ADD PUBLICATION testpub1, testpub2 WITH (refresh = false);
ERROR: publication "testpub1" is already in subscription "regress_testsub"
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-----------------------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub,testpub1,testpub2} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
-- fail - publication used more than once
@@ -334,10 +334,10 @@ ERROR: publication "testpub3" is not in subscription "regress_testsub"
-- ok - delete publications
ALTER SUBSCRIPTION regress_testsub DROP PUBLICATION testpub1, testpub2 WITH (refresh = false);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
DROP SUBSCRIPTION regress_testsub;
@@ -373,10 +373,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | p | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
--fail - alter of two_phase option not supported.
@@ -385,10 +385,10 @@ ERROR: unrecognized subscription parameter: "two_phase"
-- but can alter streaming when two_phase enabled
ALTER SUBSCRIPTION regress_testsub SET (streaming = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -398,10 +398,10 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | on | p | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
@@ -414,18 +414,18 @@ CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUB
WARNING: subscription was created, but is not connected
HINT: To initiate replication, you must manually create the replication slot, enable the subscription, and refresh the subscription.
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | f | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (disable_on_error = true);
\dRs+
- List of subscriptions
- Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Synchronous commit | Conninfo | Skip LSN
------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+-----------------------------+----------
- regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | t | f | f | off | dbname=regress_doesnotexist | 0/0
+ List of subscriptions
+ Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Inactive timeout | Synchronous commit | Conninfo | Skip LSN
+-----------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+------------------+--------------------+-----------------------------+----------
+ regress_testsub | regress_subscription_user | f | {testpub} | f | off | d | t | any | t | f | f | 0 | off | dbname=regress_doesnotexist | 0/0
(1 row)
ALTER SUBSCRIPTION regress_testsub SET (slot_name = NONE);
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 9ccebd890a..e6fd2297d2 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -531,6 +531,69 @@ $node_publisher->poll_query_until('postgres',
or die
"Timed out while waiting for apply to restart after renaming SUBSCRIPTION";
+# Check inactive_timeout options with subscriptions
+#
+# Setup logical replication that will only be used for this test
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_it FOR ALL TABLES;");
+
+# Create subscription with inactive_timeout set
+my $inactive_timeout = 300;
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub_it1 CONNECTION '$publisher_connstr' PUBLICATION tap_pub_it WITH (inactive_timeout = $inactive_timeout, enabled = false);"
+);
+
+$result = $node_publisher->safe_psql('postgres',
+ "SELECT inactive_timeout FROM pg_replication_slots WHERE slot_name = 'tap_sub_it1';"
+);
+is($result, $inactive_timeout,
+ "check inactive_timeout passed via create subscription is set on slot created on publisher"
+);
+
+$result = $node_subscriber->safe_psql('postgres',
+ "SELECT subinactivetimeout FROM pg_subscription WHERE subname = 'tap_sub_it1';"
+);
+is($result, $inactive_timeout,
+ "check inactive_timeout passed via create subscription is set on pg_subscription on subscriber"
+);
+
+# Alter subscription with inactive_timeout set to a different value
+$inactive_timeout = 600;
+$node_subscriber->safe_psql('postgres',
+ "ALTER SUBSCRIPTION tap_sub_it1 SET (inactive_timeout = $inactive_timeout);"
+);
+
+$result = $node_publisher->safe_psql('postgres',
+ "SELECT inactive_timeout FROM pg_replication_slots WHERE slot_name = 'tap_sub_it1';"
+);
+is($result, $inactive_timeout,
+ "check inactive_timeout passed via alter subscription is set on slot created on publisher"
+);
+
+# Create a subscription using a pre-existing slot, specifying inactive_timeout.
+# This is equivalent to altering the subscription to change the inactive_timeout.
+$inactive_timeout = 900;
+$node_publisher->safe_psql('postgres',
+ "SELECT * FROM pg_create_logical_replication_slot('slot_sub_it2', 'pgoutput');"
+);
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub_it2 CONNECTION '$publisher_connstr' PUBLICATION tap_pub_it WITH (create_slot = false, slot_name = 'slot_sub_it2', inactive_timeout = $inactive_timeout);"
+);
+
+$result = $node_publisher->safe_psql('postgres',
+ "SELECT inactive_timeout FROM pg_replication_slots WHERE slot_name = 'slot_sub_it2';"
+);
+is($result, $inactive_timeout,
+ "check inactive_timeout passed via create subscription with slot_name is set on slot created on publisher"
+);
+
+# Drop subscriptions as we don't need them anymore
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_it1");
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_it2");
+
+# Drop publication as we don't need it anymore
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_it");
+
# check all the cleanup
$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_renamed");
--
2.34.1
Attachment: v13-0006-Add-inactive_timeout-based-replication-slot-inva.patch
From 89fd621fb800bb2300c16567742906001fca7158 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 20 Mar 2024 23:04:24 +0000
Subject: [PATCH v13 6/6] Add inactive_timeout based replication slot
invalidation
---
doc/src/sgml/func.sgml | 12 +-
doc/src/sgml/ref/create_subscription.sgml | 4 +-
doc/src/sgml/system-views.sgml | 10 +-
.../replication/logical/logicalfuncs.c | 4 +-
src/backend/replication/logical/slotsync.c | 8 +-
src/backend/replication/slot.c | 205 +++++++++++++++++-
src/backend/replication/slotfuncs.c | 27 ++-
src/backend/replication/walsender.c | 12 +-
src/backend/tcop/postgres.c | 2 +-
src/backend/utils/adt/pg_upgrade_support.c | 4 +-
src/include/replication/slot.h | 11 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 168 ++++++++++++++
13 files changed, 425 insertions(+), 43 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 9467ff86b3..af5ac09a4b 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28183,8 +28183,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
released upon any error. The optional fourth
parameter, <parameter>inactive_timeout</parameter>, when set to a
non-zero value, specifies the amount of time in seconds the slot is
- allowed to be inactive. This function corresponds to the replication
- protocol command
+ allowed to be inactive before getting invalidated.
+ This function corresponds to the replication protocol command
<literal>CREATE_REPLICATION_SLOT ... PHYSICAL</literal>.
</para></entry>
</row>
@@ -28229,12 +28229,12 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<parameter>failover</parameter>, when set to true,
specifies that this slot is enabled to be synced to the
standbys so that logical replication can be resumed after
- failover. The optional sixth parameter,
+ failover. The optional sixth parameter,
<parameter>inactive_timeout</parameter>, when set to a
non-zero value, specifies the amount of time in seconds the slot is
- allowed to be inactive. A call to this function has the same effect as
- the replication protocol command
- <literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
+ allowed to be inactive before getting invalidated.
+ A call to this function has the same effect as the replication protocol
+ command <literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
</para></entry>
</row>
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 7be4610921..472592c750 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -421,8 +421,8 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<para>
When set to a non-zero value, specifies the amount of time in seconds
the associated replication slots (i.e. the main slot and the table
- sync slots) in the upstream database are allowed to be inactive.
- The default is <literal>0</literal>.
+ sync slots) in the upstream database are allowed to be inactive before
+ getting invalidated. The default is <literal>0</literal>.
</para>
</listitem>
</varlistentry>
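On the subscriber side the same option is exposed through CREATE/ALTER SUBSCRIPTION. For example (the connection string, publication name and 3600-second value below are only illustrative):

    CREATE SUBSCRIPTION mysub
        CONNECTION 'host=primary dbname=postgres'
        PUBLICATION mypub
        WITH (inactive_timeout = 3600);

    ALTER SUBSCRIPTION mysub SET (inactive_timeout = 7200);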
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index ec60c43038..d063e989a6 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2588,6 +2588,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for longer than the duration specified by the slot's
+ <literal>inactive_timeout</literal> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
@@ -2756,7 +2763,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<structfield>inactive_timeout</structfield> <type>integer</type>
</para>
<para>
- The amount of time in seconds the slot is allowed to be inactive.
+ The amount of time in seconds the slot is allowed to be inactive before
+ getting invalidated.
</para></entry>
</row>
</tbody>
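With these view changes in place, a query along the following lines can show which slots were invalidated for this reason. (The invalidation-cause column comes from an earlier patch in this series; I'm assuming here it is named invalidation_reason.)

    SELECT slot_name, inactive_timeout, invalidation_reason
    FROM pg_replication_slots
    WHERE invalidation_reason = 'inactive_timeout';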
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..53cf8bbd42 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
@@ -309,7 +309,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
/* free context, call shutdown callback */
FreeDecodingContext(ctx);
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
InvalidateSystemCaches();
}
PG_CATCH();
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index c01876ceeb..5aba117e2b 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -319,7 +319,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -529,7 +529,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
* if the slot is not acquired by other processes.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -554,7 +554,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
/* Skip the sync of an invalidated slot */
if (slot->data.invalidated != RS_INVAL_NONE)
{
- ReplicationSlotRelease();
+ ReplicationSlotRelease(false);
return slot_updated;
}
@@ -640,7 +640,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
slot_updated = true;
}
- ReplicationSlotRelease();
+ ReplicationSlotRelease(false);
return slot_updated;
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index aba5e981d7..d007ed126f 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -158,6 +159,9 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidateSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_control_lock,
+ bool need_mutex);
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
@@ -234,7 +238,7 @@ ReplicationSlotShmemExit(int code, Datum arg)
{
/* Make sure active replication slots are released */
if (MyReplicationSlot != NULL)
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
/* Also cleanup all the temporary slots. */
ReplicationSlotCleanup();
@@ -551,9 +555,14 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * If check_for_invalidation is true, the slot is checked for invalidation
+ * based on its inactive_timeout parameter, and an error is raised (after
+ * making the slot ours) if it has been invalidated.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation)
{
ReplicationSlot *s;
int active_pid;
@@ -631,6 +640,42 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * Check if the given slot can be invalidated based on its
+ * inactive_timeout parameter. If yes, persist the invalidated state to
+ * disk and then error out. We do this only after making the slot ours to
+ * avoid anyone else acquiring it while we check for its invalidation.
+ */
+ if (check_for_invalidation)
+ {
+ /* The slot is ours by now */
+ Assert(s->active_pid == MyProcPid);
+
+ /*
+ * The slot is not truly ours yet: the invalidation check below must
+ * run first, and it expects the slot to appear inactive.
+ */
+ s->active_pid = 0;
+ if (InvalidateReplicationSlotForInactiveTimeout(s, true, true, true))
+ {
+ /*
+ * If the slot has been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+
+ /* Might need it for slot clean up on error, so restore it */
+ s->active_pid = MyProcPid;
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot acquire invalidated replication slot \"%s\"",
+ NameStr(MyReplicationSlot->data.name)),
+ errdetail("This slot has been invalidated because of its inactive_timeout parameter.")));
+ }
+ s->active_pid = MyProcPid;
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -664,7 +709,7 @@ retry:
* Resources this slot requires will be preserved.
*/
void
-ReplicationSlotRelease(void)
+ReplicationSlotRelease(bool set_last_inactive_at)
{
ReplicationSlot *slot = MyReplicationSlot;
char *slotname = NULL; /* keep compiler quiet */
@@ -715,11 +760,20 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
- if (slot->data.persistency == RS_PERSISTENT)
+ if (set_last_inactive_at &&
+ slot->data.persistency == RS_PERSISTENT)
{
- SpinLockAcquire(&slot->mutex);
- slot->last_inactive_at = GetCurrentTimestamp();
- SpinLockRelease(&slot->mutex);
+ /*
+ * There's no point in allowing failover slots to get invalidated
+ * based on the slot's inactive_timeout parameter on a standby; such
+ * slots are simply synced from the primary.
+ */
+ if (!(RecoveryInProgress() && slot->data.failover))
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->last_inactive_at = GetCurrentTimestamp();
+ SpinLockRelease(&slot->mutex);
+ }
}
MyReplicationSlot = NULL;
@@ -789,7 +843,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -814,7 +868,7 @@ ReplicationSlotAlter(const char *name, bool failover, int inactive_timeout)
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -890,7 +944,7 @@ ReplicationSlotAlter(const char *name, bool failover, int inactive_timeout)
ReplicationSlotSave();
}
- ReplicationSlotRelease();
+ ReplicationSlotRelease(false);
}
/*
@@ -1542,6 +1596,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by the slot's inactive_timeout parameter."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1655,6 +1712,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (InvalidateReplicationSlotForInactiveTimeout(s, false, false, false))
+ invalidation_cause = cause;
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1781,7 +1842,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/* Make sure the invalidated state persists across server restart */
ReplicationSlotMarkDirty();
ReplicationSlotSave();
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
@@ -1808,6 +1869,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: the slot's inactive_timeout has elapsed
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1859,6 +1921,110 @@ restart:
return invalidated;
}
+/*
+ * Invalidate the given slot based on its inactive_timeout parameter.
+ *
+ * Returns true if the slot was invalidated.
+ *
+ * NB - this function also runs as part of checkpoint, so avoid raising errors
+ * if possible.
+ */
+bool
+InvalidateReplicationSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_control_lock,
+ bool need_mutex,
+ bool persist_state)
+{
+ if (!InvalidateSlotForInactiveTimeout(slot, need_control_lock, need_mutex))
+ return false;
+
+ Assert(slot->active_pid == 0);
+
+ SpinLockAcquire(&slot->mutex);
+ slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT;
+
+ /* Make sure the invalidated state persists across server restart */
+ slot->just_dirtied = true;
+ slot->dirty = true;
+ SpinLockRelease(&slot->mutex);
+
+ if (persist_state)
+ {
+ char path[MAXPGPATH];
+
+ sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+ SaveSlotToPath(slot, path, ERROR);
+ }
+
+ ReportSlotInvalidation(RS_INVAL_INACTIVE_TIMEOUT, false, 0,
+ slot->data.name, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, InvalidTransactionId);
+
+ return true;
+}
+
+/*
+ * Helper for InvalidateReplicationSlotForInactiveTimeout
+ */
+static bool
+InvalidateSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_control_lock,
+ bool need_mutex)
+{
+ ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+
+ if (slot->last_inactive_at == 0 ||
+ slot->data.inactive_timeout == 0)
+ return false;
+
+ /* inactive_timeout is only tracked for permanent slots */
+ if (slot->data.persistency != RS_PERSISTENT)
+ return false;
+
+ /*
+ * There's no point in allowing failover slots to get invalidated based on
+ * the slot's inactive_timeout parameter on a standby; such slots are
+ * simply synced from the primary.
+ */
+ if (RecoveryInProgress() && slot->data.failover)
+ return false;
+
+ if (need_control_lock)
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
+
+ /*
+ * Check if the slot needs to be invalidated due to inactive_timeout. We
+ * do this with the spinlock held to avoid race conditions -- for example
+ * the restart_lsn could move forward, or the slot could be dropped.
+ */
+ if (need_mutex)
+ SpinLockAcquire(&slot->mutex);
+
+ if (slot->last_inactive_at > 0 &&
+ slot->data.inactive_timeout > 0)
+ {
+ TimestampTz now;
+
+ /* last_inactive_at is only tracked for inactive slots */
+ Assert(slot->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(slot->last_inactive_at, now,
+ slot->data.inactive_timeout * 1000))
+ invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
+ }
+
+ if (need_mutex)
+ SpinLockRelease(&slot->mutex);
+
+ if (need_control_lock)
+ LWLockRelease(ReplicationSlotControlLock);
+
+ return (invalidation_cause == RS_INVAL_INACTIVE_TIMEOUT);
+}
+
/*
* Flush all replication slots to disk.
*
@@ -1871,6 +2037,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated = false;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -1894,6 +2061,13 @@ CheckPointReplicationSlots(bool is_shutdown)
/* save the slot to disk, locking is handled in SaveSlotToPath() */
sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
+ /*
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
+ */
+ if (InvalidateReplicationSlotForInactiveTimeout(s, true, true, false))
+ invalidated = true;
+
/*
* Slot's data is not flushed each time the confirmed_flush LSN is
* updated as that could lead to frequent writes. However, we decide
@@ -1920,6 +2094,13 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ /* If the slot has been invalidated, recalculate the resource limits */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
}
/*
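To relate the checkpoint-time check above to what a user can observe, the same condition can roughly be expressed as a query. (This assumes the last_inactive_at timestamp tracked by the earlier patch in this series is exposed by pg_replication_slots under that name; adjust the column name if it differs.)

    SELECT slot_name,
           now() - last_inactive_at AS idle_for,
           inactive_timeout
    FROM pg_replication_slots
    WHERE NOT active
      AND inactive_timeout > 0
      AND last_inactive_at IS NOT NULL
      AND now() - last_inactive_at > make_interval(secs => inactive_timeout);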
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 55ff73cc78..60055a6bb4 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -111,7 +111,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
tuple = heap_form_tuple(tupdesc, values, nulls);
result = HeapTupleGetDatum(tuple);
- ReplicationSlotRelease();
+ ReplicationSlotRelease(false);
PG_RETURN_DATUM(result);
}
@@ -224,7 +224,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
/* ok, slot is now fully created, mark it as persistent if needed */
if (!temporary)
ReplicationSlotPersist();
- ReplicationSlotRelease();
+ ReplicationSlotRelease(false);
PG_RETURN_DATUM(result);
}
@@ -258,6 +258,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
+ bool invalidated = false;
/*
* We don't require any special permission to see this function's data
@@ -288,6 +289,13 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
slot_contents = *slot;
SpinLockRelease(&slot->mutex);
+ /*
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
+ */
+ if (InvalidateReplicationSlotForInactiveTimeout(slot, false, true, true))
+ invalidated = true;
+
memset(values, 0, sizeof(values));
memset(nulls, 0, sizeof(nulls));
@@ -466,6 +474,15 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
LWLockRelease(ReplicationSlotControlLock);
+ /*
+ * If the slot has been invalidated, recalculate the resource limits
+ */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
+
return (Datum) 0;
}
@@ -668,7 +685,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
@@ -711,7 +728,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
ReplicationSlotsComputeRequiredXmin(false);
ReplicationSlotsComputeRequiredLSN();
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
/* Return the reached position. */
values[1] = LSNGetDatum(endlsn);
@@ -955,7 +972,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
tuple = heap_form_tuple(tupdesc, values, nulls);
result = HeapTupleGetDatum(tuple);
- ReplicationSlotRelease();
+ ReplicationSlotRelease(false);
PG_RETURN_DATUM(result);
}
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 0420274247..b6795048cc 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -334,7 +334,7 @@ WalSndErrorCleanup(void)
wal_segment_close(xlogreader);
if (MyReplicationSlot != NULL)
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
ReplicationSlotCleanup();
@@ -846,7 +846,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -992,7 +992,7 @@ StartReplication(StartReplicationCmd *cmd)
}
if (cmd->slotname)
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
/*
* Copy is finished now. Send a single-row result set indicating the next
@@ -1407,7 +1407,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
do_tup_output(tstate, values, nulls);
end_tup_output(tstate);
- ReplicationSlotRelease();
+ ReplicationSlotRelease(false);
}
/*
@@ -1483,7 +1483,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
@@ -1545,7 +1545,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
WalSndLoop(XLogSendLogical);
FreeDecodingContext(logical_decoding_ctx);
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
replication_active = false;
if (got_STOPPING)
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index fd4199a098..749de2741e 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4407,7 +4407,7 @@ PostgresMain(const char *dbname, const char *username)
* callback ensuring correct cleanup on FATAL errors.
*/
if (MyReplicationSlot != NULL)
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
/* We also want to cleanup temporary slots on error. */
ReplicationSlotCleanup();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..d56ecf4137 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
@@ -310,7 +310,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
found_pending_wal = LogicalReplicationSlotHasPendingWal(end_of_wal);
/* Clean up */
- ReplicationSlotRelease();
+ ReplicationSlotRelease(false);
PG_RETURN_BOOL(!found_pending_wal);
}
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 77def17386..6a91e8a63f 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -249,8 +251,9 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, bool failover,
int inactive_timeout);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
-extern void ReplicationSlotRelease(void);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation);
+extern void ReplicationSlotRelease(bool set_last_inactive_at);
extern void ReplicationSlotCleanup(void);
extern void ReplicationSlotSave(void);
extern void ReplicationSlotMarkDirty(void);
@@ -268,6 +271,10 @@ extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
+extern bool InvalidateReplicationSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_control_lock,
+ bool need_mutex,
+ bool persist_state);
extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock);
extern int ReplicationSlotIndex(ReplicationSlot *slot);
extern bool ReplicationSlotName(int index, Name name);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..d046e1d5d7
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,168 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot due to inactive_timeout
+#
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoint during the test, otherwise, the test can get unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+$standby1->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+});
+
+# Set timeout so that the slot when inactive will get invalidated after the
+# timeout.
+my $inactive_timeout = 5;
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot', inactive_timeout := $inactive_timeout);
+]);
+
+$standby1->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# The inactive replication slot info should be null when the slot is active
+my $result = $primary->safe_psql(
+ 'postgres', qq[
+ SELECT last_inactive_at IS NULL, inactive_timeout = $inactive_timeout
+ FROM pg_replication_slots WHERE slot_name = 'sb1_slot';
+]);
+is($result, "t|t",
+ 'check the inactive replication slot info for an active slot');
+
+my $logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby1->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_at IS NOT NULL
+ AND slot_name = 'sb1_slot'
+ AND inactive_timeout = $inactive_timeout;
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+my $invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $primary->safe_psql('postgres', "CHECKPOINT");
+ if ($primary->log_contains(
+ 'invalidating obsolete replication slot "sb1_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot sb1_slot invalidation has been logged');
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for inactive replication slot sb1_slot to be invalidated";
+
+# Testcase end: Invalidate streaming standby's slot due to inactive_timeout
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to inactive_timeout
+my $publisher = $primary;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$subscriber->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', inactive_timeout = $inactive_timeout)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+$result = $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the inactive replication slot info to be updated
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_at IS NOT NULL
+ AND slot_name = 'lsub1_slot'
+ AND inactive_timeout = $inactive_timeout;
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+$invalidated = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+ $publisher->safe_psql('postgres', "CHECKPOINT");
+ if ($publisher->log_contains(
+ 'invalidating obsolete replication slot "lsub1_slot"', $logstart))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+}
+ok($invalidated, 'check that slot lsub1_slot invalidation has been logged');
+
+# Testcase end: Invalidate logical subscriber's slot due to inactive_timeout
+# =============================================================================
+
+done_testing();
--
2.34.1
On Wed, Mar 20, 2024 at 7:08 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Regarding v12-0004: "Allow setting inactive_timeout in the replication command",
shouldn't we also add an new SQL API say: pg_alter_replication_slot() that would
allow to change the timeout property?That would allow users to alter this property without the need to make a
replication connection.
+1 to add a new SQL function pg_alter_replication_slot(). It lets one
first create the slots and then later decide the appropriate
inactive_timeout. It might grow into altering other slot parameters
such as failover (I'm not sure if altering the failover property on the
primary after a while makes the slot the right candidate for syncing on
the standby). Perhaps we can add it for altering just inactive_timeout
for now and be done with it.
FWIW, ALTER_REPLICATION_SLOT was added keeping in mind just the
failover property for logical slots; that's why it emits the error
"cannot use ALTER_REPLICATION_SLOT with a physical replication slot".
But the issue is that it would make it inconsistent with the new inactivetimeout
in the subscription that is added in "v12-0005".
Can you please elaborate on the inconsistency it causes with inactivetimeout?
But do we need to display
subinactivetimeout in pg_subscription (and even allow it at subscription creation
/ alter) after all? (I have the feeling there is less of a need for it as compared
to subfailover or subtwophasestate, for example).
Maybe we don't need to. One can always trace down to the replication
slot associated with the subscription on the publisher, and get to
know what the slot's inactive_timeout setting is. However, having it in
pg_subscription saves one a trip to the publisher to learn the
inactive_timeout value for a subscription. Moreover, since we are allowing
inactive_timeout to be set via the CREATE/ALTER SUBSCRIPTION command,
I believe there's nothing wrong if it's also part of the
pg_subscription catalog.
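For reference, tracing it down without the catalog column would look roughly
like this; inactive_timeout in pg_replication_slots is the column proposed by
these patches, and the subscription/slot names are the ones from the test above:

-- on the subscriber: find the slot backing the subscription
SELECT subslotname FROM pg_subscription WHERE subname = 'sub';
-- on the publisher: check that slot's timeout setting
SELECT slot_name, inactive_timeout
FROM pg_replication_slots
WHERE slot_name = 'lsub1_slot';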
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote:
2. last_inactive_at and inactive_timeout are now tracked in on-disk
replication slot data structure.
Should last_inactive_at be tracked on disk? Say the engine is down for a period
of time > inactive_timeout then the slot will be invalidated after the engine
re-start (if no activity before we invalidate the slot). Should the time the
engine is down be counted as "inactive" time? I've the feeling it should not, and
that we should only take into account inactive time while the engine is up.
Good point. The question is how do we achieve this without persisting
the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot
had some valid value before we shut down but it still didn't cross the
configured 'inactive_timeout' value, so, we won't be able to
invalidate it. Now, after the restart, as we don't know the
last_inactive_at's value before the shutdown, we will initialize it
with 0 (this is what Bharath seems to have done in the latest
v13-0002* patch). After this, even if walsender or backend never
acquires the slot, we won't invalidate it. OTOH, if we track
'last_inactive_at' on the disk, after restart, we could initialize it
to the current time if the value is non-zero. Do you have any better
ideas?
--
With Regards,
Amit Kapila.
On Thu, Mar 21, 2024 at 5:19 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Mar 20, 2024 at 7:08 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Regarding v12-0004: "Allow setting inactive_timeout in the replication command",
shouldn't we also add a new SQL API, say pg_alter_replication_slot(), that would
allow changing the timeout property? That would allow users to alter this
property without the need to make a replication connection.
+1 to add a new SQL function pg_alter_replication_slot().
I also don't see any obvious problem with such an API. However, this
is not a good time to invent new APIs. Let's keep the feature simple
and then we can extend it in the next version after more discussion
and probably by that time we will get some feedback from the field as
well.
It helps
first create the slots and then later decide the appropriate
inactive_timeout. It might grow into altering other slot parameters
such as failover (I'm not sure if altering failover property on the
primary after a while makes it the right candidate for syncing on the
standby). Perhaps, we can add it for altering just inactive_timeout
for now and be done with it.
FWIW, ALTER_REPLICATION_SLOT was added keeping in mind just the
failover property for logical slots; that's why it emits the error
"cannot use ALTER_REPLICATION_SLOT with a physical replication slot".
But the issue is that it would make it inconsistent with the new inactivetimeout
in the subscription that is added in "v12-0005".
Can you please elaborate on the inconsistency it causes with inactivetimeout?
I think the inconsistency can arise from the fact that on the publisher
one can change the inactive_timeout for the slot corresponding to a
subscription but the subscriber won't know, so it will still show the
old value. If we want, we can document this as a limitation and let
users be aware of it. However, I feel at this stage, let's not even
expose this from the subscription, or maybe we can discuss it once/if
we are done with other patches. Anyway, if one wants to use this
feature with a subscription, she can create a slot first on the
publisher with an inactive_timeout value and then associate such a slot
with the required subscription.
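For example, a rough sketch of that workflow (standard syntax today; the
connection string is a placeholder, and the timeout itself would be set on the
publisher-side slot with whichever creation or alter parameter this proposal
ends up with):

-- on the publisher: create the slot up front
SELECT pg_create_logical_replication_slot('lsub1_slot', 'pgoutput');
-- on the subscriber: attach the subscription to the pre-created slot
CREATE SUBSCRIPTION sub CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub WITH (create_slot = false, slot_name = 'lsub1_slot');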
--
With Regards,
Amit Kapila.
On Thu, Mar 21, 2024 at 9:07 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I also don't see any obvious problem with such an API. However, this
is not a good time to invent new APIs. Let's keep the feature simple
and then we can extend it in the next version after more discussion
and probably by that time we will get some feedback from the field as
well.
I couldn't agree more.
But the issue is that it would make it inconsistent with the new inactivetimeout
in the subscription that is added in "v12-0005".Can you please elaborate what the inconsistency it causes with inactivetimeout?
I think the inconsistency can arise from the fact that on publisher
one can change the inactive_timeout for the slot corresponding to a
subscription but the subscriber won't know, so it will still show the
old value.
Understood.
If we want we can document this as a limitation and let
users be aware of it. However, I feel at this stage, let's not even
expose this from the subscription or maybe we can discuss it once/if
we are done with other patches. Anyway, if one wants to use this
feature with a subscription, she can create a slot first on the
publisher with inactive_timeout value and then associate such a slot
with a required subscription.
If we are not exposing it via subscription (meaning, we don't consider
v13-0004 and v13-0005 patches), I feel we can have a new SQL API
pg_alter_replication_slot(int inactive_timeout) for now just altering
the inactive_timeout of a given slot.
With this approach, one can do any of the following:
1) Create a slot with the SQL API with inactive_timeout set, and use it
for subscriptions or for streaming standbys.
2) Create a slot with the SQL API without inactive_timeout set, use it for
subscriptions or for streaming standbys, and set inactive_timeout
later via pg_alter_replication_slot() depending on how the slot is
consumed.
3) Create a subscription with create_slot=true, and set
inactive_timeout via pg_alter_replication_slot() depending on how the
slot is consumed.
This approach seems consistent and minimal to start with.
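Roughly, in SQL, the three options would look like this. The inactive_timeout
creation parameter is the one from the proposed patches, pg_alter_replication_slot()
is the yet-to-be-written function, and 'sb2_slot' is just an example name, so
the exact signatures here are assumptions:

-- 1) set the timeout at creation time
SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot',
                                            inactive_timeout := 3600);
-- 2) create without a timeout, decide later
SELECT pg_create_physical_replication_slot('sb2_slot');
SELECT pg_alter_replication_slot('sb2_slot', inactive_timeout := 3600);
-- 3) let CREATE SUBSCRIPTION create the slot (named after the subscription
-- by default), then alter it on the publisher
SELECT pg_alter_replication_slot('sub', inactive_timeout := 3600);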
If we agree on this, I'll drop both 0004 and 0005 that are allowing
inactive_timeout to be set via replication commands and via
create/alter subscription respectively, and implement
pg_alter_replication_slot().
FWIW, adding the new SQL API pg_alter_replication_slot() isn't that hard.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Mar 21, 2024 at 8:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote:
2. last_inactive_at and inactive_timeout are now tracked in on-disk
replication slot data structure.
Should last_inactive_at be tracked on disk? Say the engine is down for a period
of time > inactive_timeout then the slot will be invalidated after the engine
re-start (if no activity before we invalidate the slot). Should the time the
engine is down be counted as "inactive" time? I've the feeling it should not, and
that we should only take into account inactive time while the engine is up.
Good point. The question is how do we achieve this without persisting
the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot
had some valid value before we shut down but it still didn't cross the
configured 'inactive_timeout' value, so, we won't be able to
invalidate it. Now, after the restart, as we don't know the
last_inactive_at's value before the shutdown, we will initialize it
with 0 (this is what Bharath seems to have done in the latest
v13-0002* patch). After this, even if walsender or backend never
acquires the slot, we won't invalidate it. OTOH, if we track
'last_inactive_at' on the disk, after, restart, we could initialize it
to the current time if the value is non-zero. Do you have any better
ideas?
This sounds reasonable to me at least.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Thu, Mar 21, 2024 at 08:47:18AM +0530, Amit Kapila wrote:
On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote:
2. last_inactive_at and inactive_timeout are now tracked in on-disk
replication slot data structure.
Should last_inactive_at be tracked on disk? Say the engine is down for a period
of time > inactive_timeout then the slot will be invalidated after the engine
re-start (if no activity before we invalidate the slot). Should the time the
engine is down be counted as "inactive" time? I've the feeling it should not, and
that we should only take into account inactive time while the engine is up.
Good point. The question is how do we achieve this without persisting
the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot
had some valid value before we shut down but it still didn't cross the
configured 'inactive_timeout' value, so, we won't be able to
invalidate it. Now, after the restart, as we don't know the
last_inactive_at's value before the shutdown, we will initialize it
with 0 (this is what Bharath seems to have done in the latest
v13-0002* patch). After this, even if walsender or backend never
acquires the slot, we won't invalidate it. OTOH, if we track
'last_inactive_at' on the disk, after, restart, we could initialize it
to the current time if the value is non-zero. Do you have any better
ideas?
I think that setting last_inactive_at when we restart makes sense if the slot
has been active previously. I think the idea is that because it's holding
xmin/catalog_xmin, we don't want to prevent row removal for longer than the
timeout.
So what about relying on xmin/catalog_xmin instead, this way:
- For physical slots, if xmin is set, then set last_inactive_at to the current
time at restart (else zero).
- For logical slots, it's not the same, as catalog_xmin is set at slot
creation time. So what about setting last_inactive_at to the current time at
restart but also at creation time for logical slots? (Setting it to zero at
creation time (as we do in v13) does not look right, given the fact that it's
"already" holding a catalog_xmin).
That way, we'd ensure that we are not holding rows for longer than the timeout
and we don't need to persist last_inactive_at.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Thu, Mar 21, 2024 at 10:53:54AM +0530, Bharath Rupireddy wrote:
On Thu, Mar 21, 2024 at 9:07 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
But the issue is that it would make it inconsistent with the new inactivetimeout
in the subscription that is added in "v12-0005".Can you please elaborate what the inconsistency it causes with inactivetimeout?
I think the inconsistency can arise from the fact that on publisher
one can change the inactive_timeout for the slot corresponding to a
subscription but the subscriber won't know, so it will still show the
old value.
Yeah, that was what I had in mind.
If we want we can document this as a limitation and let
users be aware of it. However, I feel at this stage, let's not even
expose this from the subscription or maybe we can discuss it once/if
we are done with other patches.
I agree, it's important to expose it for things like "failover" but I think we
can get rid of it for the timeout one.
Anyway, if one wants to use this
feature with a subscription, she can create a slot first on the
publisher with inactive_timeout value and then associate such a slot
with a required subscription.
Right.
If we are not exposing it via subscription (meaning, we don't consider
v13-0004 and v13-0005 patches), I feel we can have a new SQL API
pg_alter_replication_slot(int inactive_timeout) for now just altering
the inactive_timeout of a given slot.
Agree, that seems more "natural" that going through a replication connection.
With this approach, one can do either of the following:
1) Create a slot with SQL API with inactive_timeout set, and use it
for subscriptions or for streaming standbys.
Yes.
2) Create a slot with SQL API without inactive_timeout set, use it for
subscriptions or for streaming standbys, and set inactive_timeout
later via pg_alter_replication_slot() depending on how the slot is
consumed
Yes.
3) Create a subscription with create_slot=true, and set
inactive_timeout via pg_alter_replication_slot() depending on how the
slot is consumed.
Yes.
We could also do the above 3 and alter the timeout with a replication
connection, but the SQL API seems more natural to me.
This approach seems consistent and minimal to start with.
If we agree on this, I'll drop both 0004 and 0005 that are allowing
inactive_timeout to be set via replication commands and via
create/alter subscription respectively, and implement
pg_alter_replication_slot().
+1 on this.
FWIW, adding the new SQL API pg_alter_replication_slot() isn't that hard.
Also, I think we should ensure that one could "only" alter the timeout property
for the time being (if not, that could lead to the subscription inconsistency
mentioned above).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Mar 21, 2024 at 11:23 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Thu, Mar 21, 2024 at 08:47:18AM +0530, Amit Kapila wrote:
On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote:
2. last_inactive_at and inactive_timeout are now tracked in on-disk
replication slot data structure.
Should last_inactive_at be tracked on disk? Say the engine is down for a period
of time > inactive_timeout then the slot will be invalidated after the engine
re-start (if no activity before we invalidate the slot). Should the time the
engine is down be counted as "inactive" time? I've the feeling it should not, and
that we should only take into account inactive time while the engine is up.
Good point. The question is how do we achieve this without persisting
the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot
had some valid value before we shut down but it still didn't cross the
configured 'inactive_timeout' value, so, we won't be able to
invalidate it. Now, after the restart, as we don't know the
last_inactive_at's value before the shutdown, we will initialize it
with 0 (this is what Bharath seems to have done in the latest
v13-0002* patch). After this, even if walsender or backend never
acquires the slot, we won't invalidate it. OTOH, if we track
'last_inactive_at' on the disk, after, restart, we could initialize it
to the current time if the value is non-zero. Do you have any better
ideas?
I think that setting last_inactive_at when we restart makes sense if the slot
has been active previously. I think the idea is because it's holding xmin/catalog_xmin
and that we don't want to prevent rows removal longer that the timeout.
So what about relying on xmin/catalog_xmin instead that way?
That doesn't sound like a great idea because xmin/catalog_xmin values
won't tell us before restart whether the slot was active or not. It could
have been inactive for a long time before restart but the xmin values
could still be valid. What if we always set 'last_inactive_at' at
restart (if the slot's inactive_timeout has a non-zero value) and reset
it as soon as someone acquires that slot? Then, if the slot doesn't get
acquired until 'inactive_timeout' elapses, the checkpointer will invalidate
the slot.
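Under that scheme, the checkpoint-time decision (or an external monitor) boils
down to something like the following query, using the last_inactive_at and
inactive_timeout columns proposed in this thread (inactive_timeout being in
seconds):

-- slots that have been idle longer than their configured timeout
SELECT slot_name, now() - last_inactive_at AS inactive_for, inactive_timeout
FROM pg_replication_slots
WHERE last_inactive_at IS NOT NULL
  AND now() - last_inactive_at > make_interval(secs => inactive_timeout);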
--
With Regards,
Amit Kapila.
On Thu, Mar 21, 2024 at 11:37 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Thu, Mar 21, 2024 at 10:53:54AM +0530, Bharath Rupireddy wrote:
On Thu, Mar 21, 2024 at 9:07 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
But the issue is that it would make it inconsistent with the new inactivetimeout
in the subscription that is added in "v12-0005".Can you please elaborate what the inconsistency it causes with inactivetimeout?
I think the inconsistency can arise from the fact that on publisher
one can change the inactive_timeout for the slot corresponding to a
subscription but the subscriber won't know, so it will still show the
old value.
Yeah, that was what I had in mind.
If we want we can document this as a limitation and let
users be aware of it. However, I feel at this stage, let's not even
expose this from the subscription or maybe we can discuss it once/if
we are done with other patches.
I agree, it's important to expose it for things like "failover" but I think we
can get rid of it for the timeout one.
Anyway, if one wants to use this
feature with a subscription, she can create a slot first on the
publisher with inactive_timeout value and then associate such a slot
with a required subscription.
Right.
If we are not exposing it via subscription (meaning, we don't consider
v13-0004 and v13-0005 patches), I feel we can have a new SQL API
pg_alter_replication_slot(int inactive_timeout) for now just altering
the inactive_timeout of a given slot.
Agree, that seems more "natural" than going through a replication connection.
With this approach, one can do either of the following:
1) Create a slot with SQL API with inactive_timeout set, and use it
for subscriptions or for streaming standbys.
Yes.
2) Create a slot with SQL API without inactive_timeout set, use it for
subscriptions or for streaming standbys, and set inactive_timeout
later via pg_alter_replication_slot() depending on how the slot is
consumed.
Yes.
3) Create a subscription with create_slot=true, and set
inactive_timeout via pg_alter_replication_slot() depending on how the
slot is consumed.
Yes.
We could also do the above 3 and altering the timeout with a replication
connection but the SQL API seems more natural to me.
If we want to go with this, then I think we should at least ensure that
if one specifies a timeout via CREATE_REPLICATION_SLOT or
ALTER_REPLICATION_SLOT, it is honored.
--
With Regards,
Amit Kapila.
Hi,
On Thu, Mar 21, 2024 at 11:43:54AM +0530, Amit Kapila wrote:
On Thu, Mar 21, 2024 at 11:23 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Thu, Mar 21, 2024 at 08:47:18AM +0530, Amit Kapila wrote:
On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote:
2. last_inactive_at and inactive_timeout are now tracked in on-disk
replication slot data structure.
Should last_inactive_at be tracked on disk? Say the engine is down for a period
of time > inactive_timeout then the slot will be invalidated after the engine
re-start (if no activity before we invalidate the slot). Should the time the
engine is down be counted as "inactive" time? I've the feeling it should not, and
that we should only take into account inactive time while the engine is up.
Good point. The question is how do we achieve this without persisting
the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot
had some valid value before we shut down but it still didn't cross the
configured 'inactive_timeout' value, so, we won't be able to
invalidate it. Now, after the restart, as we don't know the
last_inactive_at's value before the shutdown, we will initialize it
with 0 (this is what Bharath seems to have done in the latest
v13-0002* patch). After this, even if walsender or backend never
acquires the slot, we won't invalidate it. OTOH, if we track
'last_inactive_at' on the disk, after, restart, we could initialize it
to the current time if the value is non-zero. Do you have any better
ideas?
I think that setting last_inactive_at when we restart makes sense if the slot
has been active previously. I think the idea is because it's holding xmin/catalog_xmin
and that we don't want to prevent rows removal longer that the timeout.
So what about relying on xmin/catalog_xmin instead that way?
That doesn't sound like a great idea because xmin/catalog_xmin values
won't tell us before restart whether it was active or not. It could
have been inactive for long time before restart but the xmin values
could still be valid.
Right, the idea here was more like "don't hold xmin/catalog_xmin" for longer
than the timeout.
My concern was that we set catalog_xmin at logical slot creation time. So if we
set last_inactive_at to zero at creation time and the slot is not used for a long
period of time > timeout, then I think it's not helping there.
What about we always set 'last_inactive_at' at
restart (if the slot's inactive_timeout has non-zero value) and reset
it as soon as someone acquires that slot? Now, if the slot doesn't get
acquired till 'inactive_timeout', checkpointer will invalidate the
slot.
Yeah, that sounds good to me, but I think we should set last_inactive_at at
creation time too; otherwise:
- a physical slot could remain valid for a long time after creation (which is
fine) but the behavior would change at restart.
- a logical slot would have the "issue" reported above (holding catalog_xmin).
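The logical-slot point is easy to see even without these patches, since a
just-created slot already pins a catalog_xmin ('fresh_slot' is just an example
name):

SELECT pg_create_logical_replication_slot('fresh_slot', 'pgoutput');
-- catalog_xmin is already set even though the slot has never been used
SELECT slot_name, catalog_xmin FROM pg_replication_slots
WHERE slot_name = 'fresh_slot';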
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Thu, Mar 21, 2024 at 11:53:32AM +0530, Amit Kapila wrote:
On Thu, Mar 21, 2024 at 11:37 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
We could also do the above 3 and altering the timeout with a replication
connection but the SQL API seems more natural to me.
If we want to go with this then I think we should at least ensure that
if one specified timeout via CREATE_REPLICATION_SLOT or
ALTER_REPLICATION_SLOT that should be honored.
Yeah, agree.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Thu, Mar 21, 2024 at 05:05:46AM +0530, Bharath Rupireddy wrote:
On Wed, Mar 20, 2024 at 1:04 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Wed, Mar 20, 2024 at 08:58:05AM +0530, Amit Kapila wrote:
On Wed, Mar 20, 2024 at 12:49 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Following are some open points:
1. Where to do inactive_timeout invalidation exactly if not the checkpointer.
I have suggested to do it at the time of CheckpointReplicationSlots()
and Bertrand suggested to do it whenever we resume using the slot. I
think we should follow both the suggestions.
Agree. I also think that pg_get_replication_slots() would be a good place, so
that queries would return the right invalidation status.
I've addressed review comments and attaching the v13 patches with the
following changes:
Thanks!
v13-0001 looks good to me. The only nit (that I've mentioned up-thread) is that
in the pg_replication_slots view, the invalidation_reason is "far away" from the
conflicting field. I understand that one could query the fields individually, but
when describing the view or reading the doc, it seems more appropriate to see
them closer. Also, as "failover" and "synced" are also new in version 17, there
is no risk of breaking "order by 17, 18" kinds of queries (which are the failover
and synced positions).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Mar 21, 2024 at 12:40 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
v13-0001 looks good to me. The only Nit (that I've mentioned up-thread) is that
in the pg_replication_slots view, the invalidation_reason is "far away" from the
conflicting field. I understand that one could query the fields individually but
when describing the view or reading the doc, it seems more appropriate to see
them closer. Also as "failover" and "synced" are also new in version 17, there
is no risk to break order by "17,18" kind of queries (which are the failover
and sync positions).
Hm, yeah, I can change that in the next version of the patches. Thanks.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Mar 21, 2024 at 12:15 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Thu, Mar 21, 2024 at 11:43:54AM +0530, Amit Kapila wrote:
On Thu, Mar 21, 2024 at 11:23 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Thu, Mar 21, 2024 at 08:47:18AM +0530, Amit Kapila wrote:
On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote:
2. last_inactive_at and inactive_timeout are now tracked in on-disk
replication slot data structure.
Should last_inactive_at be tracked on disk? Say the engine is down for a period
of time > inactive_timeout then the slot will be invalidated after the engine
re-start (if no activity before we invalidate the slot). Should the time the
engine is down be counted as "inactive" time? I've the feeling it should not, and
that we should only take into account inactive time while the engine is up.
Good point. The question is how do we achieve this without persisting
the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot
had some valid value before we shut down but it still didn't cross the
configured 'inactive_timeout' value, so, we won't be able to
invalidate it. Now, after the restart, as we don't know the
last_inactive_at's value before the shutdown, we will initialize it
with 0 (this is what Bharath seems to have done in the latest
v13-0002* patch). After this, even if walsender or backend never
acquires the slot, we won't invalidate it. OTOH, if we track
'last_inactive_at' on the disk, after, restart, we could initialize it
to the current time if the value is non-zero. Do you have any better
ideas?
I think that setting last_inactive_at when we restart makes sense if the slot
has been active previously. I think the idea is because it's holding xmin/catalog_xmin
and that we don't want to prevent rows removal longer that the timeout.
So what about relying on xmin/catalog_xmin instead that way?
That doesn't sound like a great idea because xmin/catalog_xmin values
won't tell us before restart whether it was active or not. It could
have been inactive for long time before restart but the xmin values
could still be valid.
Right, the idea here was more like "don't hold xmin/catalog_xmin" for longer
than timeout.My concern was that we set catalog_xmin at logical slot creation time. So if we
set last_inactive_at to zero at creation time and the slot is not used for a long
period of time > timeout, then I think it's not helping there.
But, we do call ReplicationSlotRelease() after slot creation. For
example, see CreateReplicationSlot(). So wouldn't that take care of
the case you are worried about?
--
With Regards,
Amit Kapila.
On Thu, Mar 21, 2024 at 3:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
My concern was that we set catalog_xmin at logical slot creation time. So if we
set last_inactive_at to zero at creation time and the slot is not used for a long
period of time > timeout, then I think it's not helping there.
But, we do call ReplicationSlotRelease() after slot creation. For
example, see CreateReplicationSlot(). So wouldn't that take care of
the case you are worried about?
Right. That's true even for pg_create_physical_replication_slot and
pg_create_logical_replication_slot. AFAICS, setting it to the current
timestamp in ReplicationSlotRelease suffices unless I'm missing
something.
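So, if ReplicationSlotRelease() stamps the timestamp as described, one would
expect something like this right after creation, using the proposed
last_inactive_at column (a sketch, not tested against the patches):

SELECT pg_create_physical_replication_slot('sb1_slot');
-- the creation path releases the slot, which would set last_inactive_at
SELECT slot_name, last_inactive_at IS NOT NULL AS inactive_ts_set
FROM pg_replication_slots WHERE slot_name = 'sb1_slot';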
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Mar 21, 2024 at 2:44 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Thu, Mar 21, 2024 at 12:40 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
v13-0001 looks good to me. The only Nit (that I've mentioned up-thread) is that
in the pg_replication_slots view, the invalidation_reason is "far away" from the
conflicting field. I understand that one could query the fields individually but
when describing the view or reading the doc, it seems more appropriate to see
them closer. Also as "failover" and "synced" are also new in version 17, there
is no risk to break order by "17,18" kind of queries (which are the failover
and sync positions).
Hm, yeah, I can change that in the next version of the patches. Thanks.
This makes sense to me. Apart from this, a few more comments on 0001.
1.
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -676,13 +676,13 @@ get_old_cluster_logical_slot_infos(DbInfo
*dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase,
failover, "
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
"temporary IS FALSE;",
live_check ? "FALSE" :
- "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
+ "(CASE WHEN conflicting THEN FALSE "
I think here at both places we need to change 'conflict_reason' to
'conflicting'.
2.
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>invalidation_reason</structfield> <type>text</type>
+ </para>
+ <para>
+ The reason for the slot's invalidation. It is set for both logical and
+ physical slots. <literal>NULL</literal> if the slot is not invalidated.
+ Possible values are:
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>
+ <literal>wal_removed</literal> means that the required WAL has been
+ removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>rows_removed</literal> means that the required rows have
+ been removed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>wal_level_insufficient</literal> means that the
+ primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
+ perform logical decoding.
+ </para>
Can the reasons 'rows_removed' and 'wal_level_insufficient' appear for
physical slots? If not, then it is not clear from the above text.
3.
-# Verify slots are reported as non conflicting in pg_replication_slots
+# Verify slots are reported as valid in pg_replication_slots
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(conflicting) from
- (select conflict_reason is not NULL as conflicting
- from pg_replication_slots WHERE slot_type = 'logical')]),
+ (select conflicting from pg_replication_slots
+ where slot_type = 'logical')]),
'f',
- 'Logical slots are reported as non conflicting');
+ 'Logical slots are reported as valid');
I don't think we need to change the comment or success message in this test.
--
With Regards,
Amit Kapila.
Hi,
On Thu, Mar 21, 2024 at 04:13:31PM +0530, Bharath Rupireddy wrote:
On Thu, Mar 21, 2024 at 3:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
My concern was that we set catalog_xmin at logical slot creation time. So if we
set last_inactive_at to zero at creation time and the slot is not used for a long
period of time > timeout, then I think it's not helping there.
But, we do call ReplicationSlotRelease() after slot creation. For
example, see CreateReplicationSlot(). So wouldn't that take care of
the case you are worried about?
Right. That's true even for pg_create_physical_replication_slot and
pg_create_logical_replication_slot. AFAICS, setting it to the current
timestamp in ReplicationSlotRelease suffices unless I'm missing
something.
Right, but we have:
"
if (set_last_inactive_at &&
slot->data.persistency == RS_PERSISTENT)
{
/*
* There's no point in allowing failover slots to get invalidated
* based on slot's inactive_timeout parameter on standby. The failover
* slots simply get synced from the primary on the standby.
*/
if (!(RecoveryInProgress() && slot->data.failover))
{
SpinLockAcquire(&slot->mutex);
slot->last_inactive_at = GetCurrentTimestamp();
SpinLockRelease(&slot->mutex);
}
}
"
while we set set_last_inactive_at to false at creation time so that last_inactive_at
is not set to GetCurrentTimestamp(). We should set set_last_inactive_at to true
if a timeout is provided during the slot creation.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Mar 21, 2024 at 4:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
This makes sense to me. Apart from this, few more comments on 0001.
Thanks for looking into it.
1. - "%s as caught_up, conflict_reason IS NOT NULL as invalid " + "%s as caught_up, invalidation_reason IS NOT NULL as invalid " live_check ? "FALSE" : - "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE " + "(CASE WHEN conflicting THEN FALSE "I think here at both places we need to change 'conflict_reason' to
'conflicting'.
Basically, the idea there is to not live_check for invalidated logical
slots. It has nothing to do with conflicting. Up until now,
conflict_reason has also been reporting wal_removed (wrongly, since
rows_removed and wal_level_insufficient are the only two reasons for
conflicts). So, I think invalidation_reason is right for the invalid
column. Also, I think we need to change conflicting to
invalidation_reason for live_check. So, I've changed that to use
invalidation_reason for both columns.
2.
Can the reasons 'rows_removed' and 'wal_level_insufficient' appear for
physical slots?
No. They can only occur for logical slots; see
InvalidatePossiblyObsoleteSlot(), where only logical slots get
invalidated with these causes.
If not, then it is not clear from above text.
I've stated that "It is set only for logical slots." for rows_removed
and wal_level_insufficient. Other reasons can occur for both slots.
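In other words, with v14-0001 the two columns read together like this:

-- conflicting: NULL for physical slots, boolean for logical slots;
-- invalidation_reason: set for any invalidated slot, regardless of type
SELECT slot_name, slot_type, conflicting, invalidation_reason
FROM pg_replication_slots;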
3.
-# Verify slots are reported as non conflicting in pg_replication_slots
+# Verify slots are reported as valid in pg_replication_slots
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(conflicting) from
- (select conflict_reason is not NULL as conflicting
- from pg_replication_slots WHERE slot_type = 'logical')]),
+ (select conflicting from pg_replication_slots
+ where slot_type = 'logical')]),
'f',
- 'Logical slots are reported as non conflicting');
+ 'Logical slots are reported as valid');
I don't think we need to change the comment or success message in this test.
Yes. The intention of the test case there is to verify that logical slots
are reported as non conflicting. So, I changed them.
Please find the v14-0001 patch for now. I'll post the other patches soon.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v14-0001-Track-invalidation_reason-in-pg_replication_slot.patch
From 606c2f61f11c1a57fbc3c9ffd6d553f6f6b47473 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 21 Mar 2024 17:43:50 +0000
Subject: [PATCH v14] Track invalidation_reason in pg_replication_slots
Up until now, the reason for replication slot invalidation is not
tracked in pg_replication_slots. A recent commit 007693f2a added
'conflict_reason' to show the reasons for slot invalidation, but
only for logical slots.
This commit adds a new column 'invalidation_reason' to show
invalidation reasons for both physical and logical slots. And,
this commit also turns the 'conflict_reason' text column into a
'conflicting' boolean column (effectively reverting commit
007693f2a). The 'conflicting' column is true for the invalidation
reasons 'rows_removed' and 'wal_level_insufficient', because they
are the ones making the slot conflict with recovery. When
'conflicting' is true, one can now look at the new
'invalidation_reason' column for the reason for the logical slot's
conflict with recovery.
The new 'invalidation_reason' column will also be useful when we
add more invalidation reasons in future commits.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Reviewed-by: shveta malik
Discussion: https://www.postgresql.org/message-id/ZfR7HuzFEswakt/a%40ip-10-97-1-34.eu-west-3.compute.internal
---
doc/src/sgml/ref/pgupgrade.sgml | 4 +-
doc/src/sgml/system-views.sgml | 25 +++++++---
src/backend/catalog/system_views.sql | 3 +-
src/backend/replication/logical/slotsync.c | 2 +-
src/backend/replication/slot.c | 49 +++++++++----------
src/backend/replication/slotfuncs.c | 25 +++++++---
src/bin/pg_upgrade/info.c | 4 +-
src/include/catalog/pg_proc.dat | 6 +--
src/include/replication/slot.h | 2 +-
.../t/035_standby_logical_decoding.pl | 35 ++++++-------
.../t/040_standby_failover_slots_sync.pl | 4 +-
src/test/regress/expected/rules.out | 5 +-
12 files changed, 93 insertions(+), 71 deletions(-)
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 58c6c2df8b..8de52bf752 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -453,8 +453,8 @@ make prefix=/usr/local/pgsql.new install
<para>
All slots on the old cluster must be usable, i.e., there are no slots
whose
- <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>conflict_reason</structfield>
- is not <literal>NULL</literal>.
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>conflicting</structfield>
+ is not <literal>true</literal>.
</para>
</listitem>
<listitem>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be90edd0e2..b5da476c20 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,13 +2525,24 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>conflict_reason</structfield> <type>text</type>
+ <structfield>conflicting</structfield> <type>bool</type>
</para>
<para>
- The reason for the logical slot's conflict with recovery. It is always
- NULL for physical slots, as well as for logical slots which are not
- invalidated. The non-NULL values indicate that the slot is marked
- as invalidated. Possible values are:
+ True if this logical slot conflicted with recovery (and so is now
+ invalidated). When this column is true, check
+ <structfield>invalidation_reason</structfield> column for the conflict
+ reason. Always NULL for physical slots.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>invalidation_reason</structfield> <type>text</type>
+ </para>
+ <para>
+ The reason for the slot's invalidation. It is set for both logical and
+ physical slots. <literal>NULL</literal> if the slot is not invalidated.
+ Possible values are:
<itemizedlist spacing="compact">
<listitem>
<para>
@@ -2542,14 +2553,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<listitem>
<para>
<literal>rows_removed</literal> means that the required rows have
- been removed.
+ been removed. It is set only for logical slots.
</para>
</listitem>
<listitem>
<para>
<literal>wal_level_insufficient</literal> means that the
primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
- perform logical decoding.
+ perform logical decoding. It is set only for logical slots.
</para>
</listitem>
</itemizedlist>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 04227a72d1..f69b7f5580 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,8 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.conflict_reason,
+ L.conflicting,
+ L.invalidation_reason,
L.failover,
L.synced
FROM pg_get_replication_slots() AS L
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 7b180bdb5c..30480960c5 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -663,7 +663,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, conflict_reason"
+ " database, invalidation_reason"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 91ca397857..cdf0c450c5 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -1525,14 +1525,14 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
XLogRecPtr initial_effective_xmin = InvalidXLogRecPtr;
XLogRecPtr initial_catalog_effective_xmin = InvalidXLogRecPtr;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
- ReplicationSlotInvalidationCause conflict_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
for (;;)
{
XLogRecPtr restart_lsn;
NameData slotname;
int active_pid = 0;
- ReplicationSlotInvalidationCause conflict = RS_INVAL_NONE;
+ ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1554,17 +1554,14 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
restart_lsn = s->data.restart_lsn;
- /*
- * If the slot is already invalid or is a non conflicting slot, we
- * don't need to do anything.
- */
+ /* we do nothing if the slot is already invalid */
if (s->data.invalidated == RS_INVAL_NONE)
{
/*
* The slot's mutex will be released soon, and it is possible that
* those values change since the process holding the slot has been
* terminated (if any), so record them here to ensure that we
- * would report the correct conflict cause.
+ * would report the correct invalidation cause.
*/
if (!terminated)
{
@@ -1578,7 +1575,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_REMOVED:
if (initial_restart_lsn != InvalidXLogRecPtr &&
initial_restart_lsn < oldestLSN)
- conflict = cause;
+ invalidation_cause = cause;
break;
case RS_INVAL_HORIZON:
if (!SlotIsLogical(s))
@@ -1589,15 +1586,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (TransactionIdIsValid(initial_effective_xmin) &&
TransactionIdPrecedesOrEquals(initial_effective_xmin,
snapshotConflictHorizon))
- conflict = cause;
+ invalidation_cause = cause;
else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
snapshotConflictHorizon))
- conflict = cause;
+ invalidation_cause = cause;
break;
case RS_INVAL_WAL_LEVEL:
if (SlotIsLogical(s))
- conflict = cause;
+ invalidation_cause = cause;
break;
case RS_INVAL_NONE:
pg_unreachable();
@@ -1605,14 +1602,14 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
}
/*
- * The conflict cause recorded previously should not change while the
- * process owning the slot (if any) has been terminated.
+ * The invalidation cause recorded previously should not change while
+ * the process owning the slot (if any) has been terminated.
*/
- Assert(!(conflict_prev != RS_INVAL_NONE && terminated &&
- conflict_prev != conflict));
+ Assert(!(invalidation_cause_prev != RS_INVAL_NONE && terminated &&
+ invalidation_cause_prev != invalidation_cause));
- /* if there's no conflict, we're done */
- if (conflict == RS_INVAL_NONE)
+ /* if there's no invalidation, we're done */
+ if (invalidation_cause == RS_INVAL_NONE)
{
SpinLockRelease(&s->mutex);
if (released_lock)
@@ -1632,13 +1629,13 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
- s->data.invalidated = conflict;
+ s->data.invalidated = invalidation_cause;
/*
* XXX: We should consider not overwriting restart_lsn and instead
* just rely on .invalidated.
*/
- if (conflict == RS_INVAL_WAL_REMOVED)
+ if (invalidation_cause == RS_INVAL_WAL_REMOVED)
s->data.restart_lsn = InvalidXLogRecPtr;
/* Let caller know */
@@ -1681,7 +1678,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
*/
if (last_signaled_pid != active_pid)
{
- ReportSlotInvalidation(conflict, true, active_pid,
+ ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon);
@@ -1694,7 +1691,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
last_signaled_pid = active_pid;
terminated = true;
- conflict_prev = conflict;
+ invalidation_cause_prev = invalidation_cause;
}
/* Wait until the slot is released. */
@@ -1727,7 +1724,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReplicationSlotSave();
ReplicationSlotRelease();
- ReportSlotInvalidation(conflict, false, active_pid,
+ ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon);
@@ -2356,21 +2353,21 @@ RestoreSlotFromDisk(const char *name)
}
/*
- * Maps a conflict reason for a replication slot to
+ * Maps an invalidation reason for a replication slot to
* ReplicationSlotInvalidationCause.
*/
ReplicationSlotInvalidationCause
-GetSlotInvalidationCause(const char *conflict_reason)
+GetSlotInvalidationCause(const char *invalidation_reason)
{
ReplicationSlotInvalidationCause cause;
ReplicationSlotInvalidationCause result = RS_INVAL_NONE;
bool found PG_USED_FOR_ASSERTS_ONLY = false;
- Assert(conflict_reason);
+ Assert(invalidation_reason);
for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
{
- if (strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0)
+ if (strcmp(SlotInvalidationCauses[cause], invalidation_reason) == 0)
{
found = true;
result = cause;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index ad79e1fccd..4232c1e52e 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 17
+#define PG_GET_REPLICATION_SLOTS_COLS 18
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -263,6 +263,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
bool nulls[PG_GET_REPLICATION_SLOTS_COLS];
WALAvailability walstate;
int i;
+ ReplicationSlotInvalidationCause cause;
if (!slot->in_use)
continue;
@@ -409,18 +410,28 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
- if (slot_contents.data.database == InvalidOid)
+ cause = slot_contents.data.invalidated;
+
+ if (SlotIsPhysical(&slot_contents))
nulls[i++] = true;
else
{
- ReplicationSlotInvalidationCause cause = slot_contents.data.invalidated;
-
- if (cause == RS_INVAL_NONE)
- nulls[i++] = true;
+ /*
+ * rows_removed and wal_level_insufficient are the only two
+ * reasons for the logical slot's conflict with recovery.
+ */
+ if (cause == RS_INVAL_HORIZON ||
+ cause == RS_INVAL_WAL_LEVEL)
+ values[i++] = BoolGetDatum(true);
else
- values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ values[i++] = BoolGetDatum(false);
}
+ if (cause == RS_INVAL_NONE)
+ nulls[i++] = true;
+ else
+ values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+
values[i++] = BoolGetDatum(slot_contents.data.failover);
values[i++] = BoolGetDatum(slot_contents.data.synced);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index b5b8d11602..95c22a7200 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -676,13 +676,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
"temporary IS FALSE;",
live_check ? "FALSE" :
- "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
+ "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE "
"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
"END)");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 042f66f714..71c74350a0 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11133,9 +11133,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,text,bool,bool}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,invalidation_reason,failover,synced}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 425effad21..7f25a083ee 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -273,7 +273,7 @@ extern void CheckPointReplicationSlots(bool is_shutdown);
extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
- GetSlotInvalidationCause(const char *conflict_reason);
+ GetSlotInvalidationCause(const char *invalidation_reason);
extern bool SlotExistsInStandbySlotNames(const char *slot_name);
extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 88b03048c4..8d6740c734 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,7 +168,7 @@ sub change_hot_standby_feedback_and_wait_for_xmins
}
}
-# Check conflict_reason in pg_replication_slots.
+# Check reason for conflict in pg_replication_slots.
sub check_slots_conflict_reason
{
my ($slot_prefix, $reason) = @_;
@@ -178,15 +178,15 @@ sub check_slots_conflict_reason
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$active_slot' and conflicting;));
- is($res, "$reason", "$active_slot conflict_reason is $reason");
+ is($res, "$reason", "$active_slot reason for conflict is $reason");
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$inactive_slot' and conflicting;));
- is($res, "$reason", "$inactive_slot conflict_reason is $reason");
+ is($res, "$reason", "$inactive_slot reason for conflict is $reason");
}
# Drop the slots, re-create them, change hot_standby_feedback,
@@ -293,13 +293,13 @@ $node_primary->safe_psql('testdb',
qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
);
-# Check conflict_reason is NULL for physical slot
+# Check conflicting is NULL for physical slot
$res = $node_primary->safe_psql(
'postgres', qq[
- SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+ SELECT conflicting is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
);
-is($res, 't', "Physical slot reports conflict_reason as NULL");
+is($res, 't', "Physical slot reports conflicting as NULL");
my $backup_name = 'b1';
$node_primary->backup($backup_name);
@@ -524,7 +524,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Ensure that replication slot stats are not removed after invalidation.
@@ -551,7 +551,7 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
##################################################
$node_standby->restart;
-# Verify conflict_reason is retained across a restart.
+# Verify reason for conflict is retained across a restart.
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
##################################################
@@ -560,7 +560,8 @@ check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Get the restart_lsn from an invalidated slot
my $restart_lsn = $node_standby->safe_psql('postgres',
- "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
+ "SELECT restart_lsn FROM pg_replication_slots
+ WHERE slot_name = 'vacuum_full_activeslot' AND conflicting;"
);
chomp($restart_lsn);
@@ -611,7 +612,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('row_removal_', 'rows_removed');
$handle =
@@ -647,7 +648,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
check_for_invalidation('shared_row_removal_', $logstart,
'with vacuum on pg_authid');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
$handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
@@ -700,8 +701,8 @@ ok( $node_standby->poll_query_until(
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(conflicting) from
- (select conflict_reason is not NULL as conflicting
- from pg_replication_slots WHERE slot_type = 'logical')]),
+ (select conflicting from pg_replication_slots
+ where slot_type = 'logical')]),
'f',
'Logical slots are reported as non conflicting');
@@ -739,7 +740,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('pruning_', 'rows_removed');
$handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
@@ -783,7 +784,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
-# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+# Verify reason for conflict is 'wal_level_insufficient' in pg_replication_slots
check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
$handle =
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 0ea1f3d323..f47bfd78eb 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -228,7 +228,7 @@ $standby1->safe_psql('postgres', "CHECKPOINT");
# Check if the synced slot is invalidated
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'synchronized slot has been invalidated');
@@ -274,7 +274,7 @@ $standby1->wait_for_log(qr/dropped replication slot "lsub1_slot" of dbid [0-9]+/
# flagged as 'synced'
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'logical slot is re-synced');
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 84e359f6ed..18829ea586 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,10 +1473,11 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.conflict_reason,
+ l.conflicting,
+ l.invalidation_reason,
l.failover,
l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, invalidation_reason, failover, synced)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
On Thu, Mar 21, 2024 at 11:21 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Please find the v14-0001 patch for now. I'll post the other patches soon.
LGTM. Let's wait for Bertrand to see if he has more comments on 0001
and then I'll push it.
--
With Regards,
Amit Kapila.
Hi,
On Fri, Mar 22, 2024 at 10:49:17AM +0530, Amit Kapila wrote:
On Thu, Mar 21, 2024 at 11:21 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Please find the v14-0001 patch for now.
Thanks!
LGTM. Let's wait for Bertrand to see if he has more comments on 0001
and then I'll push it.
LGTM too.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Mar 22, 2024 at 12:39 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Please find the v14-0001 patch for now.
Thanks!
LGTM. Let's wait for Bertrand to see if he has more comments on 0001
and then I'll push it.
LGTM too.
Thanks. Here I'm implementing the following:
0001 Track invalidation_reason in pg_replication_slots
0002 Track last_inactive_at in pg_replication_slots
0003 Allow setting inactive_timeout for replication slots via SQL API
0004 Introduce new SQL function pg_alter_replication_slot
0005 Allow setting inactive_timeout in the replication command
0006 Add inactive_timeout based replication slot invalidation
The following points are addressed in this version:
1. Keep last_inactive_at as a shared memory variable, but always set
it at restart if the slot's inactive_timeout has a non-zero value, and
reset it as soon as someone acquires the slot. That way, if the slot
doesn't get acquired within inactive_timeout, the checkpointer will
invalidate it.
2. Ensure that pg_alter_replication_slot can alter only the timeout
property for the time being; otherwise it could lead to subscription
inconsistency.
3. Have some notes in the CREATE and ALTER SUBSCRIPTION docs about
using an existing slot to leverage the inactive_timeout feature.
4. last_inactive_at should also be set to the current time during slot
creation, because if one creates a slot and does nothing with it, that
is when it starts being inactive.
5. We don't set last_inactive_at to GetCurrentTimestamp() for failover slots.
6. Leave the patch that added support for inactive_timeout in subscriptions.
Please see the attached v14 patch set. No change in the attached
v14-0001 from the previous patch.
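To make the intended usage concrete, here is a rough psql-level sketch
against the proposed patches (none of this is committed API yet): the
slot name and the one-day timeout are made up for illustration, and the
queries rely on the inactive_timeout parameter added by 0003 plus the
last_inactive_at and invalidation_reason columns added by 0002 and 0001.

-- Create a logical slot that may stay inactive for at most one day
-- (86400 seconds); 'monitoring_slot' is a hypothetical name.
SELECT slot_name, lsn
FROM pg_create_logical_replication_slot(
       slot_name        := 'monitoring_slot',
       plugin           := 'test_decoding',
       inactive_timeout := 86400);

-- Watch when slots last became inactive and whether/why they have
-- been invalidated, using the new view columns.
SELECT slot_name, last_inactive_at, inactive_timeout,
       conflicting, invalidation_reason
FROM pg_replication_slots
ORDER BY slot_name;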
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v14-0001-Track-invalidation_reason-in-pg_replication_slot.patch
From 9ce64d4e6b629d70516bb2a16c5cbfa458c3a244 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 22 Mar 2024 02:56:12 +0000
Subject: [PATCH v14 1/6] Track invalidation_reason in pg_replication_slots
Up until now, the reason for replication slot invalidation was not
tracked in pg_replication_slots. A recent commit 007693f2a added
'conflict_reason' to show the reasons for slot invalidation, but
only for logical slots.
This commit adds a new column 'invalidation_reason' to show
invalidation reasons for both physical and logical slots. And,
this commit also turns the 'conflict_reason' text column into a
'conflicting' boolean column (effectively reverting commit
007693f2a). The 'conflicting' column is true for the invalidation
reasons 'rows_removed' and 'wal_level_insufficient', because they
are the ones making the slot conflict with recovery. When
'conflicting' is true, one can now look at the new
'invalidation_reason' column for the reason for the logical slot's
conflict with recovery.
The new 'invalidation_reason' column will also be useful when we
add more invalidation reasons in future commits.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Reviewed-by: shveta malik
Discussion: https://www.postgresql.org/message-id/ZfR7HuzFEswakt/a%40ip-10-97-1-34.eu-west-3.compute.internal
---
doc/src/sgml/ref/pgupgrade.sgml | 4 +-
doc/src/sgml/system-views.sgml | 25 +++++++---
src/backend/catalog/system_views.sql | 3 +-
src/backend/replication/logical/slotsync.c | 2 +-
src/backend/replication/slot.c | 49 +++++++++----------
src/backend/replication/slotfuncs.c | 25 +++++++---
src/bin/pg_upgrade/info.c | 4 +-
src/include/catalog/pg_proc.dat | 6 +--
src/include/replication/slot.h | 2 +-
.../t/035_standby_logical_decoding.pl | 35 ++++++-------
.../t/040_standby_failover_slots_sync.pl | 4 +-
src/test/regress/expected/rules.out | 5 +-
12 files changed, 93 insertions(+), 71 deletions(-)
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 58c6c2df8b..8de52bf752 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -453,8 +453,8 @@ make prefix=/usr/local/pgsql.new install
<para>
All slots on the old cluster must be usable, i.e., there are no slots
whose
- <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>conflict_reason</structfield>
- is not <literal>NULL</literal>.
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>conflicting</structfield>
+ is not <literal>true</literal>.
</para>
</listitem>
<listitem>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be90edd0e2..b5da476c20 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,13 +2525,24 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>conflict_reason</structfield> <type>text</type>
+ <structfield>conflicting</structfield> <type>bool</type>
</para>
<para>
- The reason for the logical slot's conflict with recovery. It is always
- NULL for physical slots, as well as for logical slots which are not
- invalidated. The non-NULL values indicate that the slot is marked
- as invalidated. Possible values are:
+ True if this logical slot conflicted with recovery (and so is now
+ invalidated). When this column is true, check
+ <structfield>invalidation_reason</structfield> column for the conflict
+ reason. Always NULL for physical slots.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>invalidation_reason</structfield> <type>text</type>
+ </para>
+ <para>
+ The reason for the slot's invalidation. It is set for both logical and
+ physical slots. <literal>NULL</literal> if the slot is not invalidated.
+ Possible values are:
<itemizedlist spacing="compact">
<listitem>
<para>
@@ -2542,14 +2553,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<listitem>
<para>
<literal>rows_removed</literal> means that the required rows have
- been removed.
+ been removed. It is set only for logical slots.
</para>
</listitem>
<listitem>
<para>
<literal>wal_level_insufficient</literal> means that the
primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
- perform logical decoding.
+ perform logical decoding. It is set only for logical slots.
</para>
</listitem>
</itemizedlist>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 04227a72d1..f69b7f5580 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,8 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.conflict_reason,
+ L.conflicting,
+ L.invalidation_reason,
L.failover,
L.synced
FROM pg_get_replication_slots() AS L
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 7b180bdb5c..30480960c5 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -663,7 +663,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, conflict_reason"
+ " database, invalidation_reason"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 91ca397857..cdf0c450c5 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -1525,14 +1525,14 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
XLogRecPtr initial_effective_xmin = InvalidXLogRecPtr;
XLogRecPtr initial_catalog_effective_xmin = InvalidXLogRecPtr;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
- ReplicationSlotInvalidationCause conflict_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
for (;;)
{
XLogRecPtr restart_lsn;
NameData slotname;
int active_pid = 0;
- ReplicationSlotInvalidationCause conflict = RS_INVAL_NONE;
+ ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1554,17 +1554,14 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
restart_lsn = s->data.restart_lsn;
- /*
- * If the slot is already invalid or is a non conflicting slot, we
- * don't need to do anything.
- */
+ /* we do nothing if the slot is already invalid */
if (s->data.invalidated == RS_INVAL_NONE)
{
/*
* The slot's mutex will be released soon, and it is possible that
* those values change since the process holding the slot has been
* terminated (if any), so record them here to ensure that we
- * would report the correct conflict cause.
+ * would report the correct invalidation cause.
*/
if (!terminated)
{
@@ -1578,7 +1575,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_REMOVED:
if (initial_restart_lsn != InvalidXLogRecPtr &&
initial_restart_lsn < oldestLSN)
- conflict = cause;
+ invalidation_cause = cause;
break;
case RS_INVAL_HORIZON:
if (!SlotIsLogical(s))
@@ -1589,15 +1586,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (TransactionIdIsValid(initial_effective_xmin) &&
TransactionIdPrecedesOrEquals(initial_effective_xmin,
snapshotConflictHorizon))
- conflict = cause;
+ invalidation_cause = cause;
else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
snapshotConflictHorizon))
- conflict = cause;
+ invalidation_cause = cause;
break;
case RS_INVAL_WAL_LEVEL:
if (SlotIsLogical(s))
- conflict = cause;
+ invalidation_cause = cause;
break;
case RS_INVAL_NONE:
pg_unreachable();
@@ -1605,14 +1602,14 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
}
/*
- * The conflict cause recorded previously should not change while the
- * process owning the slot (if any) has been terminated.
+ * The invalidation cause recorded previously should not change while
+ * the process owning the slot (if any) has been terminated.
*/
- Assert(!(conflict_prev != RS_INVAL_NONE && terminated &&
- conflict_prev != conflict));
+ Assert(!(invalidation_cause_prev != RS_INVAL_NONE && terminated &&
+ invalidation_cause_prev != invalidation_cause));
- /* if there's no conflict, we're done */
- if (conflict == RS_INVAL_NONE)
+ /* if there's no invalidation, we're done */
+ if (invalidation_cause == RS_INVAL_NONE)
{
SpinLockRelease(&s->mutex);
if (released_lock)
@@ -1632,13 +1629,13 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
- s->data.invalidated = conflict;
+ s->data.invalidated = invalidation_cause;
/*
* XXX: We should consider not overwriting restart_lsn and instead
* just rely on .invalidated.
*/
- if (conflict == RS_INVAL_WAL_REMOVED)
+ if (invalidation_cause == RS_INVAL_WAL_REMOVED)
s->data.restart_lsn = InvalidXLogRecPtr;
/* Let caller know */
@@ -1681,7 +1678,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
*/
if (last_signaled_pid != active_pid)
{
- ReportSlotInvalidation(conflict, true, active_pid,
+ ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon);
@@ -1694,7 +1691,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
last_signaled_pid = active_pid;
terminated = true;
- conflict_prev = conflict;
+ invalidation_cause_prev = invalidation_cause;
}
/* Wait until the slot is released. */
@@ -1727,7 +1724,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReplicationSlotSave();
ReplicationSlotRelease();
- ReportSlotInvalidation(conflict, false, active_pid,
+ ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon);
@@ -2356,21 +2353,21 @@ RestoreSlotFromDisk(const char *name)
}
/*
- * Maps a conflict reason for a replication slot to
+ * Maps an invalidation reason for a replication slot to
* ReplicationSlotInvalidationCause.
*/
ReplicationSlotInvalidationCause
-GetSlotInvalidationCause(const char *conflict_reason)
+GetSlotInvalidationCause(const char *invalidation_reason)
{
ReplicationSlotInvalidationCause cause;
ReplicationSlotInvalidationCause result = RS_INVAL_NONE;
bool found PG_USED_FOR_ASSERTS_ONLY = false;
- Assert(conflict_reason);
+ Assert(invalidation_reason);
for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
{
- if (strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0)
+ if (strcmp(SlotInvalidationCauses[cause], invalidation_reason) == 0)
{
found = true;
result = cause;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index ad79e1fccd..4232c1e52e 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 17
+#define PG_GET_REPLICATION_SLOTS_COLS 18
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -263,6 +263,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
bool nulls[PG_GET_REPLICATION_SLOTS_COLS];
WALAvailability walstate;
int i;
+ ReplicationSlotInvalidationCause cause;
if (!slot->in_use)
continue;
@@ -409,18 +410,28 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
- if (slot_contents.data.database == InvalidOid)
+ cause = slot_contents.data.invalidated;
+
+ if (SlotIsPhysical(&slot_contents))
nulls[i++] = true;
else
{
- ReplicationSlotInvalidationCause cause = slot_contents.data.invalidated;
-
- if (cause == RS_INVAL_NONE)
- nulls[i++] = true;
+ /*
+ * rows_removed and wal_level_insufficient are the only two
+ * reasons for the logical slot's conflict with recovery.
+ */
+ if (cause == RS_INVAL_HORIZON ||
+ cause == RS_INVAL_WAL_LEVEL)
+ values[i++] = BoolGetDatum(true);
else
- values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ values[i++] = BoolGetDatum(false);
}
+ if (cause == RS_INVAL_NONE)
+ nulls[i++] = true;
+ else
+ values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+
values[i++] = BoolGetDatum(slot_contents.data.failover);
values[i++] = BoolGetDatum(slot_contents.data.synced);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index b5b8d11602..95c22a7200 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -676,13 +676,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
"temporary IS FALSE;",
live_check ? "FALSE" :
- "(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
+ "(CASE WHEN invalidation_reason IS NOT NULL THEN FALSE "
"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
"END)");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 042f66f714..71c74350a0 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11133,9 +11133,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,text,bool,bool}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,invalidation_reason,failover,synced}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 425effad21..7f25a083ee 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -273,7 +273,7 @@ extern void CheckPointReplicationSlots(bool is_shutdown);
extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
- GetSlotInvalidationCause(const char *conflict_reason);
+ GetSlotInvalidationCause(const char *invalidation_reason);
extern bool SlotExistsInStandbySlotNames(const char *slot_name);
extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 88b03048c4..8d6740c734 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,7 +168,7 @@ sub change_hot_standby_feedback_and_wait_for_xmins
}
}
-# Check conflict_reason in pg_replication_slots.
+# Check reason for conflict in pg_replication_slots.
sub check_slots_conflict_reason
{
my ($slot_prefix, $reason) = @_;
@@ -178,15 +178,15 @@ sub check_slots_conflict_reason
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$active_slot' and conflicting;));
- is($res, "$reason", "$active_slot conflict_reason is $reason");
+ is($res, "$reason", "$active_slot reason for conflict is $reason");
$res = $node_standby->safe_psql(
'postgres', qq(
- select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+ select invalidation_reason from pg_replication_slots where slot_name = '$inactive_slot' and conflicting;));
- is($res, "$reason", "$inactive_slot conflict_reason is $reason");
+ is($res, "$reason", "$inactive_slot reason for conflict is $reason");
}
# Drop the slots, re-create them, change hot_standby_feedback,
@@ -293,13 +293,13 @@ $node_primary->safe_psql('testdb',
qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
);
-# Check conflict_reason is NULL for physical slot
+# Check conflicting is NULL for physical slot
$res = $node_primary->safe_psql(
'postgres', qq[
- SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+ SELECT conflicting is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
);
-is($res, 't', "Physical slot reports conflict_reason as NULL");
+is($res, 't', "Physical slot reports conflicting as NULL");
my $backup_name = 'b1';
$node_primary->backup($backup_name);
@@ -524,7 +524,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Ensure that replication slot stats are not removed after invalidation.
@@ -551,7 +551,7 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
##################################################
$node_standby->restart;
-# Verify conflict_reason is retained across a restart.
+# Verify reason for conflict is retained across a restart.
check_slots_conflict_reason('vacuum_full_', 'rows_removed');
##################################################
@@ -560,7 +560,8 @@ check_slots_conflict_reason('vacuum_full_', 'rows_removed');
# Get the restart_lsn from an invalidated slot
my $restart_lsn = $node_standby->safe_psql('postgres',
- "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
+ "SELECT restart_lsn FROM pg_replication_slots
+ WHERE slot_name = 'vacuum_full_activeslot' AND conflicting;"
);
chomp($restart_lsn);
@@ -611,7 +612,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('row_removal_', 'rows_removed');
$handle =
@@ -647,7 +648,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
check_for_invalidation('shared_row_removal_', $logstart,
'with vacuum on pg_authid');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
$handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
@@ -700,8 +701,8 @@ ok( $node_standby->poll_query_until(
is( $node_standby->safe_psql(
'postgres',
q[select bool_or(conflicting) from
- (select conflict_reason is not NULL as conflicting
- from pg_replication_slots WHERE slot_type = 'logical')]),
+ (select conflicting from pg_replication_slots
+ where slot_type = 'logical')]),
'f',
'Logical slots are reported as non conflicting');
@@ -739,7 +740,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
-# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+# Verify reason for conflict is 'rows_removed' in pg_replication_slots
check_slots_conflict_reason('pruning_', 'rows_removed');
$handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
@@ -783,7 +784,7 @@ $node_primary->wait_for_replay_catchup($node_standby);
# Check invalidation in the logfile and in pg_stat_database_conflicts
check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
-# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+# Verify reason for conflict is 'wal_level_insufficient' in pg_replication_slots
check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
$handle =
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 0ea1f3d323..f47bfd78eb 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -228,7 +228,7 @@ $standby1->safe_psql('postgres', "CHECKPOINT");
# Check if the synced slot is invalidated
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason = 'wal_removed' FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'synchronized slot has been invalidated');
@@ -274,7 +274,7 @@ $standby1->wait_for_log(qr/dropped replication slot "lsub1_slot" of dbid [0-9]+/
# flagged as 'synced'
is( $standby1->safe_psql(
'postgres',
- q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}
),
"t",
'logical slot is re-synced');
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 84e359f6ed..18829ea586 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,10 +1473,11 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.conflict_reason,
+ l.conflicting,
+ l.invalidation_reason,
l.failover,
l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason, failover, synced)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, invalidation_reason, failover, synced)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
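For readers skimming the thread, here is a sketch of how the reworked
columns from 0001 are expected to read; the slot names and invalidation
states below are hypothetical. Per the patch, conflicting stays NULL for
physical slots, is true only for the recovery-conflict reasons
(rows_removed, wal_level_insufficient), and invalidation_reason is
populated for both slot types.

-- Hypothetical output after various invalidations (0001 applied).
SELECT slot_name, slot_type, conflicting, invalidation_reason
FROM pg_replication_slots;

--    slot_name   | slot_type | conflicting | invalidation_reason
-- ---------------+-----------+-------------+------------------------
--  phys_slot     | physical  |             | wal_removed
--  logical_slot1 | logical   | t           | wal_level_insufficient
--  logical_slot2 | logical   | f           | wal_removed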
v14-0002-Track-last_inactive_at-in-pg_replication_slots.patch
From eb5bb019863bb7eaa872357b3d9985f59dc56e9f Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 22 Mar 2024 02:59:07 +0000
Subject: [PATCH v14 2/6] Track last_inactive_at in pg_replication_slots
---
doc/src/sgml/system-views.sgml | 11 +++++++++++
src/backend/catalog/system_views.sql | 3 ++-
src/backend/replication/slot.c | 16 ++++++++++++++++
src/backend/replication/slotfuncs.c | 7 ++++++-
src/include/catalog/pg_proc.dat | 6 +++---
src/include/replication/slot.h | 3 +++
src/test/regress/expected/rules.out | 5 +++--
7 files changed, 44 insertions(+), 7 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index b5da476c20..61378b3d4b 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2592,6 +2592,17 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_at</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f69b7f5580..e17f979c7f 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1026,7 +1026,8 @@ CREATE VIEW pg_replication_slots AS
L.conflicting,
L.invalidation_reason,
L.failover,
- L.synced
+ L.synced,
+ L.last_inactive_at
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index cdf0c450c5..146f0fbf84 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -409,6 +409,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->candidate_restart_valid = InvalidXLogRecPtr;
slot->candidate_restart_lsn = InvalidXLogRecPtr;
slot->last_saved_confirmed_flush = InvalidXLogRecPtr;
+ slot->last_inactive_at = 0;
/*
* Create the slot on disk. We haven't actually marked the slot allocated
@@ -622,6 +623,13 @@ retry:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->last_inactive_at = 0;
+ SpinLockRelease(&s->mutex);
+ }
+
if (am_walsender)
{
ereport(log_replication_commands ? LOG : DEBUG1,
@@ -691,6 +699,13 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->last_inactive_at = GetCurrentTimestamp();
+ SpinLockRelease(&slot->mutex);
+ }
+
MyReplicationSlot = NULL;
/* might not have been set when we've been a plain slot */
@@ -2341,6 +2356,7 @@ RestoreSlotFromDisk(const char *name)
slot->in_use = true;
slot->active_pid = 0;
+ slot->last_inactive_at = 0;
restored = true;
break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 4232c1e52e..75300f24b6 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 18
+#define PG_GET_REPLICATION_SLOTS_COLS 19
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -436,6 +436,11 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.synced);
+ if (slot_contents.last_inactive_at > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.last_inactive_at);
+ else
+ nulls[i++] = true;
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 71c74350a0..bf13448ad4 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11133,9 +11133,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,invalidation_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,text,bool,bool,timestamptz}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,invalidation_reason,failover,synced,last_inactive_at}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7f25a083ee..b4bb7f5e99 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -201,6 +201,9 @@ typedef struct ReplicationSlot
* forcibly flushed or not.
*/
XLogRecPtr last_saved_confirmed_flush;
+
+ /* When did this slot become inactive last time? */
+ TimestampTz last_inactive_at;
} ReplicationSlot;
#define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid)
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 18829ea586..effee2879e 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1476,8 +1476,9 @@ pg_replication_slots| SELECT l.slot_name,
l.conflicting,
l.invalidation_reason,
l.failover,
- l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, invalidation_reason, failover, synced)
+ l.synced,
+ l.last_inactive_at
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, invalidation_reason, failover, synced, last_inactive_at)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
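As a rough illustration of what 0002 records (the slot name and the
timestamp below are made up): last_inactive_at reads as NULL while some
process has the persistent slot acquired, and is stamped with the
release time once the slot is released.

-- While a walsender or SQL decoding session holds 'sub_slot':
SELECT slot_name, active, last_inactive_at
FROM pg_replication_slots WHERE slot_name = 'sub_slot';
--  slot_name | active | last_inactive_at
-- -----------+--------+------------------
--  sub_slot  | t      |

-- After that process releases the slot:
--  slot_name | active | last_inactive_at
-- -----------+--------+-------------------------------
--  sub_slot  | f      | 2024-03-22 10:15:30.123456+00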
v14-0003-Allow-setting-inactive_timeout-for-replication-s.patch
From 1fe40bcdd725858eab8d414d80f1234dcd0a4835 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 22 Mar 2024 03:55:02 +0000
Subject: [PATCH v14 3/6] Allow setting inactive_timeout for replication slots
via SQL API
---
contrib/test_decoding/expected/slot.out | 102 ++++++++++++++++++
contrib/test_decoding/sql/slot.sql | 34 ++++++
doc/src/sgml/func.sgml | 18 ++--
doc/src/sgml/system-views.sgml | 9 ++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 3 +-
src/backend/replication/logical/slotsync.c | 17 ++-
src/backend/replication/slot.c | 20 +++-
src/backend/replication/slotfuncs.c | 31 +++++-
src/backend/replication/walsender.c | 4 +-
src/bin/pg_upgrade/info.c | 6 +-
src/bin/pg_upgrade/pg_upgrade.c | 5 +-
src/bin/pg_upgrade/pg_upgrade.h | 2 +
src/bin/pg_upgrade/t/003_logical_slots.pl | 11 +-
src/include/catalog/pg_proc.dat | 22 ++--
src/include/replication/slot.h | 5 +-
.../t/040_standby_failover_slots_sync.pl | 13 ++-
src/test/regress/expected/rules.out | 5 +-
18 files changed, 266 insertions(+), 43 deletions(-)
diff --git a/contrib/test_decoding/expected/slot.out b/contrib/test_decoding/expected/slot.out
index 349ab2d380..6771520afb 100644
--- a/contrib/test_decoding/expected/slot.out
+++ b/contrib/test_decoding/expected/slot.out
@@ -466,3 +466,105 @@ SELECT pg_drop_replication_slot('physical_slot');
(1 row)
+-- Test negative value for inactive_timeout option for slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', inactive_timeout := -300); -- error
+ERROR: "inactive_timeout" must not be negative
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', inactive_timeout := -600); -- error
+ERROR: "inactive_timeout" must not be negative
+-- Test inactive_timeout option for temporary slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', temporary := true, inactive_timeout := 300); -- error
+ERROR: cannot set inactive_timeout for a temporary replication slot
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', temporary := true, inactive_timeout := 600); -- error
+ERROR: cannot set inactive_timeout for a temporary replication slot
+-- Test inactive_timeout option of physical slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot1', immediately_reserve := true, inactive_timeout := 300);
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot2');
+ ?column?
+----------
+ init
+(1 row)
+
+-- Copy physical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_slot1', dst_slot_name := 'it_phy_slot3');
+ ?column?
+----------
+ copy
+(1 row)
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+ slot_name | slot_type | inactive_timeout
+--------------+-----------+------------------
+ it_phy_slot1 | physical | 300
+ it_phy_slot2 | physical | 0
+ it_phy_slot3 | physical | 300
+(3 rows)
+
+SELECT pg_drop_replication_slot('it_phy_slot1');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_phy_slot2');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_phy_slot3');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+-- Test inactive_timeout option of logical slots.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot2', plugin := 'test_decoding');
+ ?column?
+----------
+ init
+(1 row)
+
+-- Copy logical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slot1', dst_slot_name := 'it_log_slot3');
+ ?column?
+----------
+ copy
+(1 row)
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+ slot_name | slot_type | inactive_timeout
+--------------+-----------+------------------
+ it_log_slot1 | logical | 600
+ it_log_slot2 | logical | 0
+ it_log_slot3 | logical | 600
+(3 rows)
+
+SELECT pg_drop_replication_slot('it_log_slot1');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_log_slot2');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_log_slot3');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
diff --git a/contrib/test_decoding/sql/slot.sql b/contrib/test_decoding/sql/slot.sql
index 580e3ae3be..443e91da07 100644
--- a/contrib/test_decoding/sql/slot.sql
+++ b/contrib/test_decoding/sql/slot.sql
@@ -190,3 +190,37 @@ SELECT pg_drop_replication_slot('failover_true_slot');
SELECT pg_drop_replication_slot('failover_false_slot');
SELECT pg_drop_replication_slot('failover_default_slot');
SELECT pg_drop_replication_slot('physical_slot');
+
+-- Test negative value for inactive_timeout option for slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', inactive_timeout := -300); -- error
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', inactive_timeout := -600); -- error
+
+-- Test inactive_timeout option for temporary slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', temporary := true, inactive_timeout := 300); -- error
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', temporary := true, inactive_timeout := 600); -- error
+
+-- Test inactive_timeout option of physical slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot1', immediately_reserve := true, inactive_timeout := 300);
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot2');
+
+-- Copy physical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_slot1', dst_slot_name := 'it_phy_slot3');
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+
+SELECT pg_drop_replication_slot('it_phy_slot1');
+SELECT pg_drop_replication_slot('it_phy_slot2');
+SELECT pg_drop_replication_slot('it_phy_slot3');
+
+-- Test inactive_timeout option of logical slots.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot2', plugin := 'test_decoding');
+
+-- Copy logical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slot1', dst_slot_name := 'it_log_slot3');
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+
+SELECT pg_drop_replication_slot('it_log_slot1');
+SELECT pg_drop_replication_slot('it_log_slot2');
+SELECT pg_drop_replication_slot('it_log_slot3');
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 8ecc02f2b9..afaafa35ad 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28373,7 +28373,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<indexterm>
<primary>pg_create_physical_replication_slot</primary>
</indexterm>
- <function>pg_create_physical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type> <optional>, <parameter>immediately_reserve</parameter> <type>boolean</type>, <parameter>temporary</parameter> <type>boolean</type> </optional> )
+ <function>pg_create_physical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type> <optional>, <parameter>immediately_reserve</parameter> <type>boolean</type>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>inactive_timeout</parameter> <type>integer</type> </optional>)
<returnvalue>record</returnvalue>
( <parameter>slot_name</parameter> <type>name</type>,
<parameter>lsn</parameter> <type>pg_lsn</type> )
@@ -28390,9 +28390,12 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
parameter, <parameter>temporary</parameter>, when set to true, specifies that
the slot should not be permanently stored to disk and is only meant
for use by the current session. Temporary slots are also
- released upon any error. This function corresponds
- to the replication protocol command <literal>CREATE_REPLICATION_SLOT
- ... PHYSICAL</literal>.
+ released upon any error. The optional fourth
+ parameter, <parameter>inactive_timeout</parameter>, when set to a
+ non-zero value, specifies the amount of time in seconds the slot is
+ allowed to be inactive. This function corresponds to the replication
+ protocol command
+ <literal>CREATE_REPLICATION_SLOT ... PHYSICAL</literal>.
</para></entry>
</row>
@@ -28417,7 +28420,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<indexterm>
<primary>pg_create_logical_replication_slot</primary>
</indexterm>
- <function>pg_create_logical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type>, <parameter>plugin</parameter> <type>name</type> <optional>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>twophase</parameter> <type>boolean</type>, <parameter>failover</parameter> <type>boolean</type> </optional> )
+ <function>pg_create_logical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type>, <parameter>plugin</parameter> <type>name</type> <optional>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>twophase</parameter> <type>boolean</type>, <parameter>failover</parameter> <type>boolean</type>, <parameter>inactive_timeout</parameter> <type>integer</type> </optional> )
<returnvalue>record</returnvalue>
( <parameter>slot_name</parameter> <type>name</type>,
<parameter>lsn</parameter> <type>pg_lsn</type> )
@@ -28436,7 +28439,10 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<parameter>failover</parameter>, when set to true,
specifies that this slot is enabled to be synced to the
standbys so that logical replication can be resumed after
- failover. A call to this function has the same effect as
+ failover. The optional sixth parameter,
+ <parameter>inactive_timeout</parameter>, when set to a
+ non-zero value, specifies the amount of time in seconds the slot is
+ allowed to be inactive. A call to this function has the same effect as
the replication protocol command
<literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
</para></entry>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 61378b3d4b..f8838b1a23 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2761,6 +2761,15 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
ID of role
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_timeout</structfield> <type>integer</type>
+ </para>
+ <para>
+ The amount of time in seconds the slot is allowed to be inactive.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index fe2bb50f46..af27616657 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -469,6 +469,7 @@ AS 'pg_logical_emit_message_bytea';
CREATE OR REPLACE FUNCTION pg_create_physical_replication_slot(
IN slot_name name, IN immediately_reserve boolean DEFAULT false,
IN temporary boolean DEFAULT false,
+ IN inactive_timeout int DEFAULT 0,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
@@ -480,6 +481,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
IN temporary boolean DEFAULT false,
IN twophase boolean DEFAULT false,
IN failover boolean DEFAULT false,
+ IN inactive_timeout int DEFAULT 0,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index e17f979c7f..6648b125d5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1027,7 +1027,8 @@ CREATE VIEW pg_replication_slots AS
L.invalidation_reason,
L.failover,
L.synced,
- L.last_inactive_at
+ L.last_inactive_at,
+ L.inactive_timeout
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..c01876ceeb 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -131,6 +131,7 @@ typedef struct RemoteSlot
char *database;
bool two_phase;
bool failover;
+ int inactive_timeout;
XLogRecPtr restart_lsn;
XLogRecPtr confirmed_lsn;
TransactionId catalog_xmin;
@@ -167,7 +168,8 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
remote_slot->two_phase == slot->data.two_phase &&
remote_slot->failover == slot->data.failover &&
remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
- strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
+ strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 &&
+ remote_slot->inactive_timeout == slot->data.inactive_timeout)
return false;
/* Avoid expensive operations while holding a spinlock. */
@@ -182,6 +184,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
slot->data.confirmed_flush = remote_slot->confirmed_lsn;
slot->data.catalog_xmin = remote_slot->catalog_xmin;
slot->effective_catalog_xmin = remote_slot->catalog_xmin;
+ slot->data.inactive_timeout = remote_slot->inactive_timeout;
SpinLockRelease(&slot->mutex);
if (xmin_changed)
@@ -607,7 +610,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
- true);
+ true, 0);
/* For shorter lines. */
slot = MyReplicationSlot;
@@ -627,6 +630,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
SpinLockAcquire(&slot->mutex);
slot->effective_catalog_xmin = xmin_horizon;
slot->data.catalog_xmin = xmin_horizon;
+ slot->data.inactive_timeout = remote_slot->inactive_timeout;
SpinLockRelease(&slot->mutex);
ReplicationSlotsComputeRequiredXmin(true);
LWLockRelease(ProcArrayLock);
@@ -652,9 +656,9 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
static bool
synchronize_slots(WalReceiverConn *wrconn)
{
-#define SLOTSYNC_COLUMN_COUNT 9
+#define SLOTSYNC_COLUMN_COUNT 10
Oid slotRow[SLOTSYNC_COLUMN_COUNT] = {TEXTOID, TEXTOID, LSNOID,
- LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID};
+ LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID, INT4OID};
WalRcvExecResult *res;
TupleTableSlot *tupslot;
@@ -663,7 +667,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, invalidation_reason"
+ " database, invalidation_reason, inactive_timeout"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
@@ -743,6 +747,9 @@ synchronize_slots(WalReceiverConn *wrconn)
remote_slot->invalidated = isnull ? RS_INVAL_NONE :
GetSlotInvalidationCause(TextDatumGetCString(d));
+ remote_slot->inactive_timeout = DatumGetInt32(slot_getattr(tupslot, ++col,
+ &isnull));
+
/* Sanity check */
Assert(col == SLOTSYNC_COLUMN_COUNT);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 146f0fbf84..195771920f 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -129,7 +129,7 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 5 /* version for new files */
+#define SLOT_VERSION 6 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -304,11 +304,14 @@ ReplicationSlotValidateName(const char *name, int elevel)
* failover: If enabled, allows the slot to be synced to standbys so
* that logical replication can be resumed after failover.
* synced: True if the slot is synchronized from the primary server.
+ * inactive_timeout: The amount of time in seconds the slot is allowed to be
+ * inactive.
*/
void
ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover, bool synced)
+ bool two_phase, bool failover, bool synced,
+ int inactive_timeout)
{
ReplicationSlot *slot = NULL;
int i;
@@ -345,6 +348,18 @@ ReplicationSlotCreate(const char *name, bool db_specific,
errmsg("cannot enable failover for a temporary replication slot"));
}
+ if (inactive_timeout > 0)
+ {
+ /*
+ * Do not allow users to set inactive_timeout for temporary slots,
+	 * because temporary slots will not be saved to disk.
+ */
+ if (persistency == RS_TEMPORARY)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot set inactive_timeout for a temporary replication slot"));
+ }
+
/*
* If some other backend ran this code concurrently with us, we'd likely
* both allocate the same slot, and that would be bad. We'd also be at
@@ -398,6 +413,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
slot->data.synced = synced;
+ slot->data.inactive_timeout = inactive_timeout;
/* and then data only present in shared memory */
slot->just_dirtied = false;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 75300f24b6..326682138b 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -38,14 +38,15 @@
*/
static void
create_physical_replication_slot(char *name, bool immediately_reserve,
- bool temporary, XLogRecPtr restart_lsn)
+ bool temporary, int inactive_timeout,
+ XLogRecPtr restart_lsn)
{
Assert(!MyReplicationSlot);
/* acquire replication slot, this will check for conflicting names */
ReplicationSlotCreate(name, false,
temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
- false, false);
+ false, false, inactive_timeout);
if (immediately_reserve)
{
@@ -71,6 +72,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
Name name = PG_GETARG_NAME(0);
bool immediately_reserve = PG_GETARG_BOOL(1);
bool temporary = PG_GETARG_BOOL(2);
+ int inactive_timeout = PG_GETARG_INT32(3);
Datum values[2];
bool nulls[2];
TupleDesc tupdesc;
@@ -84,9 +86,15 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
CheckSlotRequirements();
+ if (inactive_timeout < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"inactive_timeout\" must not be negative")));
+
create_physical_replication_slot(NameStr(*name),
immediately_reserve,
temporary,
+ inactive_timeout,
InvalidXLogRecPtr);
values[0] = NameGetDatum(&MyReplicationSlot->data.name);
@@ -120,7 +128,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
static void
create_logical_replication_slot(char *name, char *plugin,
bool temporary, bool two_phase,
- bool failover,
+ bool failover, int inactive_timeout,
XLogRecPtr restart_lsn,
bool find_startpoint)
{
@@ -138,7 +146,7 @@ create_logical_replication_slot(char *name, char *plugin,
*/
ReplicationSlotCreate(name, true,
temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
- failover, false);
+ failover, false, inactive_timeout);
/*
* Create logical decoding context to find start point or, if we don't
@@ -177,6 +185,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
bool temporary = PG_GETARG_BOOL(2);
bool two_phase = PG_GETARG_BOOL(3);
bool failover = PG_GETARG_BOOL(4);
+ int inactive_timeout = PG_GETARG_INT32(5);
Datum result;
TupleDesc tupdesc;
HeapTuple tuple;
@@ -190,11 +199,17 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
CheckLogicalDecodingRequirements();
+ if (inactive_timeout < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"inactive_timeout\" must not be negative")));
+
create_logical_replication_slot(NameStr(*name),
NameStr(*plugin),
temporary,
two_phase,
failover,
+ inactive_timeout,
InvalidXLogRecPtr,
true);
@@ -239,7 +254,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 19
+#define PG_GET_REPLICATION_SLOTS_COLS 20
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -441,6 +456,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
else
nulls[i++] = true;
+ values[i++] = Int32GetDatum(slot_contents.data.inactive_timeout);
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
@@ -720,6 +737,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
XLogRecPtr src_restart_lsn;
bool src_islogical;
bool temporary;
+ int inactive_timeout;
char *plugin;
Datum values[2];
bool nulls[2];
@@ -776,6 +794,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
src_restart_lsn = first_slot_contents.data.restart_lsn;
temporary = (first_slot_contents.data.persistency == RS_TEMPORARY);
plugin = logical_slot ? NameStr(first_slot_contents.data.plugin) : NULL;
+ inactive_timeout = first_slot_contents.data.inactive_timeout;
/* Check type of replication slot */
if (src_islogical != logical_slot)
@@ -823,6 +842,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
temporary,
false,
false,
+ inactive_timeout,
src_restart_lsn,
false);
}
@@ -830,6 +850,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
create_physical_replication_slot(NameStr(*dst_name),
true,
temporary,
+ inactive_timeout,
src_restart_lsn);
/*
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bc40c454de..5315c08650 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1221,7 +1221,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false);
+ false, false, false, 0);
if (reserve_wal)
{
@@ -1252,7 +1252,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false);
+ two_phase, failover, false, 0);
/*
* Do options check early so that we can bail before calling the
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 95c22a7200..12626987f0 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -676,7 +676,8 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid, "
+ "inactive_timeout "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
@@ -696,6 +697,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
int i_failover;
int i_caught_up;
int i_invalid;
+ int i_inactive_timeout;
slotinfos = (LogicalSlotInfo *) pg_malloc(sizeof(LogicalSlotInfo) * num_slots);
@@ -705,6 +707,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
i_failover = PQfnumber(res, "failover");
i_caught_up = PQfnumber(res, "caught_up");
i_invalid = PQfnumber(res, "invalid");
+ i_inactive_timeout = PQfnumber(res, "inactive_timeout");
for (int slotnum = 0; slotnum < num_slots; slotnum++)
{
@@ -716,6 +719,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
curr->failover = (strcmp(PQgetvalue(res, slotnum, i_failover), "t") == 0);
curr->caught_up = (strcmp(PQgetvalue(res, slotnum, i_caught_up), "t") == 0);
curr->invalid = (strcmp(PQgetvalue(res, slotnum, i_invalid), "t") == 0);
+ curr->inactive_timeout = atoi(PQgetvalue(res, slotnum, i_inactive_timeout));
}
}
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index f6143b6bc4..2656056103 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -931,9 +931,10 @@ create_logical_replication_slots(void)
appendPQExpBuffer(query, ", ");
appendStringLiteralConn(query, slot_info->plugin, conn);
- appendPQExpBuffer(query, ", false, %s, %s);",
+ appendPQExpBuffer(query, ", false, %s, %s, %d);",
slot_info->two_phase ? "true" : "false",
- slot_info->failover ? "true" : "false");
+ slot_info->failover ? "true" : "false",
+ slot_info->inactive_timeout);
PQclear(executeQueryOrDie(conn, "%s", query->data));
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 92bcb693fb..eb86d000b1 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -162,6 +162,8 @@ typedef struct
bool invalid; /* if true, the slot is unusable */
bool failover; /* is the slot designated to be synced to the
* physical standby? */
+ int inactive_timeout; /* The amount of time in seconds the slot
+ * is allowed to be inactive. */
} LogicalSlotInfo;
typedef struct
diff --git a/src/bin/pg_upgrade/t/003_logical_slots.pl b/src/bin/pg_upgrade/t/003_logical_slots.pl
index 83d71c3084..6e82d2cb7b 100644
--- a/src/bin/pg_upgrade/t/003_logical_slots.pl
+++ b/src/bin/pg_upgrade/t/003_logical_slots.pl
@@ -153,14 +153,17 @@ like(
# TEST: Successful upgrade
# Preparations for the subsequent test:
-# 1. Setup logical replication (first, cleanup slots from the previous tests)
+# 1. Setup logical replication (first, cleanup slots from the previous tests,
+# and then create slot for this test with inactive_timeout set).
my $old_connstr = $oldpub->connstr . ' dbname=postgres';
+my $inactive_timeout = 3600;
$oldpub->start;
$oldpub->safe_psql(
'postgres', qq[
SELECT * FROM pg_drop_replication_slot('test_slot1');
SELECT * FROM pg_drop_replication_slot('test_slot2');
+ SELECT pg_create_logical_replication_slot(slot_name := 'regress_sub', plugin := 'pgoutput', inactive_timeout := $inactive_timeout);
CREATE PUBLICATION regress_pub FOR ALL TABLES;
]);
@@ -172,7 +175,7 @@ $sub->start;
$sub->safe_psql(
'postgres', qq[
CREATE TABLE tbl (a int);
- CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (two_phase = 'true', failover = 'true')
+ CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (slot_name = 'regress_sub', create_slot = false, two_phase = 'true', failover = 'true')
]);
$sub->wait_for_subscription_sync($oldpub, 'regress_sub');
@@ -192,8 +195,8 @@ command_ok([@pg_upgrade_cmd], 'run of pg_upgrade of old cluster');
# Check that the slot 'regress_sub' has migrated to the new cluster
$newpub->start;
my $result = $newpub->safe_psql('postgres',
- "SELECT slot_name, two_phase, failover FROM pg_replication_slots");
-is($result, qq(regress_sub|t|t), 'check the slot exists on new cluster');
+ "SELECT slot_name, two_phase, failover, inactive_timeout = $inactive_timeout FROM pg_replication_slots");
+is($result, qq(regress_sub|t|t|t), 'check the slot exists on new cluster');
# Update the connection
my $new_connstr = $newpub->connstr . ' dbname=postgres';
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index bf13448ad4..50db6b68d0 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11105,10 +11105,10 @@
# replication slots
{ oid => '3779', descr => 'create a physical replication slot',
proname => 'pg_create_physical_replication_slot', provolatile => 'v',
- proparallel => 'u', prorettype => 'record', proargtypes => 'name bool bool',
- proallargtypes => '{name,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,o,o}',
- proargnames => '{slot_name,immediately_reserve,temporary,slot_name,lsn}',
+ proparallel => 'u', prorettype => 'record', proargtypes => 'name bool bool int4',
+ proallargtypes => '{name,bool,bool,int4,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,o,o}',
+ proargnames => '{slot_name,immediately_reserve,temporary,inactive_timeout,slot_name,lsn}',
prosrc => 'pg_create_physical_replication_slot' },
{ oid => '4220',
descr => 'copy a physical replication slot, changing temporality',
@@ -11133,17 +11133,17 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,text,bool,bool,timestamptz}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,invalidation_reason,failover,synced,last_inactive_at}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,text,bool,bool,timestamptz,int4}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,invalidation_reason,failover,synced,last_inactive_at,inactive_timeout}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u', prorettype => 'record',
- proargtypes => 'name name bool bool bool',
- proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,i,i,o,o}',
- proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
+ proargtypes => 'name name bool bool bool int4',
+ proallargtypes => '{name,name,bool,bool,bool,int4,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,i,i,o,o}',
+ proargnames => '{slot_name,plugin,temporary,twophase,failover,inactive_timeout,slot_name,lsn}',
prosrc => 'pg_create_logical_replication_slot' },
{ oid => '4222',
descr => 'copy a logical replication slot, changing temporality and plugin',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index b4bb7f5e99..ff62542b03 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -127,6 +127,9 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* The amount of time in seconds the slot is allowed to be inactive. */
+ int inactive_timeout;
} ReplicationSlotPersistentData;
/*
@@ -239,7 +242,7 @@ extern void ReplicationSlotsShmemInit(void);
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
bool two_phase, bool failover,
- bool synced);
+ bool synced, int inactive_timeout);
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index f47bfd78eb..3dd780beab 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -152,8 +152,9 @@ log_min_messages = 'debug2'
$primary->append_conf('postgresql.conf', "log_min_messages = 'debug2'");
$primary->reload;
+my $inactive_timeout = 3600;
$primary->psql('postgres',
- q{SELECT pg_create_logical_replication_slot('lsub2_slot', 'test_decoding', false, false, true);}
+ "SELECT pg_create_logical_replication_slot('lsub2_slot', 'test_decoding', false, false, true, $inactive_timeout);"
);
$primary->psql('postgres',
@@ -190,6 +191,16 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Confirm that the synced slot on the standby has got inactive_timeout from the
+# primary.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT inactive_timeout = $inactive_timeout FROM pg_replication_slots
+ WHERE slot_name = 'lsub2_slot' AND synced AND NOT temporary;"
+ ),
+ "t",
+ 'synced logical slot has got inactive_timeout on standby');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index effee2879e..adf7b8947f 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1477,8 +1477,9 @@ pg_replication_slots| SELECT l.slot_name,
l.invalidation_reason,
l.failover,
l.synced,
- l.last_inactive_at
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, invalidation_reason, failover, synced, last_inactive_at)
+ l.last_inactive_at,
+ l.inactive_timeout
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, invalidation_reason, failover, synced, last_inactive_at, inactive_timeout)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
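
For quick reference, a minimal usage sketch of the creation-time option
added above (the slot name and the 300-second value are only examples,
not part of the patch):

    -- create a logical slot that is allowed to stay inactive for 300 seconds
    SELECT * FROM pg_create_logical_replication_slot(
        slot_name := 'my_slot', plugin := 'test_decoding',
        inactive_timeout := 300);

    -- the configured value is exposed in the system view
    SELECT slot_name, slot_type, inactive_timeout
    FROM pg_replication_slots WHERE slot_name = 'my_slot';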
v14-0004-Introduce-new-SQL-funtion-pg_alter_replication_s.patch (application/x-patch)
From a0163b1f67dad275ff84fd1c5ebe290a19ebeb07 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 22 Mar 2024 06:32:39 +0000
Subject: [PATCH v14 4/6] Introduce new SQL function pg_alter_replication_slot
This commit adds a new function pg_alter_replication_slot to alter
a given property of a replication slot. It is similar to the
replication protocol command ALTER_REPLICATION_SLOT, except that
for now it allows only the inactive_timeout property to be set.
The failover property cannot be altered via this function, to
avoid inconsistency with the pg_subscription catalog on the
logical subscriber, because the subscriber would not know the
altered value of its replication slot on the publisher.
---
contrib/test_decoding/expected/slot.out | 44 ++++++++++++++-
contrib/test_decoding/sql/slot.sql | 10 ++++
doc/src/sgml/func.sgml | 21 ++++++++
src/backend/replication/slot.c | 22 ++++----
src/backend/replication/slotfuncs.c | 66 ++++++++++++++++++++++-
src/bin/pg_upgrade/t/003_logical_slots.pl | 14 +++--
src/include/catalog/pg_proc.dat | 5 ++
src/include/replication/slot.h | 2 +
8 files changed, 167 insertions(+), 17 deletions(-)
diff --git a/contrib/test_decoding/expected/slot.out b/contrib/test_decoding/expected/slot.out
index 6771520afb..5b8dbf6f52 100644
--- a/contrib/test_decoding/expected/slot.out
+++ b/contrib/test_decoding/expected/slot.out
@@ -496,13 +496,27 @@ SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_sl
copy
(1 row)
+-- Test alter physical slot with inactive_timeout option set.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot4');
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'alter' FROM pg_alter_replication_slot(slot_name := 'it_phy_slot4', inactive_timeout := 900);
+ ?column?
+----------
+ alter
+(1 row)
+
SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
slot_name | slot_type | inactive_timeout
--------------+-----------+------------------
it_phy_slot1 | physical | 300
it_phy_slot2 | physical | 0
it_phy_slot3 | physical | 300
-(3 rows)
+ it_phy_slot4 | physical | 900
+(4 rows)
SELECT pg_drop_replication_slot('it_phy_slot1');
pg_drop_replication_slot
@@ -522,6 +536,12 @@ SELECT pg_drop_replication_slot('it_phy_slot3');
(1 row)
+SELECT pg_drop_replication_slot('it_phy_slot4');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
-- Test inactive_timeout option of logical slots.
SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
?column?
@@ -542,13 +562,27 @@ SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slo
copy
(1 row)
+-- Test alter logical slot with inactive_timeout option set.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot4', plugin := 'test_decoding');
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'alter' FROM pg_alter_replication_slot(slot_name := 'it_log_slot4', inactive_timeout := 900);
+ ?column?
+----------
+ alter
+(1 row)
+
SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
slot_name | slot_type | inactive_timeout
--------------+-----------+------------------
it_log_slot1 | logical | 600
it_log_slot2 | logical | 0
it_log_slot3 | logical | 600
-(3 rows)
+ it_log_slot4 | logical | 900
+(4 rows)
SELECT pg_drop_replication_slot('it_log_slot1');
pg_drop_replication_slot
@@ -568,3 +602,9 @@ SELECT pg_drop_replication_slot('it_log_slot3');
(1 row)
+SELECT pg_drop_replication_slot('it_log_slot4');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
diff --git a/contrib/test_decoding/sql/slot.sql b/contrib/test_decoding/sql/slot.sql
index 443e91da07..6785714cc7 100644
--- a/contrib/test_decoding/sql/slot.sql
+++ b/contrib/test_decoding/sql/slot.sql
@@ -206,11 +206,16 @@ SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot
-- Copy physical slot with inactive_timeout option set.
SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_slot1', dst_slot_name := 'it_phy_slot3');
+-- Test alter physical slot with inactive_timeout option set.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot4');
+SELECT 'alter' FROM pg_alter_replication_slot(slot_name := 'it_phy_slot4', inactive_timeout := 900);
+
SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
SELECT pg_drop_replication_slot('it_phy_slot1');
SELECT pg_drop_replication_slot('it_phy_slot2');
SELECT pg_drop_replication_slot('it_phy_slot3');
+SELECT pg_drop_replication_slot('it_phy_slot4');
-- Test inactive_timeout option of logical slots.
SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
@@ -219,8 +224,13 @@ SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot2
-- Copy logical slot with inactive_timeout option set.
SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slot1', dst_slot_name := 'it_log_slot3');
+-- Test alter logical slot with inactive_timeout option set.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot4', plugin := 'test_decoding');
+SELECT 'alter' FROM pg_alter_replication_slot(slot_name := 'it_log_slot4', inactive_timeout := 900);
+
SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
SELECT pg_drop_replication_slot('it_log_slot1');
SELECT pg_drop_replication_slot('it_log_slot2');
SELECT pg_drop_replication_slot('it_log_slot3');
+SELECT pg_drop_replication_slot('it_log_slot4');
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index afaafa35ad..22c8e0d39c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28829,6 +28829,27 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
</entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_alter_replication_slot</primary>
+ </indexterm>
+ <function>pg_alter_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type>, <parameter>inactive_timeout</parameter> <type>integer</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Alters the given property of the replication slot named
+ <parameter>slot_name</parameter>. It is the same as the replication
+ protocol command <literal>ALTER_REPLICATION_SLOT</literal>, except
+ that it allows only the <parameter>inactive_timeout</parameter>
+ property to be set. The <parameter>failover</parameter> property
+ cannot be altered via this function, to avoid inconsistency with
+ the <structname>pg_subscription</structname> catalog on the logical
+ subscriber, because the subscriber would not know the altered value
+ of its replication slot on the publisher.
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 195771920f..5644765a7e 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -162,7 +162,6 @@ static void ReplicationSlotDropPtr(ReplicationSlot *slot);
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
-static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
/*
* Report shared-memory space needed by ReplicationSlotsShmemInit.
@@ -865,6 +864,7 @@ ReplicationSlotAlter(const char *name, bool failover)
ReplicationSlotRelease();
}
+
/*
* Permanently drop the currently acquired replication slot.
*/
@@ -1000,7 +1000,7 @@ ReplicationSlotSave(void)
Assert(MyReplicationSlot != NULL);
sprintf(path, "pg_replslot/%s", NameStr(MyReplicationSlot->data.name));
- SaveSlotToPath(MyReplicationSlot, path, ERROR);
+ ReplicationSlotSaveToPath(MyReplicationSlot, path, ERROR);
}
/*
@@ -1863,7 +1863,10 @@ CheckPointReplicationSlots(bool is_shutdown)
if (!s->in_use)
continue;
- /* save the slot to disk, locking is handled in SaveSlotToPath() */
+ /*
+ * Save the slot to disk, locking is handled in
+ * ReplicationSlotSaveToPath.
+ */
sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
/*
@@ -1889,7 +1892,7 @@ CheckPointReplicationSlots(bool is_shutdown)
SpinLockRelease(&s->mutex);
}
- SaveSlotToPath(s, path, LOG);
+ ReplicationSlotSaveToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
}
@@ -1968,8 +1971,9 @@ CreateSlotOnDisk(ReplicationSlot *slot)
/*
* No need to take out the io_in_progress_lock, nobody else can see this
- * slot yet, so nobody else will write. We're reusing SaveSlotToPath which
- * takes out the lock, if we'd take the lock here, we'd deadlock.
+ * slot yet, so nobody else will write. We're reusing
+ * ReplicationSlotSaveToPath which takes out the lock, if we'd take the
+ * lock here, we'd deadlock.
*/
sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
@@ -1995,7 +1999,7 @@ CreateSlotOnDisk(ReplicationSlot *slot)
/* Write the actual state file. */
slot->dirty = true; /* signal that we really need to write */
- SaveSlotToPath(slot, tmppath, ERROR);
+ ReplicationSlotSaveToPath(slot, tmppath, ERROR);
/* Rename the directory into place. */
if (rename(tmppath, path) != 0)
@@ -2020,8 +2024,8 @@ CreateSlotOnDisk(ReplicationSlot *slot)
/*
* Shared functionality between saving and creating a replication slot.
*/
-static void
-SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel)
+void
+ReplicationSlotSaveToPath(ReplicationSlot *slot, const char *dir, int elevel)
{
char tmppath[MAXPGPATH];
char path[MAXPGPATH];
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 326682138b..d6ef14fba6 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -229,7 +229,6 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(result);
}
-
/*
* SQL function for dropping a replication slot.
*/
@@ -1038,3 +1037,68 @@ pg_sync_replication_slots(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+
+/*
+ * SQL function for altering given properties of a replication slot.
+ */
+Datum
+pg_alter_replication_slot(PG_FUNCTION_ARGS)
+{
+ Name name = PG_GETARG_NAME(0);
+ int inactive_timeout = PG_GETARG_INT32(1);
+ ReplicationSlot *slot;
+ char path[MAXPGPATH];
+
+ CheckSlotPermissions();
+
+ CheckSlotRequirements();
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ /* Check if a slot exists with the given name. */
+ slot = SearchNamedReplicationSlot(NameStr(*name), false);
+
+ if (!slot)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("replication slot \"%s\" does not exist",
+ NameStr(*name))));
+
+ /*
+ * Do not allow users to set inactive_timeout for temporary slots,
+ * because temporary slots will not be saved to disk.
+ */
+ if (slot->data.persistency == RS_TEMPORARY)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot set inactive_timeout for a temporary replication slot"));
+
+ LWLockRelease(ReplicationSlotControlLock);
+
+ if (inactive_timeout < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"inactive_timeout\" must not be negative")));
+
+ /*
+ * We need to briefly prevent any other backend from acquiring the slot
+ * while we set the property. Without holding the ControlLock exclusively,
+ * a concurrent ReplicationSlotAcquire() could acquire the slot while we are changing it.
+ */
+ LWLockAcquire(ReplicationSlotControlLock, LW_EXCLUSIVE);
+
+ SpinLockAcquire(&slot->mutex);
+ slot->data.inactive_timeout = inactive_timeout;
+
+ /* Make sure the altered inactive_timeout persists across server restart */
+ slot->just_dirtied = true;
+ slot->dirty = true;
+ SpinLockRelease(&slot->mutex);
+
+ sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+ ReplicationSlotSaveToPath(slot, path, ERROR);
+
+ LWLockRelease(ReplicationSlotControlLock);
+
+ PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_upgrade/t/003_logical_slots.pl b/src/bin/pg_upgrade/t/003_logical_slots.pl
index 6e82d2cb7b..b79db24f42 100644
--- a/src/bin/pg_upgrade/t/003_logical_slots.pl
+++ b/src/bin/pg_upgrade/t/003_logical_slots.pl
@@ -153,17 +153,14 @@ like(
# TEST: Successful upgrade
# Preparations for the subsequent test:
-# 1. Setup logical replication (first, cleanup slots from the previous tests,
-# and then create slot for this test with inactive_timeout set).
+# 1. Setup logical replication (first, cleanup slots from the previous tests)
my $old_connstr = $oldpub->connstr . ' dbname=postgres';
-my $inactive_timeout = 3600;
$oldpub->start;
$oldpub->safe_psql(
'postgres', qq[
SELECT * FROM pg_drop_replication_slot('test_slot1');
SELECT * FROM pg_drop_replication_slot('test_slot2');
- SELECT pg_create_logical_replication_slot(slot_name := 'regress_sub', plugin := 'pgoutput', inactive_timeout := $inactive_timeout);
CREATE PUBLICATION regress_pub FOR ALL TABLES;
]);
@@ -175,7 +172,7 @@ $sub->start;
$sub->safe_psql(
'postgres', qq[
CREATE TABLE tbl (a int);
- CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (slot_name = 'regress_sub', create_slot = false, two_phase = 'true', failover = 'true')
+ CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (two_phase = 'true', failover = 'true')
]);
$sub->wait_for_subscription_sync($oldpub, 'regress_sub');
@@ -185,6 +182,13 @@ my $twophase_query =
$sub->poll_query_until('postgres', $twophase_query)
or die "Timed out while waiting for subscriber to enable twophase";
+# Alter slot to set inactive_timeout
+my $inactive_timeout = 3600;
+$oldpub->safe_psql(
+ 'postgres', qq[
+ SELECT pg_alter_replication_slot(slot_name := 'regress_sub', inactive_timeout := $inactive_timeout);
+]);
+
# 2. Temporarily disable the subscription
$sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub DISABLE");
$oldpub->stop;
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 50db6b68d0..b83e2f39b1 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11222,6 +11222,11 @@
proname => 'pg_sync_replication_slots', provolatile => 'v', proparallel => 'u',
prorettype => 'void', proargtypes => '',
prosrc => 'pg_sync_replication_slots' },
+{ oid => '9039', descr => 'alter given properties of a replication slot',
+ proname => 'pg_alter_replication_slot', provolatile => 'v', proparallel => 'u',
+ prorettype => 'void', proargtypes => 'name int4',
+ proargnames => '{slot_name,inactive_timeout}',
+ prosrc => 'pg_alter_replication_slot' },
# event triggers
{ oid => '3566', descr => 'list objects dropped by the current command',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index ff62542b03..a8d7d42a07 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -252,6 +252,8 @@ extern void ReplicationSlotAcquire(const char *name, bool nowait);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(void);
extern void ReplicationSlotSave(void);
+extern void ReplicationSlotSaveToPath(ReplicationSlot *slot, const char *dir,
+ int elevel);
extern void ReplicationSlotMarkDirty(void);
/* misc stuff */
--
2.34.1
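
And a corresponding sketch for the new SQL function in 0004; again, the
slot name and value are only examples:

    -- set or change inactive_timeout on an existing, non-temporary slot
    SELECT pg_alter_replication_slot(slot_name := 'my_slot',
                                     inactive_timeout := 900);

As noted in the commit message, failover deliberately cannot be changed
through this function.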
v14-0005-Allow-setting-inactive_timeout-in-the-replicatio.patch (application/x-patch)
From bcc949abc9521eac3c16353fdf2783ced57eebae Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 22 Mar 2024 06:55:45 +0000
Subject: [PATCH v14 5/6] Allow setting inactive_timeout in the replication
commands
---
doc/src/sgml/protocol.sgml | 20 ++++++
src/backend/commands/subscriptioncmds.c | 6 +-
.../libpqwalreceiver/libpqwalreceiver.c | 61 ++++++++++++++++---
src/backend/replication/logical/tablesync.c | 1 +
src/backend/replication/slot.c | 30 ++++++++-
src/backend/replication/walreceiver.c | 2 +-
src/backend/replication/walsender.c | 38 +++++++++---
src/include/replication/slot.h | 3 +-
src/include/replication/walreceiver.h | 11 ++--
src/test/recovery/t/001_stream_rep.pl | 50 +++++++++++++++
10 files changed, 195 insertions(+), 27 deletions(-)
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index a5cb19357f..2ffa1b470a 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2068,6 +2068,16 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>INACTIVE_TIMEOUT [ <replaceable class="parameter">integer</replaceable> ]</literal></term>
+ <listitem>
+ <para>
+ If set to a non-zero value, specifies the amount of time in seconds
+ the slot is allowed to be inactive. The default is zero.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
<para>
@@ -2168,6 +2178,16 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>INACTIVE_TIMEOUT [ <replaceable class="parameter">integer</replaceable> ]</literal></term>
+ <listitem>
+ <para>
+ If set to a non-zero value, specifies the amount of time in seconds
+ the slot is allowed to be inactive. The default is zero.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</listitem>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 5a47fa984d..4562de49c4 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -827,7 +827,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
twophase_enabled = true;
walrcv_create_slot(wrconn, opts.slot_name, false, twophase_enabled,
- opts.failover, CRS_NOEXPORT_SNAPSHOT, NULL);
+ opts.failover, 0, CRS_NOEXPORT_SNAPSHOT, NULL);
if (twophase_enabled)
UpdateTwoPhaseState(subid, LOGICALREP_TWOPHASE_STATE_ENABLED);
@@ -849,7 +849,7 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
else if (opts.slot_name &&
(opts.failover || walrcv_server_version(wrconn) >= 170000))
{
- walrcv_alter_slot(wrconn, opts.slot_name, opts.failover);
+ walrcv_alter_slot(wrconn, opts.slot_name, &opts.failover, NULL);
}
}
PG_FINALLY();
@@ -1541,7 +1541,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
PG_TRY();
{
- walrcv_alter_slot(wrconn, sub->slotname, opts.failover);
+ walrcv_alter_slot(wrconn, sub->slotname, &opts.failover, NULL);
}
PG_FINALLY();
{
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 761bf0f677..126250a076 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -77,10 +77,11 @@ static char *libpqrcv_create_slot(WalReceiverConn *conn,
bool temporary,
bool two_phase,
bool failover,
+ int inactive_timeout,
CRSSnapshotAction snapshot_action,
XLogRecPtr *lsn);
static void libpqrcv_alter_slot(WalReceiverConn *conn, const char *slotname,
- bool failover);
+ bool *failover, int *inactive_timeout);
static pid_t libpqrcv_get_backend_pid(WalReceiverConn *conn);
static WalRcvExecResult *libpqrcv_exec(WalReceiverConn *conn,
const char *query,
@@ -1008,7 +1009,8 @@ libpqrcv_send(WalReceiverConn *conn, const char *buffer, int nbytes)
*/
static char *
libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname,
- bool temporary, bool two_phase, bool failover,
+ bool temporary, bool two_phase,
+ bool failover, int inactive_timeout,
CRSSnapshotAction snapshot_action, XLogRecPtr *lsn)
{
PGresult *res;
@@ -1048,6 +1050,15 @@ libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname,
appendStringInfoChar(&cmd, ' ');
}
+ if (inactive_timeout > 0)
+ {
+ appendStringInfo(&cmd, "INACTIVE_TIMEOUT %d", inactive_timeout);
+ if (use_new_options_syntax)
+ appendStringInfoString(&cmd, ", ");
+ else
+ appendStringInfoChar(&cmd, ' ');
+ }
+
if (use_new_options_syntax)
{
switch (snapshot_action)
@@ -1084,10 +1095,24 @@ libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname,
}
else
{
+ appendStringInfoString(&cmd, " PHYSICAL ");
if (use_new_options_syntax)
- appendStringInfoString(&cmd, " PHYSICAL (RESERVE_WAL)");
- else
- appendStringInfoString(&cmd, " PHYSICAL RESERVE_WAL");
+ appendStringInfoChar(&cmd, '(');
+
+ appendStringInfoString(&cmd, "RESERVE_WAL");
+
+ if (inactive_timeout > 0)
+ {
+ if (use_new_options_syntax)
+ appendStringInfoString(&cmd, ", ");
+ else
+ appendStringInfoChar(&cmd, ' ');
+
+ appendStringInfo(&cmd, "INACTIVE_TIMEOUT %d", inactive_timeout);
+ }
+
+ if (use_new_options_syntax)
+ appendStringInfoChar(&cmd, ')');
}
res = libpqrcv_PQexec(conn->streamConn, cmd.data);
@@ -1121,15 +1146,33 @@ libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname,
*/
static void
libpqrcv_alter_slot(WalReceiverConn *conn, const char *slotname,
- bool failover)
+ bool *failover, int *inactive_timeout)
{
StringInfoData cmd;
PGresult *res;
+ bool specified_prev_opt = false;
initStringInfo(&cmd);
- appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s ( FAILOVER %s )",
- quote_identifier(slotname),
- failover ? "true" : "false");
+ appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s (",
+ quote_identifier(slotname));
+
+ if (failover != NULL)
+ {
+ appendStringInfo(&cmd, "FAILOVER %s",
+ *failover ? "true" : "false");
+ specified_prev_opt = true;
+ }
+
+ if (inactive_timeout != NULL)
+ {
+ if (specified_prev_opt)
+ appendStringInfoString(&cmd, ", ");
+
+ appendStringInfo(&cmd, "INACTIVE_TIMEOUT %d", *inactive_timeout);
+ specified_prev_opt = true;
+ }
+
+ appendStringInfoChar(&cmd, ')');
res = libpqrcv_PQexec(conn->streamConn, cmd.data);
pfree(cmd.data);
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 1061d5b61b..59f8e5fbaa 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1431,6 +1431,7 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
walrcv_create_slot(LogRepWorkerWalRcvConn,
slotname, false /* permanent */ , false /* two_phase */ ,
MySubscription->failover,
+ 0,
CRS_USE_SNAPSHOT, origin_startpos);
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 5644765a7e..3680a608c3 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -807,8 +807,10 @@ ReplicationSlotDrop(const char *name, bool nowait)
* Change the definition of the slot identified by the specified name.
*/
void
-ReplicationSlotAlter(const char *name, bool failover)
+ReplicationSlotAlter(const char *name, bool failover, int inactive_timeout)
{
+ bool lock_acquired;
+
Assert(MyReplicationSlot == NULL);
ReplicationSlotAcquire(name, false);
@@ -851,10 +853,36 @@ ReplicationSlotAlter(const char *name, bool failover)
errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot enable failover for a temporary replication slot"));
+ /*
+ * Do not allow users to set inactive_timeout for temporary slots,
+ * because temporary slots will not be saved to disk.
+ */
+ if (inactive_timeout > 0 && MyReplicationSlot->data.persistency == RS_TEMPORARY)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot set inactive_timeout for a temporary replication slot"));
+
+ lock_acquired = false;
if (MyReplicationSlot->data.failover != failover)
{
SpinLockAcquire(&MyReplicationSlot->mutex);
+ lock_acquired = true;
MyReplicationSlot->data.failover = failover;
+ }
+
+ if (MyReplicationSlot->data.inactive_timeout != inactive_timeout)
+ {
+ if (!lock_acquired)
+ {
+ SpinLockAcquire(&MyReplicationSlot->mutex);
+ lock_acquired = true;
+ }
+
+ MyReplicationSlot->data.inactive_timeout = inactive_timeout;
+ }
+
+ if (lock_acquired)
+ {
SpinLockRelease(&MyReplicationSlot->mutex);
ReplicationSlotMarkDirty();
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index acda5f68d9..ac2ebb0c69 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -389,7 +389,7 @@ WalReceiverMain(char *startup_data, size_t startup_data_len)
"pg_walreceiver_%lld",
(long long int) walrcv_get_backend_pid(wrconn));
- walrcv_create_slot(wrconn, slotname, true, false, false, 0, NULL);
+ walrcv_create_slot(wrconn, slotname, true, false, false, 0, 0, NULL);
SpinLockAcquire(&walrcv->mutex);
strlcpy(walrcv->slotname, slotname, NAMEDATALEN);
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 5315c08650..0420274247 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1123,13 +1123,15 @@ static void
parseCreateReplSlotOptions(CreateReplicationSlotCmd *cmd,
bool *reserve_wal,
CRSSnapshotAction *snapshot_action,
- bool *two_phase, bool *failover)
+ bool *two_phase, bool *failover,
+ int *inactive_timeout)
{
ListCell *lc;
bool snapshot_action_given = false;
bool reserve_wal_given = false;
bool two_phase_given = false;
bool failover_given = false;
+ bool inactive_timeout_given = false;
/* Parse options */
foreach(lc, cmd->options)
@@ -1188,6 +1190,15 @@ parseCreateReplSlotOptions(CreateReplicationSlotCmd *cmd,
failover_given = true;
*failover = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "inactive_timeout") == 0)
+ {
+ if (inactive_timeout_given)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
+ inactive_timeout_given = true;
+ *inactive_timeout = defGetInt32(defel);
+ }
else
elog(ERROR, "unrecognized option: %s", defel->defname);
}
@@ -1205,6 +1216,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
bool reserve_wal = false;
bool two_phase = false;
bool failover = false;
+ int inactive_timeout = 0;
CRSSnapshotAction snapshot_action = CRS_EXPORT_SNAPSHOT;
DestReceiver *dest;
TupOutputState *tstate;
@@ -1215,13 +1227,13 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
Assert(!MyReplicationSlot);
parseCreateReplSlotOptions(cmd, &reserve_wal, &snapshot_action, &two_phase,
- &failover);
+ &failover, &inactive_timeout);
if (cmd->kind == REPLICATION_KIND_PHYSICAL)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false, 0);
+ false, false, false, inactive_timeout);
if (reserve_wal)
{
@@ -1252,7 +1264,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false, 0);
+ two_phase, failover, false, inactive_timeout);
/*
* Do options check early so that we can bail before calling the
@@ -1411,9 +1423,11 @@ DropReplicationSlot(DropReplicationSlotCmd *cmd)
* Process extra options given to ALTER_REPLICATION_SLOT.
*/
static void
-ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover)
+ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover,
+ int *inactive_timeout)
{
bool failover_given = false;
+ bool inactive_timeout_given = false;
/* Parse options */
foreach_ptr(DefElem, defel, cmd->options)
@@ -1427,6 +1441,15 @@ ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover)
failover_given = true;
*failover = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "inactive_timeout") == 0)
+ {
+ if (inactive_timeout_given)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
+ inactive_timeout_given = true;
+ *inactive_timeout = defGetInt32(defel);
+ }
else
elog(ERROR, "unrecognized option: %s", defel->defname);
}
@@ -1439,9 +1462,10 @@ static void
AlterReplicationSlot(AlterReplicationSlotCmd *cmd)
{
bool failover = false;
+ int inactive_timeout = 0;
- ParseAlterReplSlotOptions(cmd, &failover);
- ReplicationSlotAlter(cmd->slotname, failover);
+ ParseAlterReplSlotOptions(cmd, &failover, &inactive_timeout);
+ ReplicationSlotAlter(cmd->slotname, failover, inactive_timeout);
}
/*
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index a8d7d42a07..9cd4bf98e5 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -246,7 +246,8 @@ extern void ReplicationSlotCreate(const char *name, bool db_specific,
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
-extern void ReplicationSlotAlter(const char *name, bool failover);
+extern void ReplicationSlotAlter(const char *name, bool failover,
+ int inactive_timeout);
extern void ReplicationSlotAcquire(const char *name, bool nowait);
extern void ReplicationSlotRelease(void);
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 12f71fa99b..038812fd24 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -366,6 +366,7 @@ typedef char *(*walrcv_create_slot_fn) (WalReceiverConn *conn,
bool temporary,
bool two_phase,
bool failover,
+ int inactive_timeout,
CRSSnapshotAction snapshot_action,
XLogRecPtr *lsn);
@@ -377,7 +378,7 @@ typedef char *(*walrcv_create_slot_fn) (WalReceiverConn *conn,
*/
typedef void (*walrcv_alter_slot_fn) (WalReceiverConn *conn,
const char *slotname,
- bool failover);
+ bool *failover, int *inactive_timeout);
/*
* walrcv_get_backend_pid_fn
@@ -453,10 +454,10 @@ extern PGDLLIMPORT WalReceiverFunctionsType *WalReceiverFunctions;
WalReceiverFunctions->walrcv_receive(conn, buffer, wait_fd)
#define walrcv_send(conn, buffer, nbytes) \
WalReceiverFunctions->walrcv_send(conn, buffer, nbytes)
-#define walrcv_create_slot(conn, slotname, temporary, two_phase, failover, snapshot_action, lsn) \
- WalReceiverFunctions->walrcv_create_slot(conn, slotname, temporary, two_phase, failover, snapshot_action, lsn)
-#define walrcv_alter_slot(conn, slotname, failover) \
- WalReceiverFunctions->walrcv_alter_slot(conn, slotname, failover)
+#define walrcv_create_slot(conn, slotname, temporary, two_phase, failover, inactive_timeout, snapshot_action, lsn) \
+ WalReceiverFunctions->walrcv_create_slot(conn, slotname, temporary, two_phase, failover, inactive_timeout, snapshot_action, lsn)
+#define walrcv_alter_slot(conn, slotname, failover, inactive_timeout) \
+ WalReceiverFunctions->walrcv_alter_slot(conn, slotname, failover, inactive_timeout)
#define walrcv_get_backend_pid(conn) \
WalReceiverFunctions->walrcv_get_backend_pid(conn)
#define walrcv_exec(conn, exec, nRetTypes, retTypes) \
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 5311ade509..db00b6aa24 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -604,4 +604,54 @@ ok( pump_until(
'base backup cleanly canceled');
$sigchld_bb->finish();
+# Drop any existing slots on the primary, for the follow-up tests.
+$node_primary->safe_psql('postgres',
+ "SELECT pg_drop_replication_slot(slot_name) FROM pg_replication_slots;");
+
+# Test setting inactive_timeout option via replication commands.
+$node_primary->append_conf(
+ 'postgresql.conf', qq(
+wal_level = logical
+));
+$node_primary->restart;
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_phy_slot1 PHYSICAL (RESERVE_WAL, INACTIVE_TIMEOUT 100);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_phy_slot2 PHYSICAL (RESERVE_WAL);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "ALTER_REPLICATION_SLOT it_phy_slot2 (INACTIVE_TIMEOUT 200);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_log_slot1 LOGICAL pgoutput (TWO_PHASE, INACTIVE_TIMEOUT 300);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_log_slot2 LOGICAL pgoutput;",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "ALTER_REPLICATION_SLOT it_log_slot2 (INACTIVE_TIMEOUT 400);",
+ extra_params => [ '-d', $connstr_db ]);
+
+my $slot_info_expected = 'it_log_slot1|logical|300
+it_log_slot2|logical|400
+it_phy_slot1|physical|100
+it_phy_slot2|physical|0';
+
+my $slot_info = $node_primary->safe_psql('postgres',
+ qq[SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;]);
+is($slot_info, $slot_info_expected, "replication slots with inactive_timeout on primary exist");
+
done_testing();
--
2.34.1
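
For completeness, the new replication-command syntax from 0005 can be
exercised from a replication connection in the same way the TAP test does
(slot names and values are only examples; the logical slot additionally
needs wal_level = logical on the server):

    psql "dbname=postgres replication=database" \
      -c "CREATE_REPLICATION_SLOT my_phy_slot PHYSICAL (RESERVE_WAL, INACTIVE_TIMEOUT 100);"
    psql "dbname=postgres replication=database" \
      -c "CREATE_REPLICATION_SLOT my_log_slot LOGICAL pgoutput (TWO_PHASE, INACTIVE_TIMEOUT 300);"
    psql "dbname=postgres replication=database" \
      -c "ALTER_REPLICATION_SLOT my_phy_slot (INACTIVE_TIMEOUT 200);"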
v14-0006-Add-inactive_timeout-based-replication-slot-inva.patch (application/x-patch)
From 0a30fab6610a1197f58124d276369ac89f7cba99 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 22 Mar 2024 07:54:21 +0000
Subject: [PATCH v14 6/6] Add inactive_timeout based replication slot
invalidation
---
doc/src/sgml/func.sgml | 12 +-
doc/src/sgml/system-views.sgml | 10 +-
.../replication/logical/logicalfuncs.c | 4 +-
src/backend/replication/logical/slotsync.c | 8 +-
src/backend/replication/slot.c | 240 ++++++++++++++++--
src/backend/replication/slotfuncs.c | 27 +-
src/backend/replication/walsender.c | 12 +-
src/backend/tcop/postgres.c | 2 +-
src/backend/utils/adt/pg_upgrade_support.c | 4 +-
src/include/replication/slot.h | 11 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 170 +++++++++++++
12 files changed, 455 insertions(+), 46 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 22c8e0d39c..4826e45c7d 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28393,8 +28393,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
released upon any error. The optional fourth
parameter, <parameter>inactive_timeout</parameter>, when set to a
non-zero value, specifies the amount of time in seconds the slot is
- allowed to be inactive. This function corresponds to the replication
- protocol command
+ allowed to be inactive before it is invalidated.
+ This function corresponds to the replication protocol command
<literal>CREATE_REPLICATION_SLOT ... PHYSICAL</literal>.
</para></entry>
</row>
@@ -28439,12 +28439,12 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<parameter>failover</parameter>, when set to true,
specifies that this slot is enabled to be synced to the
standbys so that logical replication can be resumed after
- failover. The optional sixth parameter,
+ failover. The optional sixth parameter,
<parameter>inactive_timeout</parameter>, when set to a
non-zero value, specifies the amount of time in seconds the slot is
- allowed to be inactive. A call to this function has the same effect as
- the replication protocol command
- <literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
+ allowed to be inactive before it is invalidated.
+ A call to this function has the same effect as the replication protocol
+ command <literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
</para></entry>
</row>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index f8838b1a23..8e7d9c9105 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2563,6 +2563,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for the duration specified by the slot's
+ <literal>inactive_timeout</literal> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
@@ -2767,7 +2774,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<structfield>inactive_timeout</structfield> <type>integer</type>
</para>
<para>
- The amount of time in seconds the slot is allowed to be inactive.
+ The amount of time in seconds the slot is allowed to be inactive before
+ getting invalidated.
</para></entry>
</row>
</tbody>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..53cf8bbd42 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
@@ -309,7 +309,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
/* free context, call shutdown callback */
FreeDecodingContext(ctx);
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
InvalidateSystemCaches();
}
PG_CATCH();
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index c01876ceeb..5aba117e2b 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -319,7 +319,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -529,7 +529,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
* if the slot is not acquired by other processes.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -554,7 +554,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
/* Skip the sync of an invalidated slot */
if (slot->data.invalidated != RS_INVAL_NONE)
{
- ReplicationSlotRelease();
+ ReplicationSlotRelease(false);
return slot_updated;
}
@@ -640,7 +640,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
slot_updated = true;
}
- ReplicationSlotRelease();
+ ReplicationSlotRelease(false);
return slot_updated;
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 3680a608c3..0acf1d1960 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -158,6 +159,9 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidateSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_control_lock,
+ bool need_mutex);
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
@@ -233,7 +237,7 @@ ReplicationSlotShmemExit(int code, Datum arg)
{
/* Make sure active replication slots are released */
if (MyReplicationSlot != NULL)
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
/* Also cleanup all the temporary slots. */
ReplicationSlotCleanup();
@@ -424,7 +428,19 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->candidate_restart_valid = InvalidXLogRecPtr;
slot->candidate_restart_lsn = InvalidXLogRecPtr;
slot->last_saved_confirmed_flush = InvalidXLogRecPtr;
- slot->last_inactive_at = 0;
+
+ /*
+ * We set last_inactive_at after creation of the slot so that the
+ * inactive_timeout if set is honored.
+ *
+ * There's no point in allowing failover slots to get invalidated based on
+ * slot's inactive_timeout parameter on standby. The failover slots simply
+ * get synced from the primary on the standby.
+ */
+ if (!(RecoveryInProgress() && slot->data.failover))
+ slot->last_inactive_at = GetCurrentTimestamp();
+ else
+ slot->last_inactive_at = 0;
/*
* Create the slot on disk. We haven't actually marked the slot allocated
@@ -550,9 +566,14 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * If check_for_invalidation is true, the slot is checked for invalidation
+ * based on its inactive_timeout parameter and an error is raised after making
+ * the slot ours.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation)
{
ReplicationSlot *s;
int active_pid;
@@ -630,6 +651,42 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * Check if the given slot can be invalidated based on its
+ * inactive_timeout parameter. If yes, persist the invalidated state to
+ * disk and then error out. We do this only after making the slot ours to
+ * avoid anyone else acquiring it while we check for its invalidation.
+ */
+ if (check_for_invalidation)
+ {
+ /* The slot is ours by now */
+ Assert(s->active_pid == MyProcPid);
+
+ /*
+ * Well, the slot is not yet ours really unless we check for the
+ * invalidation below.
+ */
+ s->active_pid = 0;
+ if (InvalidateReplicationSlotForInactiveTimeout(s, true, true, true))
+ {
+ /*
+ * If the slot has been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+
+ /* Might need it for slot clean up on error, so restore it */
+ s->active_pid = MyProcPid;
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot acquire invalidated replication slot \"%s\"",
+ NameStr(MyReplicationSlot->data.name)),
+ errdetail("This slot has been invalidated because of its inactive_timeout parameter.")));
+ }
+ s->active_pid = MyProcPid;
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -663,7 +720,7 @@ retry:
* Resources this slot requires will be preserved.
*/
void
-ReplicationSlotRelease(void)
+ReplicationSlotRelease(bool set_last_inactive_at)
{
ReplicationSlot *slot = MyReplicationSlot;
char *slotname = NULL; /* keep compiler quiet */
@@ -714,11 +771,20 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
- if (slot->data.persistency == RS_PERSISTENT)
+ if (set_last_inactive_at &&
+ slot->data.persistency == RS_PERSISTENT)
{
- SpinLockAcquire(&slot->mutex);
- slot->last_inactive_at = GetCurrentTimestamp();
- SpinLockRelease(&slot->mutex);
+ /*
+ * There's no point in allowing failover slots to get invalidated
+ * based on slot's inactive_timeout parameter on standby. The failover
+ * slots simply get synced from the primary on the standby.
+ */
+ if (!(RecoveryInProgress() && slot->data.failover))
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->last_inactive_at = GetCurrentTimestamp();
+ SpinLockRelease(&slot->mutex);
+ }
}
MyReplicationSlot = NULL;
@@ -788,7 +854,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -813,7 +879,7 @@ ReplicationSlotAlter(const char *name, bool failover, int inactive_timeout)
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -889,7 +955,7 @@ ReplicationSlotAlter(const char *name, bool failover, int inactive_timeout)
ReplicationSlotSave();
}
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
}
@@ -1542,6 +1608,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by the slot's inactive_timeout parameter."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1655,6 +1724,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (InvalidateReplicationSlotForInactiveTimeout(s, false, false, false))
+ invalidation_cause = cause;
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1781,7 +1854,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/* Make sure the invalidated state persists across server restart */
ReplicationSlotMarkDirty();
ReplicationSlotSave();
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
@@ -1808,6 +1881,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1859,6 +1933,110 @@ restart:
return invalidated;
}
+/*
+ * Invalidate given slot based on its inactive_timeout parameter.
+ *
+ * Returns true if the slot has got invalidated.
+ *
+ * NB - this function also runs as part of checkpoint, so avoid raising errors
+ * if possible.
+ */
+bool
+InvalidateReplicationSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_control_lock,
+ bool need_mutex,
+ bool persist_state)
+{
+ if (!InvalidateSlotForInactiveTimeout(slot, need_control_lock, need_mutex))
+ return false;
+
+ Assert(slot->active_pid == 0);
+
+ SpinLockAcquire(&slot->mutex);
+ slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT;
+
+ /* Make sure the invalidated state persists across server restart */
+ slot->just_dirtied = true;
+ slot->dirty = true;
+ SpinLockRelease(&slot->mutex);
+
+ if (persist_state)
+ {
+ char path[MAXPGPATH];
+
+ sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+ ReplicationSlotSaveToPath(slot, path, ERROR);
+ }
+
+ ReportSlotInvalidation(RS_INVAL_INACTIVE_TIMEOUT, false, 0,
+ slot->data.name, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, InvalidTransactionId);
+
+ return true;
+}
+
+/*
+ * Helper for InvalidateReplicationSlotForInactiveTimeout
+ */
+static bool
+InvalidateSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_control_lock,
+ bool need_mutex)
+{
+ ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+
+ if (slot->last_inactive_at == 0 ||
+ slot->data.inactive_timeout == 0)
+ return false;
+
+ /* inactive_timeout is only tracked for permanent slots */
+ if (slot->data.persistency != RS_PERSISTENT)
+ return false;
+
+ /*
+ * There's no point in allowing failover slots to get invalidated based on
+ * slot's inactive_timeout parameter on standby. The failover slots simply
+ * get synced from the primary on the standby.
+ */
+ if (RecoveryInProgress() && slot->data.failover)
+ return false;
+
+ if (need_control_lock)
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
+
+ /*
+ * Check if the slot needs to be invalidated due to inactive_timeout. We
+ * do this with the spinlock held to avoid race conditions -- for example
+ * the restart_lsn could move forward, or the slot could be dropped.
+ */
+ if (need_mutex)
+ SpinLockAcquire(&slot->mutex);
+
+ if (slot->last_inactive_at > 0 &&
+ slot->data.inactive_timeout > 0)
+ {
+ TimestampTz now;
+
+ /* last_inactive_at is only tracked for inactive slots */
+ Assert(slot->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(slot->last_inactive_at, now,
+ slot->data.inactive_timeout * 1000))
+ invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
+ }
+
+ if (need_mutex)
+ SpinLockRelease(&slot->mutex);
+
+ if (need_control_lock)
+ LWLockRelease(ReplicationSlotControlLock);
+
+ return (invalidation_cause == RS_INVAL_INACTIVE_TIMEOUT);
+}
+
/*
* Flush all replication slots to disk.
*
@@ -1871,6 +2049,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated = false;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -1892,10 +2071,11 @@ CheckPointReplicationSlots(bool is_shutdown)
continue;
/*
- * Save the slot to disk, locking is handled in
- * ReplicationSlotSaveToPath.
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
*/
- sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
+ if (InvalidateReplicationSlotForInactiveTimeout(s, true, true, false))
+ invalidated = true;
/*
* Slot's data is not flushed each time the confirmed_flush LSN is
@@ -1920,9 +2100,21 @@ CheckPointReplicationSlots(bool is_shutdown)
SpinLockRelease(&s->mutex);
}
+ /*
+ * Save the slot to disk, locking is handled in
+ * ReplicationSlotSaveToPath.
+ */
+ sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
ReplicationSlotSaveToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ /* If the slot has been invalidated, recalculate the resource limits */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
}
/*
@@ -2404,7 +2596,21 @@ RestoreSlotFromDisk(const char *name)
slot->in_use = true;
slot->active_pid = 0;
- slot->last_inactive_at = 0;
+
+ /*
+ * We set last_inactive_at only if inactive_timeout of the slot is
+ * specified so that the timeout is honored after the slot is restored
+ * from the disk.
+ *
+ * There's no point in allowing failover slots to get invalidated
+ * based on slot's inactive_timeout parameter on standby. The failover
+ * slots simply get synced from the primary on the standby.
+ */
+ if (slot->data.inactive_timeout > 0 &&
+ !(RecoveryInProgress() && slot->data.failover))
+ slot->last_inactive_at = GetCurrentTimestamp();
+ else
+ slot->last_inactive_at = 0;
restored = true;
break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index d6ef14fba6..7cc5c8bdf6 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -111,7 +111,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
tuple = heap_form_tuple(tupdesc, values, nulls);
result = HeapTupleGetDatum(tuple);
- ReplicationSlotRelease();
+ ReplicationSlotRelease(false);
PG_RETURN_DATUM(result);
}
@@ -224,7 +224,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
/* ok, slot is now fully created, mark it as persistent if needed */
if (!temporary)
ReplicationSlotPersist();
- ReplicationSlotRelease();
+ ReplicationSlotRelease(false);
PG_RETURN_DATUM(result);
}
@@ -257,6 +257,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
+ bool invalidated = false;
/*
* We don't require any special permission to see this function's data
@@ -287,6 +288,13 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
slot_contents = *slot;
SpinLockRelease(&slot->mutex);
+ /*
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
+ */
+ if (InvalidateReplicationSlotForInactiveTimeout(slot, false, true, true))
+ invalidated = true;
+
memset(values, 0, sizeof(values));
memset(nulls, 0, sizeof(nulls));
@@ -465,6 +473,15 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
LWLockRelease(ReplicationSlotControlLock);
+ /*
+ * If the slot has been invalidated, recalculate the resource limits
+ */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
+
return (Datum) 0;
}
@@ -667,7 +684,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
@@ -710,7 +727,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
ReplicationSlotsComputeRequiredXmin(false);
ReplicationSlotsComputeRequiredLSN();
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
/* Return the reached position. */
values[1] = LSNGetDatum(endlsn);
@@ -954,7 +971,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
tuple = heap_form_tuple(tupdesc, values, nulls);
result = HeapTupleGetDatum(tuple);
- ReplicationSlotRelease();
+ ReplicationSlotRelease(false);
PG_RETURN_DATUM(result);
}
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 0420274247..b6795048cc 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -334,7 +334,7 @@ WalSndErrorCleanup(void)
wal_segment_close(xlogreader);
if (MyReplicationSlot != NULL)
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
ReplicationSlotCleanup();
@@ -846,7 +846,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -992,7 +992,7 @@ StartReplication(StartReplicationCmd *cmd)
}
if (cmd->slotname)
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
/*
* Copy is finished now. Send a single-row result set indicating the next
@@ -1407,7 +1407,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
do_tup_output(tstate, values, nulls);
end_tup_output(tstate);
- ReplicationSlotRelease();
+ ReplicationSlotRelease(false);
}
/*
@@ -1483,7 +1483,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
@@ -1545,7 +1545,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
WalSndLoop(XLogSendLogical);
FreeDecodingContext(logical_decoding_ctx);
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
replication_active = false;
if (got_STOPPING)
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index fd4199a098..749de2741e 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4407,7 +4407,7 @@ PostgresMain(const char *dbname, const char *username)
* callback ensuring correct cleanup on FATAL errors.
*/
if (MyReplicationSlot != NULL)
- ReplicationSlotRelease();
+ ReplicationSlotRelease(true);
/* We also want to cleanup temporary slots on error. */
ReplicationSlotCleanup();
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..d56ecf4137 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
@@ -310,7 +310,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
found_pending_wal = LogicalReplicationSlotHasPendingWal(end_of_wal);
/* Clean up */
- ReplicationSlotRelease();
+ ReplicationSlotRelease(false);
PG_RETURN_BOOL(!found_pending_wal);
}
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 9cd4bf98e5..bd4ad48ce8 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -249,8 +251,9 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, bool failover,
int inactive_timeout);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
-extern void ReplicationSlotRelease(void);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation);
+extern void ReplicationSlotRelease(bool set_last_inactive_at);
extern void ReplicationSlotCleanup(void);
extern void ReplicationSlotSave(void);
extern void ReplicationSlotSaveToPath(ReplicationSlot *slot, const char *dir,
@@ -270,6 +273,10 @@ extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
+extern bool InvalidateReplicationSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_control_lock,
+ bool need_mutex,
+ bool persist_state);
extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock);
extern int ReplicationSlotIndex(ReplicationSlot *slot);
extern bool ReplicationSlotName(int index, Name name);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..6adaa1d648
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,170 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Check for invalidation of slot in server log.
+sub check_slots_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"", $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated, "check that slot $slot_name invalidation has been logged");
+}
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot due to inactive_timeout
+#
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoint during the test, otherwise, the test can get unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+$standby1->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+});
+
+# Set timeout so that the slot when inactive will get invalidated after the
+# timeout.
+my $inactive_timeout = 1;
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot', inactive_timeout := $inactive_timeout);
+]);
+
+$standby1->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# The inactive replication slot info should be null when the slot is active
+my $result = $primary->safe_psql(
+ 'postgres', qq[
+ SELECT last_inactive_at IS NULL, inactive_timeout = $inactive_timeout
+ FROM pg_replication_slots WHERE slot_name = 'sb1_slot';
+]);
+is($result, "t|t",
+ 'check the inactive replication slot info for an active slot');
+
+my $logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby1->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_at IS NOT NULL
+ AND slot_name = 'sb1_slot'
+ AND inactive_timeout = $inactive_timeout;
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+check_slots_invalidation_in_server_log($primary, 'sb1_slot', $logstart);
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for inactive replication slot sb1_slot to be invalidated";
+
+# Testcase end: Invalidate streaming standby's slot due to inactive_timeout
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to inactive_timeout
+my $publisher = $primary;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$subscriber->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot')"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+$result = $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+# Alter slot to set inactive_timeout
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_alter_replication_slot(slot_name := 'lsub1_slot', inactive_timeout := $inactive_timeout);
+]);
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the inactive replication slot info to be updated
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_at IS NOT NULL
+ AND slot_name = 'lsub1_slot'
+ AND inactive_timeout = $inactive_timeout;
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+check_slots_invalidation_in_server_log($publisher, 'lsub1_slot', $logstart);
+
+# Testcase end: Invalidate logical subscriber's slot due to inactive_timeout
+# =============================================================================
+
+done_testing();
--
2.34.1
Hi,
On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote:
On Fri, Mar 22, 2024 at 12:39 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Please find the v14-0001 patch for now.
Thanks!
LGTM. Let's wait for Bertrand to see if he has more comments on 0001
and then I'll push it.
LGTM too.
Thanks. Here I'm implementing the following:
Thanks!
0001 Track invalidation_reason in pg_replication_slots
0002 Track last_inactive_at in pg_replication_slots
0003 Allow setting inactive_timeout for replication slots via SQL API
0004 Introduce new SQL function pg_alter_replication_slot
0005 Allow setting inactive_timeout in the replication command
0006 Add inactive_timeout based replication slot invalidation
1. Keep last_inactive_at as a shared memory variable, but always
set it at restart if the slot's inactive_timeout has non-zero value
and reset it as soon as someone acquires that slot so that if the slot
doesn't get acquired till inactive_timeout, checkpointer will
invalidate the slot.
4. last_inactive_at should also be set to the current time during slot
creation because if one creates a slot and does nothing with it then
it's the time it starts to be inactive.
I did not look at the code yet but just tested the behavior. It works as you
describe it but I think this behavior is weird because:
- when we create a slot without a timeout then last_inactive_at is set. I think
that's fine, but then:
- when we restart the engine, then last_inactive_at is gone (as timeout is not
set).
I think last_inactive_at should be set also at engine restart even if there is
no timeout. I don't think we should link both. Changing my mind here on this
subject due to the testing.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Mar 22, 2024 at 2:27 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote:
0001 Track invalidation_reason in pg_replication_slots
0002 Track last_inactive_at in pg_replication_slots
0003 Allow setting inactive_timeout for replication slots via SQL API
0004 Introduce new SQL function pg_alter_replication_slot
0005 Allow setting inactive_timeout in the replication command
0006 Add inactive_timeout based replication slot invalidation
1. Keep last_inactive_at as a shared memory variable, but always
set it at restart if the slot's inactive_timeout has non-zero value
and reset it as soon as someone acquires that slot so that if the slot
doesn't get acquired till inactive_timeout, checkpointer will
invalidate the slot.
4. last_inactive_at should also be set to the current time during slot
creation because if one creates a slot and does nothing with it then
it's the time it starts to be inactive.
I did not look at the code yet but just tested the behavior. It works as you
describe it but I think this behavior is weird because:
- when we create a slot without a timeout then last_inactive_at is set. I think
that's fine, but then:
- when we restart the engine, then last_inactive_at is gone (as timeout is not
set).
I think last_inactive_at should be set also at engine restart even if there is
no timeout.
I think it is the opposite. Why do we need to set 'last_inactive_at'
when inactive_timeout is not set? BTW, haven't we discussed that we
don't need to set 'last_inactive_at' at the time of slot creation as
it is sufficient to set it at the time ReplicationSlotRelease()?
A few other comments:
==================
1.
@@ -1027,7 +1027,8 @@ CREATE VIEW pg_replication_slots AS
L.invalidation_reason,
L.failover,
L.synced,
- L.last_inactive_at
+ L.last_inactive_at,
+ L.inactive_timeout
I think it would be better to keep 'inactive_timeout' ahead of
'last_inactive_at' as that is the primary field. In major versions, we
don't have to strictly keep the new fields at the end. In this case,
it seems better to keep these two new fields after two_phase so that
these are before invalidation_reason where we can show the
invalidation due to these fields.
2.
void
-ReplicationSlotRelease(void)
+ReplicationSlotRelease(bool set_last_inactive_at)
Why do we need a parameter here? Can't we directly check from the slot
whether 'inactive_timeout' has a non-zero value?
3.
+ /*
+ * There's no point in allowing failover slots to get invalidated
+ * based on slot's inactive_timeout parameter on standby. The failover
+ * slots simply get synced from the primary on the standby.
+ */
+ if (!(RecoveryInProgress() && slot->data.failover))
I think you need to check 'sync' flag instead of 'failover'.
Generally, failover marker slots should be invalidated either on
primary or standby unless on standby the 'failover' marked slot is
synced from the primary.
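In other words, the check would probably need to look at the synced flag rather than failover, along these lines (a sketch only; it assumes the 'synced' member of ReplicationSlotPersistentData is the right thing to test here):

/* Skip only the slots that are synced from the primary while in recovery. */
if (!(RecoveryInProgress() && slot->data.synced))
	slot->last_inactive_at = GetCurrentTimestamp();
else
	slot->last_inactive_at = 0;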
4. I feel the patches should be arranged like 0003->0001, 0002->0002,
0006->0003. We can leave remaining for the time being till we get
these three patches (all three need to be committed as one but it is
okay to keep them separate for review) committed.
--
With Regards,
Amit Kapila.
Hi,
On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote:
On Fri, Mar 22, 2024 at 12:39 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Please find the v14-0001 patch for now.
Thanks!
LGTM. Let's wait for Bertrand to see if he has more comments on 0001
and then I'll push it.
LGTM too.
Please see the attached v14 patch set. No change in the attached
v14-0001 from the previous patch.
Looking at v14-0002:
1 ===
@@ -691,6 +699,13 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->last_inactive_at = GetCurrentTimestamp();
+ SpinLockRelease(&slot->mutex);
+ }
I'm not sure we should do system calls while we're holding a spinlock.
Assign a variable before?
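Something along these lines, say (only a sketch of the idea; the local variable name is illustrative):

if (slot->data.persistency == RS_PERSISTENT)
{
	TimestampTz	now = GetCurrentTimestamp();	/* system call done outside the lock */

	SpinLockAcquire(&slot->mutex);
	slot->last_inactive_at = now;	/* plain assignment only, under the spinlock */
	SpinLockRelease(&slot->mutex);
}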
2 ===
Also, what about moving this here?
"
if (slot->data.persistency == RS_PERSISTENT)
{
/*
* Mark persistent slot inactive. We're not freeing it, just
* disconnecting, but wake up others that may be waiting for it.
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
"
That would avoid testing twice "slot->data.persistency == RS_PERSISTENT".
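That is, roughly (a sketch combining both points above, keeping the comment of the existing block):

if (slot->data.persistency == RS_PERSISTENT)
{
	TimestampTz	now = GetCurrentTimestamp();

	/*
	 * Mark persistent slot inactive and remember when it became inactive.
	 * We're not freeing it, just disconnecting, but wake up others that
	 * may be waiting for it.
	 */
	SpinLockAcquire(&slot->mutex);
	slot->active_pid = 0;
	slot->last_inactive_at = now;
	SpinLockRelease(&slot->mutex);
	ConditionVariableBroadcast(&slot->active_cv);
}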
3 ===
@@ -2341,6 +2356,7 @@ RestoreSlotFromDisk(const char *name)
slot->in_use = true;
slot->active_pid = 0;
+ slot->last_inactive_at = 0;
I think we should put GetCurrentTimestamp() here. It's done in v14-0006 but I
think it's better to do it in 0002 (and not taking care of inactive_timeout).
4 ===
Track last_inactive_at in pg_replication_slots
doc/src/sgml/system-views.sgml | 11 +++++++++++
src/backend/catalog/system_views.sql | 3 ++-
src/backend/replication/slot.c | 16 ++++++++++++++++
src/backend/replication/slotfuncs.c | 7 ++++++-
src/include/catalog/pg_proc.dat | 6 +++---
src/include/replication/slot.h | 3 +++
src/test/regress/expected/rules.out | 5 +++--
7 files changed, 44 insertions(+), 7 deletions(-)
Worth adding some tests too (or do we postpone them to future commits because we're
confident enough they will follow soon)?
5 ===
Most of the fields that reflect a time (not duration) in the system views are
xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use
something like "last_inactive_time"?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Fri, Mar 22, 2024 at 02:59:21PM +0530, Amit Kapila wrote:
On Fri, Mar 22, 2024 at 2:27 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote:
0001 Track invalidation_reason in pg_replication_slots
0002 Track last_inactive_at in pg_replication_slots
0003 Allow setting inactive_timeout for replication slots via SQL API
0004 Introduce new SQL function pg_alter_replication_slot
0005 Allow setting inactive_timeout in the replication command
0006 Add inactive_timeout based replication slot invalidation
1. Keep last_inactive_at as a shared memory variable, but always
set it at restart if the slot's inactive_timeout has non-zero value
and reset it as soon as someone acquires that slot so that if the slot
doesn't get acquired till inactive_timeout, checkpointer will
invalidate the slot.
4. last_inactive_at should also be set to the current time during slot
creation because if one creates a slot and does nothing with it then
it's the time it starts to be inactive.
I did not look at the code yet but just tested the behavior. It works as you
describe it but I think this behavior is weird because:
- when we create a slot without a timeout then last_inactive_at is set. I think
that's fine, but then:
- when we restart the engine, then last_inactive_at is gone (as timeout is not
set).
I think last_inactive_at should be set also at engine restart even if there is
no timeout.
I think it is the opposite. Why do we need to set 'last_inactive_at'
when inactive_timeout is not set?
I think those are unrelated, one could want to know when a slot has been inactive
even if no timeout is set. I understand that for this patch series we have in mind
to use them both to invalidate slots, but I think there are also use cases for not
using them together. Also, not setting last_inactive_at could give the "false"
impression that the slot is active.
BTW, haven't we discussed that we
don't need to set 'last_inactive_at' at the time of slot creation as
it is sufficient to set it at the time ReplicationSlotRelease()?
Right.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Mar 22, 2024 at 3:15 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote:
1 ===
@@ -691,6 +699,13 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->last_inactive_at = GetCurrentTimestamp();
+ SpinLockRelease(&slot->mutex);
+ }
I'm not sure we should do system calls while we're holding a spinlock.
Assign a variable before?
2 ===
Also, what about moving this here?
"
if (slot->data.persistency == RS_PERSISTENT)
{
/*
* Mark persistent slot inactive. We're not freeing it, just
* disconnecting, but wake up others that may be waiting for it.
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
"That would avoid testing twice "slot->data.persistency == RS_PERSISTENT".
That sounds like a good idea. Also, don't we need to consider physical
slots where we don't reserve WAL during slot creation? I don't think
there is a need to set inactive_at for such slots. If we agree,
probably checking restart_lsn should suffice to know whether
the WAL is reserved or not.
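In code terms, the release path could then look something like this (a sketch only, reusing the names from the 0002 patch):

/* Track inactivity only for persistent slots that have reserved WAL. */
if (slot->data.persistency == RS_PERSISTENT &&
	!XLogRecPtrIsInvalid(slot->data.restart_lsn))
{
	TimestampTz	now = GetCurrentTimestamp();

	SpinLockAcquire(&slot->mutex);
	slot->last_inactive_at = now;
	SpinLockRelease(&slot->mutex);
}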
5 ===
Most of the fields that reflect a time (not duration) in the system views are
xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use
something like "last_inactive_time"?
How about naming it as last_active_time? This will indicate the time
at which the slot was last active.
--
With Regards,
Amit Kapila.
On Fri, Mar 22, 2024 at 3:23 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 22, 2024 at 02:59:21PM +0530, Amit Kapila wrote:
On Fri, Mar 22, 2024 at 2:27 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote:
0001 Track invalidation_reason in pg_replication_slots
0002 Track last_inactive_at in pg_replication_slots
0003 Allow setting inactive_timeout for replication slots via SQL API
0004 Introduce new SQL function pg_alter_replication_slot
0005 Allow setting inactive_timeout in the replication command
0006 Add inactive_timeout based replication slot invalidation
1. Keep last_inactive_at as a shared memory variable, but always
set it at restart if the slot's inactive_timeout has non-zero value
and reset it as soon as someone acquires that slot so that if the slot
doesn't get acquired till inactive_timeout, checkpointer will
invalidate the slot.
4. last_inactive_at should also be set to the current time during slot
creation because if one creates a slot and does nothing with it then
it's the time it starts to be inactive.
I did not look at the code yet but just tested the behavior. It works as you
describe it but I think this behavior is weird because:
- when we create a slot without a timeout then last_inactive_at is set. I think
that's fine, but then:
- when we restart the engine, then last_inactive_at is gone (as timeout is not
set).
I think last_inactive_at should be set also at engine restart even if there is
no timeout.
I think it is the opposite. Why do we need to set 'last_inactive_at'
when inactive_timeout is not set?
I think those are unrelated, one could want to know when a slot has been inactive
even if no timeout is set. I understand that for this patch series we have in mind
to use them both to invalidate slots but I think that there is use case to not
use both in correlation. Also not setting last_inactive_at could give the "false"
impression that the slot is active.
I see your point and agree with this. I feel we can commit this part
first then; probably that is the reason Bharath has kept it as a
separate patch. It would be good to add the use case for this patch in
the commit message.
A minor comment:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->last_inactive_at = 0;
+ SpinLockRelease(&s->mutex);
+ }
+
I think this part of the change needs a comment.
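Something like this, perhaps (the comment wording is only a suggestion):

if (s->data.persistency == RS_PERSISTENT)
{
	/*
	 * The slot is active again, so clear last_inactive_at; it is only
	 * meant to be set while the slot is released/inactive.
	 */
	SpinLockAcquire(&s->mutex);
	s->last_inactive_at = 0;
	SpinLockRelease(&s->mutex);
}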
--
With Regards,
Amit Kapila.
Hi,
On Fri, Mar 22, 2024 at 03:56:23PM +0530, Amit Kapila wrote:
On Fri, Mar 22, 2024 at 3:15 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote:
1 ===
@@ -691,6 +699,13 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->last_inactive_at = GetCurrentTimestamp();
+ SpinLockRelease(&slot->mutex);
+ }
I'm not sure we should do system calls while we're holding a spinlock.
Assign a variable before?
2 ===
Also, what about moving this here?
"
if (slot->data.persistency == RS_PERSISTENT)
{
/*
* Mark persistent slot inactive. We're not freeing it, just
* disconnecting, but wake up others that may be waiting for it.
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
"That would avoid testing twice "slot->data.persistency == RS_PERSISTENT".
That sounds like a good idea. Also, don't we need to consider physical
slots where we don't reserve WAL during slot creation? I don't think
there is a need to set inactive_at for such slots.
If the slot is not active, why shouldn't we set inactive_at? I can understand
that such slots do not present "any risks" but I think we should still set
inactive_at (also to not give the false impression that the slot is active).
5 ===
Most of the fields that reflect a time (not duration) in the system views are
xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use
something like "last_inactive_time"?How about naming it as last_active_time? This will indicate the time
at which the slot was last active.
I thought about it too but I think it could be misleading as one could think that
it should be updated each time WAL record decoding is happening.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Fri, Mar 22, 2024 at 04:16:19PM +0530, Amit Kapila wrote:
On Fri, Mar 22, 2024 at 3:23 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 22, 2024 at 02:59:21PM +0530, Amit Kapila wrote:
On Fri, Mar 22, 2024 at 2:27 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote:
0001 Track invalidation_reason in pg_replication_slots
0002 Track last_inactive_at in pg_replication_slots
0003 Allow setting inactive_timeout for replication slots via SQL API
0004 Introduce new SQL function pg_alter_replication_slot
0005 Allow setting inactive_timeout in the replication command
0006 Add inactive_timeout based replication slot invalidation
1. Keep last_inactive_at as a shared memory variable, but always
set it at restart if the slot's inactive_timeout has non-zero value
and reset it as soon as someone acquires that slot so that if the slot
doesn't get acquired till inactive_timeout, checkpointer will
invalidate the slot.
4. last_inactive_at should also be set to the current time during slot
creation because if one creates a slot and does nothing with it then
it's the time it starts to be inactive.
I did not look at the code yet but just tested the behavior. It works as you
describe it but I think this behavior is weird because:
- when we create a slot without a timeout then last_inactive_at is set. I think
that's fine, but then:
- when we restart the engine, then last_inactive_at is gone (as timeout is not
set).
I think last_inactive_at should be set also at engine restart even if there is
no timeout.
I think it is the opposite. Why do we need to set 'last_inactive_at'
when inactive_timeout is not set?
I think those are unrelated, one could want to know when a slot has been inactive
even if no timeout is set. I understand that for this patch series we have in mind
to use them both to invalidate slots but I think that there is use case to not
use both in correlation. Also not setting last_inactive_at could give the "false"
impression that the slot is active.
I see your point and agree with this. I feel we can commit this part
first then,
Agree that in this case the current ordering makes sense (as setting
last_inactive_at would be completely unrelated to the timeout).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Mar 22, 2024 at 7:15 PM Bharath Rupireddy <
bharath.rupireddyforpostgres@gmail.com> wrote:
On Fri, Mar 22, 2024 at 12:39 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Please find the v14-0001 patch for now.
Thanks!
LGTM. Let's wait for Bertrand to see if he has more comments on 0001
and then I'll push it.
LGTM too.
Thanks. Here I'm implementing the following:
0001 Track invalidation_reason in pg_replication_slots
0002 Track last_inactive_at in pg_replication_slots
0003 Allow setting inactive_timeout for replication slots via SQL API
0004 Introduce new SQL function pg_alter_replication_slot
0005 Allow setting inactive_timeout in the replication command
0006 Add inactive_timeout based replication slot invalidation
1. Keep last_inactive_at as a shared memory variable, but always
set it at restart if the slot's inactive_timeout has non-zero value
and reset it as soon as someone acquires that slot so that if the slot
doesn't get acquired till inactive_timeout, checkpointer will
invalidate the slot.
2. Ensure with pg_alter_replication_slot one could "only" alter the
timeout property for the time being; if not, that could lead to
subscription inconsistency.
3. Have some notes in the CREATE and ALTER SUBSCRIPTION docs about
using an existing slot to leverage inactive_timeout feature.
4. last_inactive_at should also be set to the current time during slot
creation because if one creates a slot and does nothing with it then
it's the time it starts to be inactive.
5. We don't set last_inactive_at to GetCurrentTimestamp() for failover
slots.
6. Leave the patch that added support for inactive_timeout in
subscriptions.
Please see the attached v14 patch set. No change in the attached
v14-0001 from the previous patch.
Some comments:
1. In patch 0005:
In ReplicationSlotAlter():
+ lock_acquired = false;
if (MyReplicationSlot->data.failover != failover)
{
SpinLockAcquire(&MyReplicationSlot->mutex);
+ lock_acquired = true;
MyReplicationSlot->data.failover = failover;
+ }
+
+ if (MyReplicationSlot->data.inactive_timeout != inactive_timeout)
+ {
+ if (!lock_acquired)
+ {
+ SpinLockAcquire(&MyReplicationSlot->mutex);
+ lock_acquired = true;
+ }
+
+ MyReplicationSlot->data.inactive_timeout = inactive_timeout;
+ }
+
+ if (lock_acquired)
+ {
SpinLockRelease(&MyReplicationSlot->mutex);
Can't you make it shorter like below:
lock_acquired = false;
if (MyReplicationSlot->data.failover != failover ||
MyReplicationSlot->data.inactive_timeout != inactive_timeout) {
SpinLockAcquire(&MyReplicationSlot->mutex);
lock_acquired = true;
}
if (MyReplicationSlot->data.failover != failover) {
MyReplicationSlot->data.failover = failover;
}
if (MyReplicationSlot->data.inactive_timeout != inactive_timeout) {
MyReplicationSlot->data.inactive_timeout = inactive_timeout;
}
if (lock_acquired) {
SpinLockRelease(&MyReplicationSlot->mutex);
ReplicationSlotMarkDirty();
ReplicationSlotSave();
}
2. In patch 0005: why change the walrcv_alter_slot option? It doesn't seem to
be used anywhere; is there any use case for it? If required, would the intention be
to add this as a CREATE SUBSCRIPTION option?
regards,
Ajin Cherian
Fujitsu Australia
On Fri, Mar 22, 2024 at 5:30 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 22, 2024 at 03:56:23PM +0530, Amit Kapila wrote:
On Fri, Mar 22, 2024 at 3:15 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:On Fri, Mar 22, 2024 at 01:45:01PM +0530, Bharath Rupireddy wrote:
1 ===
@@ -691,6 +699,13 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->last_inactive_at = GetCurrentTimestamp();
+ SpinLockRelease(&slot->mutex);
+ }
I'm not sure we should do system calls while we're holding a spinlock.
Assign a variable before?
2 ===
Also, what about moving this here?
"
if (slot->data.persistency == RS_PERSISTENT)
{
/*
* Mark persistent slot inactive. We're not freeing it, just
* disconnecting, but wake up others that may be waiting for it.
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
"That would avoid testing twice "slot->data.persistency == RS_PERSISTENT".
That sounds like a good idea. Also, don't we need to consider physical
slots where we don't reserve WAL during slot creation? I don't think
there is a need to set inactive_at for such slots.
If the slot is not active, why shouldn't we set inactive_at? I can understand
that such slots do not present "any risks" but I think we should still set
inactive_at (also to not give the false impression that the slot is active).
But OTOH, there is a chance that we will invalidate such slots even
though they have never reserved WAL in the first place which doesn't
appear to be a good thing.
5 ===
Most of the fields that reflect a time (not duration) in the system views are
xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use
something like "last_inactive_time"?How about naming it as last_active_time? This will indicate the time
at which the slot was last active.I thought about it too but I think it could be missleading as one could think that
it should be updated each time WAL record decoding is happening.
Fair enough.
--
With Regards,
Amit Kapila.
Hi,
On Fri, Mar 22, 2024 at 06:02:11PM +0530, Amit Kapila wrote:
On Fri, Mar 22, 2024 at 5:30 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 22, 2024 at 03:56:23PM +0530, Amit Kapila wrote:
That would avoid testing twice "slot->data.persistency == RS_PERSISTENT".
That sounds like a good idea. Also, don't we need to consider physical
slots where we don't reserve WAL during slot creation? I don't think
there is a need to set inactive_at for such slots.
If the slot is not active, why shouldn't we set inactive_at? I can understand
that such slots do not present "any risks" but I think we should still set
inactive_at (also to not give the false impression that the slot is active).
But OTOH, there is a chance that we will invalidate such slots even
though they have never reserved WAL in the first place which doesn't
appear to be a good thing.
That's right, but I don't think it is a bad thing. I think we should treat
inactive_at as an independent field (as if the timeout one did not exist at
all) and just focus on its meaning (the slot being inactive). If one sets a timeout
(> 0) and gets an invalidation then I think it works as designed (even if the
slot does not present any "risk" as it does not hold any rows or WAL).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Mar 22, 2024 at 3:15 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Looking at v14-0002:
Thanks for reviewing. I agree that 0002 with last_inactive_at can go
independently and be of use on its own in addition to helping
implement inactive_timeout based invalidation.
1 ===
@@ -691,6 +699,13 @@ ReplicationSlotRelease(void)
ConditionVariableBroadcast(&slot->active_cv);
}
+ if (slot->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->last_inactive_at = GetCurrentTimestamp();
+ SpinLockRelease(&slot->mutex);
+ }
I'm not sure we should do system calls while we're holding a spinlock.
Assign a variable before?
Can do that. Then, last_inactive_at ends up being the current timestamp plus the
mutex acquire time. But that's less of a problem than doing system calls
while holding the mutex. So, done that way.
2 ===
Also, what about moving this here?
"
if (slot->data.persistency == RS_PERSISTENT)
{
/*
* Mark persistent slot inactive. We're not freeing it, just
* disconnecting, but wake up others that may be waiting for it.
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
"That would avoid testing twice "slot->data.persistency == RS_PERSISTENT".
Ugh. Done that now.
3 ===
@@ -2341,6 +2356,7 @@ RestoreSlotFromDisk(const char *name)
slot->in_use = true;
slot->active_pid = 0;
+ slot->last_inactive_at = 0;
I think we should put GetCurrentTimestamp() here. It's done in v14-0006 but I
think it's better to do it in 0002 (and not taking care of inactive_timeout).
Done.
4 ===
Track last_inactive_at in pg_replication_slots
doc/src/sgml/system-views.sgml | 11 +++++++++++
src/backend/catalog/system_views.sql | 3 ++-
src/backend/replication/slot.c | 16 ++++++++++++++++
src/backend/replication/slotfuncs.c | 7 ++++++-
src/include/catalog/pg_proc.dat | 6 +++---
src/include/replication/slot.h | 3 +++
src/test/regress/expected/rules.out | 5 +++--
7 files changed, 44 insertions(+), 7 deletions(-)
Worth adding some tests too (or do we postpone them to future commits because we're
confident enough they will follow soon)?
Yes. Added some tests in a new TAP test file named
src/test/recovery/t/043_replslot_misc.pl. This new file can be used to
add miscellaneous replication tests in future as well. I couldn't find
a better place in existing test files - tried having the new tests for
physical slots in t/001_stream_rep.pl and I didn't find a right place
for logical slots.
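The kind of check added there looks roughly like the following (an illustrative snippet only; the slot and node names are just for the example):

my $res = $primary->safe_psql('postgres',
	qq[SELECT last_inactive_time IS NOT NULL FROM pg_replication_slots
	   WHERE slot_name = 'sb1_slot';]);
is($res, 't', 'last_inactive_time is set for an inactive slot');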
5 ===
Most of the fields that reflect a time (not duration) in the system views are
xxxx_time, so I'm wondering if instead of "last_inactive_at" we should use
something like "last_inactive_time"?
Yeah, I can see that. So, I changed it to last_inactive_time.
I agree with treating last_inactive_time as a separate property of the
slot having its own use in addition to helping implement
inactive_timeout based invalidation. I think it can go separately.
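That separate use is essentially monitoring. For example, a query along the following lines (purely illustrative) could flag slots that have been sitting idle for more than a day:

SELECT slot_name, slot_type, last_inactive_time
FROM pg_replication_slots
WHERE last_inactive_time IS NOT NULL
  AND last_inactive_time < now() - interval '1 day';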
I tried to address the review comments received for this patch alone
and attached v15-0001. I'll post other patches soon.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v15-0001-Track-last_inactive_time-in-pg_replication_slots.patchapplication/octet-stream; name=v15-0001-Track-last_inactive_time-in-pg_replication_slots.patchDownload
From 239db578c1c84f264b60190078784a5b4f781c0b Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 22 Mar 2024 21:07:46 +0000
Subject: [PATCH v15] Track last_inactive_time in pg_replication_slots.
Till now, the time at which the replication slot became inactive
is not tracked directly in pg_replication_slots. This commit adds
a new column 'last_inactive_time' for that for persistent slots.
It is set to 0 whenever a slot is made active/acquired and set to
current timestamp whenever the slot is inactive/released.
The new column will be useful on production servers to debug and
analyze inactive replication slots. It will also help to know the
lifetime of a replication slot - one can know how long a streaming
standby, logical subscriber, or replication slot consumer is down.
The new column will be useful to implement inactive timeout based
replication slot invalidation in a future commit.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 11 +++
src/backend/catalog/system_views.sql | 3 +-
src/backend/replication/slot.c | 33 +++++++
src/backend/replication/slotfuncs.c | 7 +-
src/include/catalog/pg_proc.dat | 6 +-
src/include/replication/slot.h | 3 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/043_replslot_misc.pl | 119 +++++++++++++++++++++++
src/test/regress/expected/rules.out | 5 +-
9 files changed, 181 insertions(+), 7 deletions(-)
create mode 100644 src/test/recovery/t/043_replslot_misc.pl
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index b5da476c20..2628aaa4db 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2592,6 +2592,17 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_time</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used or if this is a temporary replication slot.
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f69b7f5580..1bb350cc3c 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1026,7 +1026,8 @@ CREATE VIEW pg_replication_slots AS
L.conflicting,
L.invalidation_reason,
L.failover,
- L.synced
+ L.synced,
+ L.last_inactive_time
FROM pg_get_replication_slots() AS L
LEFT JOIN pg_database D ON (L.datoid = D.oid);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index cdf0c450c5..643acc3f05 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -409,6 +409,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->candidate_restart_valid = InvalidXLogRecPtr;
slot->candidate_restart_lsn = InvalidXLogRecPtr;
slot->last_saved_confirmed_flush = InvalidXLogRecPtr;
+ slot->last_inactive_time = 0;
/*
* Create the slot on disk. We haven't actually marked the slot allocated
@@ -622,6 +623,17 @@ retry:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
+ /*
+ * The slot is active by now, so reset the last inactive time. We don't
+ * track the last inactive time for non-persistent slots.
+ */
+ if (s->data.persistency == RS_PERSISTENT)
+ {
+ SpinLockAcquire(&s->mutex);
+ s->last_inactive_time = 0;
+ SpinLockRelease(&s->mutex);
+ }
+
if (am_walsender)
{
ereport(log_replication_commands ? LOG : DEBUG1,
@@ -681,12 +693,23 @@ ReplicationSlotRelease(void)
if (slot->data.persistency == RS_PERSISTENT)
{
+ TimestampTz now;
+
+ /*
+ * Get current time beforehand to avoid system call while holding the
+ * lock.
+ */
+ now = GetCurrentTimestamp();
+
/*
* Mark persistent slot inactive. We're not freeing it, just
* disconnecting, but wake up others that may be waiting for it.
+ *
+ * We don't track the last inactive time for non-persistent slots.
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
+ slot->last_inactive_time = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
@@ -2342,6 +2365,16 @@ RestoreSlotFromDisk(const char *name)
slot->in_use = true;
slot->active_pid = 0;
+ /*
+ * We set last inactive time after loading the slot from the disk into
+ * memory. Whoever acquires the slot i.e. makes the slot active will
+ * anyway reset it.
+ *
+ * Note that we don't need the slot's persistency check here because
+ * non-persistent slots don't get saved to disk at all.
+ */
+ slot->last_inactive_time = GetCurrentTimestamp();
+
restored = true;
break;
}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 4232c1e52e..d115fa88ce 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 18
+#define PG_GET_REPLICATION_SLOTS_COLS 19
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -436,6 +436,11 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.synced);
+ if (slot_contents.last_inactive_time > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.last_inactive_time);
+ else
+ nulls[i++] = true;
+
Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 71c74350a0..96adf6c5b0 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11133,9 +11133,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,invalidation_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,text,bool,bool,timestamptz}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,invalidation_reason,failover,synced,last_inactive_time}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7f25a083ee..2f18433ecc 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -201,6 +201,9 @@ typedef struct ReplicationSlot
* forcibly flushed or not.
*/
XLogRecPtr last_saved_confirmed_flush;
+
+ /* The time at which this slot become inactive */
+ TimestampTz last_inactive_time;
} ReplicationSlot;
#define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid)
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..c8259f99d5 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/043_replslot_misc.pl',
],
},
}
diff --git a/src/test/recovery/t/043_replslot_misc.pl b/src/test/recovery/t/043_replslot_misc.pl
new file mode 100644
index 0000000000..bdaaa8bce8
--- /dev/null
+++ b/src/test/recovery/t/043_replslot_misc.pl
@@ -0,0 +1,119 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Replication slot related miscellaneous tests
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# =============================================================================
+# Testcase start: Check last_inactive_time property of streaming standby's slot
+#
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby = PostgreSQL::Test::Cluster->new('standby');
+$standby->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $sb_slot = 'sb_slot';
+$standby->append_conf('postgresql.conf', "primary_slot_name = '$sb_slot'");
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := '$sb_slot');
+]);
+
+# Get last_inactive_time value after slot's creation. Note that the slot is still
+# inactive unless it's used by the standby below.
+my $last_inactive_time_1 = $primary->safe_psql('postgres',
+ qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;)
+);
+
+$standby->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby);
+
+# Now the slot is active so last_inactive_time value must be NULL
+is( $primary->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$sb_slot';]
+ ),
+ 't',
+ 'last inactive time for an active physical slot is NULL');
+
+# Stop the standby to check its last_inactive_time value is updated
+$standby->stop;
+
+is( $primary->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time > '$last_inactive_time_1'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;]
+ ),
+ 't',
+ 'last inactive time for an inactive physical slot is updated correctly');
+
+# Testcase end: Check last_inactive_time property of streaming standby's slot
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Check last_inactive_time property of logical subscriber's slot
+my $publisher = $primary;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+
+my $lsub_slot = 'lsub_slot';
+$publisher->safe_psql('postgres',
+ "SELECT pg_create_logical_replication_slot(slot_name := '$lsub_slot', plugin := 'pgoutput');"
+);
+
+# Get last_inactive_time value after slot's creation. Note that the slot is still
+# inactive unless it's used by the subscriber below.
+$last_inactive_time_1 = $primary->safe_psql('postgres',
+ qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$lsub_slot' AND last_inactive_time IS NOT NULL;)
+);
+
+$subscriber->start;
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = '$lsub_slot', create_slot = false)"
+);
+
+# Wait until subscriber has caught up
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+# Now the slot is active so last_inactive_time value must be NULL
+is( $publisher->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub_slot';]
+ ),
+ 't',
+ 'last inactive time for an active logical slot is NULL');
+
+# Stop the subscriber to check its last_inactive_time value is updated
+$subscriber->stop;
+
+is( $publisher->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time > '$last_inactive_time_1'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub_slot' AND last_inactive_time IS NOT NULL;]
+ ),
+ 't',
+ 'last inactive time for an inactive logical slot is updated correctly');
+
+# Testcase end: Check last_inactive_time property of logical subscriber's slot
+# =============================================================================
+
+done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 18829ea586..c2110e984d 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1476,8 +1476,9 @@ pg_replication_slots| SELECT l.slot_name,
l.conflicting,
l.invalidation_reason,
l.failover,
- l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, invalidation_reason, failover, synced)
+ l.synced,
+ l.last_inactive_time
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, invalidation_reason, failover, synced, last_inactive_time)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
On Fri, Mar 22, 2024 at 7:17 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 22, 2024 at 06:02:11PM +0530, Amit Kapila wrote:
On Fri, Mar 22, 2024 at 5:30 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

On Fri, Mar 22, 2024 at 03:56:23PM +0530, Amit Kapila wrote:
That would avoid testing "slot->data.persistency == RS_PERSISTENT" twice.

That sounds like a good idea. Also, don't we need to consider physical
slots where we don't reserve WAL during slot creation? I don't think
there is a need to set inactive_at for such slots.

If the slot is not active, why shouldn't we set inactive_at? I can understand
that such slots do not present "any risks", but I think we should still set
inactive_at (also to not give the false impression that the slot is active).

But OTOH, there is a chance that we will invalidate such slots even
though they have never reserved WAL in the first place, which doesn't
appear to be a good thing.

That's right, but I don't think that is a bad thing. I think we should treat
inactive_at as an independent field (as if the timeout one did not exist at
all) and just focus on its meaning (the slot being inactive). If one sets a
timeout (> 0) and gets an invalidation, then I think it works as designed
(even if the slot does not present any "risk" as it does not hold any rows
or WAL).
Fair point.
--
With Regards,
Amit Kapila.
On Sat, Mar 23, 2024 at 3:02 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Fri, Mar 22, 2024 at 3:15 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Worth adding some tests too (or do we postpone them to future commits because we're
confident enough they will follow soon)?

Yes. Added some tests in a new TAP test file named
src/test/recovery/t/043_replslot_misc.pl. This new file can be used to
add miscellaneous replication tests in future as well. I couldn't find
a better place in existing test files - tried having the new tests for
physical slots in t/001_stream_rep.pl and I didn't find a right place
for logical slots.
How about adding the test in 019_replslot_limit? It is not a direct
fit but I feel later we can even add 'invalid_timeout' related tests
in this file which will use last_inactive_time feature. It is also
possible that some of the tests added by the 'invalid_timeout' feature
will obviate the need for some of these tests.
Review of v15
==============
1.
@@ -1026,7 +1026,8 @@ CREATE VIEW pg_replication_slots AS
L.conflicting,
L.invalidation_reason,
L.failover,
- L.synced
+ L.synced,
+ L.last_inactive_time
FROM pg_get_replication_slots() AS L
As mentioned previously, let's keep these new fields before
conflicting and after two_phase.
2.
+# Get last_inactive_time value after slot's creation. Note that the slot is still
+# inactive unless it's used by the standby below.
+my $last_inactive_time_1 = $primary->safe_psql('postgres',
+ qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;)
+);
We should check that $last_inactive_time_1 is a valid value and add a
similar check for logical slots.
3. BTW, why don't we set last_inactive_time for temporary slots
(RS_TEMPORARY) as well? Don't we even invalidate temporary slots? If
so, then I think we should set last_inactive_time for those as well
and later allow them to be invalidated based on timeout parameter.
--
With Regards,
Amit Kapila.
On Sat, Mar 23, 2024 at 11:27 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
How about adding the test in 019_replslot_limit? It is not a direct
fit but I feel later we can even add 'invalid_timeout' related tests
in this file which will use last_inactive_time feature.
I'm thinking the other way. Now, the new TAP file 043_replslot_misc.pl
can have last_inactive_time tests, and later invalid_timeout ones too.
This way 019_replslot_limit.pl is not cluttered.
It is also
possible that some of the tests added by the 'invalid_timeout' feature
will obviate the need for some of these tests.
Might be. But I prefer to keep both these tests separate, yet in the
same file 043_replslot_misc.pl, because we also cover some corner cases
such as last_inactive_time being set upon loading the slot from disk.
Review of v15
==============
1.
@@ -1026,7 +1026,8 @@ CREATE VIEW pg_replication_slots AS
L.conflicting,
L.invalidation_reason,
L.failover,
- L.synced
+ L.synced,
+ L.last_inactive_time
FROM pg_get_replication_slots() AS L

As mentioned previously, let's keep these new fields before
conflicting and after two_phase.
Sorry, I missed that comment (out of a flood of comments, really :)).
Now done that way.
2.
+# Get last_inactive_time value after slot's creation. Note that the slot is still
+# inactive unless it's used by the standby below.
+my $last_inactive_time_1 = $primary->safe_psql('postgres',
+ qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;)
+);

We should check that $last_inactive_time_1 is a valid value and add a
similar check for logical slots.
That's taken care of by the type cast we do, right? Isn't that enough?
is( $primary->safe_psql(
'postgres',
qq[SELECT last_inactive_time >
'$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE
slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;]
),
't',
'last inactive time for an inactive physical slot is updated correctly');
For instance, setting last_inactive_time_1 to an invalid value fails
with the following error:
error running SQL: 'psql:<stdin>:1: ERROR: invalid input syntax for
type timestamp with time zone: "foo"
LINE 1: SELECT last_inactive_time > 'foo'::timestamptz FROM pg_repli...
3. BTW, why don't we set last_inactive_time for temporary slots
(RS_TEMPORARY) as well? Don't we even invalidate temporary slots? If
so, then I think we should set last_inactive_time for those as well
and later allow them to be invalidated based on timeout parameter.
WFM. Done that way.
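So the release path in v16 stamps the time for temporary slots as well; roughly (mirroring the hunk in the attached patch):

    /* Read the clock before taking the spinlock. */
    now = GetCurrentTimestamp();

    if (slot->data.persistency == RS_PERSISTENT)
    {
        /* Mark persistent slot inactive and remember when. */
        SpinLockAcquire(&slot->mutex);
        slot->active_pid = 0;
        slot->last_inactive_time = now;
        SpinLockRelease(&slot->mutex);
        ConditionVariableBroadcast(&slot->active_cv);
    }
    else
    {
        /* Temporary slots now get the timestamp too. */
        SpinLockAcquire(&slot->mutex);
        slot->last_inactive_time = now;
        SpinLockRelease(&slot->mutex);
    }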
Please see the attached v16 patch.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v16-0001-Track-last_inactive_time-in-pg_replication_slots.patch
From ce85a48bbbd9de5d6ca0ce849993707cc01d1211 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 23 Mar 2024 07:27:48 +0000
Subject: [PATCH v16] Track last_inactive_time in pg_replication_slots.
Till now, the time at which the replication slot became inactive
is not tracked directly in pg_replication_slots. This commit adds
a new column 'last_inactive_time' for this. It is set to 0 whenever
a slot is made active/acquired and set to current timestamp
whenever the slot is inactive/released or restored from the disk.
The new column will be useful on production servers to debug and
analyze inactive replication slots. It will also help to know the
lifetime of a replication slot - one can know how long a streaming
standby, logical subscriber, or replication slot consumer is down.
The new column will be useful to implement inactive timeout based
replication slot invalidation in a future commit.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 11 ++
src/backend/catalog/system_views.sql | 1 +
src/backend/replication/slot.c | 27 +++++
src/backend/replication/slotfuncs.c | 7 +-
src/include/catalog/pg_proc.dat | 6 +-
src/include/replication/slot.h | 3 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/043_replslot_misc.pl | 127 +++++++++++++++++++++++
src/test/regress/expected/rules.out | 3 +-
9 files changed, 181 insertions(+), 5 deletions(-)
create mode 100644 src/test/recovery/t/043_replslot_misc.pl
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index b5da476c20..2b36b5fef1 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2523,6 +2523,17 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_time</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>conflicting</structfield> <type>bool</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f69b7f5580..bc70ff193e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,6 +1023,7 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
+ L.last_inactive_time,
L.conflicting,
L.invalidation_reason,
L.failover,
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index cdf0c450c5..0f48d6dc7c 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -409,6 +409,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->candidate_restart_valid = InvalidXLogRecPtr;
slot->candidate_restart_lsn = InvalidXLogRecPtr;
slot->last_saved_confirmed_flush = InvalidXLogRecPtr;
+ slot->last_inactive_time = 0;
/*
* Create the slot on disk. We haven't actually marked the slot allocated
@@ -622,6 +623,11 @@ retry:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
+ /* The slot is active by now, so reset the last inactive time. */
+ SpinLockAcquire(&s->mutex);
+ s->last_inactive_time = 0;
+ SpinLockRelease(&s->mutex);
+
if (am_walsender)
{
ereport(log_replication_commands ? LOG : DEBUG1,
@@ -645,6 +651,7 @@ ReplicationSlotRelease(void)
ReplicationSlot *slot = MyReplicationSlot;
char *slotname = NULL; /* keep compiler quiet */
bool is_logical = false; /* keep compiler quiet */
+ TimestampTz now;
Assert(slot != NULL && slot->active_pid != 0);
@@ -679,6 +686,12 @@ ReplicationSlotRelease(void)
ReplicationSlotsComputeRequiredXmin(false);
}
+ /*
+ * Set the last inactive time after marking slot inactive. We get current
+ * time beforehand to avoid system call while holding the lock.
+ */
+ now = GetCurrentTimestamp();
+
if (slot->data.persistency == RS_PERSISTENT)
{
/*
@@ -687,9 +700,16 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
+ slot->last_inactive_time = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
+ else
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->last_inactive_time = now;
+ SpinLockRelease(&slot->mutex);
+ }
MyReplicationSlot = NULL;
@@ -2342,6 +2362,13 @@ RestoreSlotFromDisk(const char *name)
slot->in_use = true;
slot->active_pid = 0;
+ /*
+ * We set last inactive time after loading the slot from the disk into
+ * memory. Whoever acquires the slot i.e. makes the slot active will
+ * anyway reset it.
+ */
+ slot->last_inactive_time = GetCurrentTimestamp();
+
restored = true;
break;
}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 4232c1e52e..24f5e6d90a 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 18
+#define PG_GET_REPLICATION_SLOTS_COLS 19
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -410,6 +410,11 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
+ if (slot_contents.last_inactive_time > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.last_inactive_time);
+ else
+ nulls[i++] = true;
+
cause = slot_contents.data.invalidated;
if (SlotIsPhysical(&slot_contents))
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 71c74350a0..0d26e5b422 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11133,9 +11133,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,invalidation_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,timestamptz,bool,text,bool,bool}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,last_inactive_time,conflicting,invalidation_reason,failover,synced}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7f25a083ee..2f18433ecc 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -201,6 +201,9 @@ typedef struct ReplicationSlot
* forcibly flushed or not.
*/
XLogRecPtr last_saved_confirmed_flush;
+
+ /* The time at which this slot become inactive */
+ TimestampTz last_inactive_time;
} ReplicationSlot;
#define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid)
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..c8259f99d5 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/043_replslot_misc.pl',
],
},
}
diff --git a/src/test/recovery/t/043_replslot_misc.pl b/src/test/recovery/t/043_replslot_misc.pl
new file mode 100644
index 0000000000..86e58691bf
--- /dev/null
+++ b/src/test/recovery/t/043_replslot_misc.pl
@@ -0,0 +1,127 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Replication slot related miscellaneous tests
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# =============================================================================
+# Testcase start: Check last_inactive_time property of streaming standby's slot
+#
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby = PostgreSQL::Test::Cluster->new('standby');
+$standby->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $sb_slot = 'sb_slot';
+$standby->append_conf('postgresql.conf', "primary_slot_name = '$sb_slot'");
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := '$sb_slot');
+]);
+
+# Get last_inactive_time value after slot's creation. Note that the slot is still
+# inactive unless it's used by the standby below.
+my $last_inactive_time = $primary->safe_psql('postgres',
+ qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;)
+);
+
+$standby->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby);
+
+# Now the slot is active so last_inactive_time value must be NULL
+is( $primary->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$sb_slot';]
+ ),
+ 't',
+ 'last inactive time for an active physical slot is NULL');
+
+# Stop the standby to check its last_inactive_time value is updated
+$standby->stop;
+
+# Let's also restart the primary so that the last_inactive_time is set upon
+# loading the slot from disk.
+$primary->restart;
+
+is( $primary->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;]
+ ),
+ 't',
+ 'last inactive time for an inactive physical slot is updated correctly');
+
+# Testcase end: Check last_inactive_time property of streaming standby's slot
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Check last_inactive_time property of logical subscriber's slot
+my $publisher = $primary;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+
+my $lsub_slot = 'lsub_slot';
+$publisher->safe_psql('postgres',
+ "SELECT pg_create_logical_replication_slot(slot_name := '$lsub_slot', plugin := 'pgoutput');"
+);
+
+# Get last_inactive_time value after slot's creation. Note that the slot is still
+# inactive unless it's used by the subscriber below.
+$last_inactive_time = $primary->safe_psql('postgres',
+ qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$lsub_slot' AND last_inactive_time IS NOT NULL;)
+);
+
+$subscriber->start;
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = '$lsub_slot', create_slot = false)"
+);
+
+# Wait until subscriber has caught up
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+# Now the slot is active so last_inactive_time value must be NULL
+is( $publisher->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub_slot';]
+ ),
+ 't',
+ 'last inactive time for an active logical slot is NULL');
+
+# Stop the subscriber to check its last_inactive_time value is updated
+$subscriber->stop;
+
+# Let's also restart the publisher so that the last_inactive_time is set upon
+# loading the slot from disk.
+$publisher->restart;
+
+is( $publisher->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub_slot' AND last_inactive_time IS NOT NULL;]
+ ),
+ 't',
+ 'last inactive time for an inactive logical slot is updated correctly');
+
+# Testcase end: Check last_inactive_time property of logical subscriber's slot
+# =============================================================================
+
+done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 18829ea586..dfcbaec387 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,11 +1473,12 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
+ l.last_inactive_time,
l.conflicting,
l.invalidation_reason,
l.failover,
l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, invalidation_reason, failover, synced)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, last_inactive_time, conflicting, invalidation_reason, failover, synced)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
Hi,
On Sat, Mar 23, 2024 at 01:11:50PM +0530, Bharath Rupireddy wrote:
On Sat, Mar 23, 2024 at 11:27 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
How about adding the test in 019_replslot_limit? It is not a direct
fit but I feel later we can even add 'invalid_timeout' related tests
in this file which will use last_inactive_time feature.I'm thinking the other way. Now, the new TAP file 043_replslot_misc.pl
can have last_inactive_time tests, and later invalid_timeout ones too.
This way 019_replslot_limit.pl is not cluttered.
I share the same opinion as Amit: I think 019_replslot_limit would be a better
place, because I see the timeout as another kind of limit.
It is also
possible that some of the tests added by the 'invalid_timeout' feature
will obviate the need for some of these tests.

Might be. But I prefer to keep both these tests separate, yet in the
same file 043_replslot_misc.pl, because we also cover some corner cases
such as last_inactive_time being set upon loading the slot from disk.
Right, but I think that this test does not necessarily have to be in the same .pl
as the one testing the timeout. It could be added to one of the existing .pl files,
like 001_stream_rep.pl for example.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Sat, Mar 23, 2024 at 2:34 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
How about adding the test in 019_replslot_limit? It is not a direct
fit but I feel later we can even add 'invalid_timeout' related tests
in this file which will use last_inactive_time feature.

I'm thinking the other way. Now, the new TAP file 043_replslot_misc.pl
can have last_inactive_time tests, and later invalid_timeout ones too.
This way 019_replslot_limit.pl is not cluttered.

I share the same opinion as Amit: I think 019_replslot_limit would be a better
place, because I see the timeout as another kind of limit.
Hm. Done that way.
Please see the attached v17 patch.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v17-0001-Track-last_inactive_time-in-pg_replication_slots.patch
From a1210ae2dd86afdfdfea9b95861ffed9c7ff2d3a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 23 Mar 2024 23:03:55 +0000
Subject: [PATCH v17] Track last_inactive_time in pg_replication_slots.
Till now, the time at which the replication slot became inactive
is not tracked directly in pg_replication_slots. This commit adds
a new column 'last_inactive_time' for this. It is set to 0 whenever
a slot is made active/acquired and set to current timestamp
whenever the slot is inactive/released or restored from the disk.
The new column will be useful on production servers to debug and
analyze inactive replication slots. It will also help to know the
lifetime of a replication slot - one can know how long a streaming
standby, logical subscriber, or replication slot consumer is down.
The new column will be useful to implement inactive timeout based
replication slot invalidation in a future commit.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 11 ++
src/backend/catalog/system_views.sql | 1 +
src/backend/replication/slot.c | 27 +++++
src/backend/replication/slotfuncs.c | 7 +-
src/include/catalog/pg_proc.dat | 6 +-
src/include/replication/slot.h | 3 +
src/test/recovery/t/019_replslot_limit.pl | 122 ++++++++++++++++++++++
src/test/regress/expected/rules.out | 3 +-
8 files changed, 175 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index b5da476c20..2b36b5fef1 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2523,6 +2523,17 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_time</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>conflicting</structfield> <type>bool</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f69b7f5580..bc70ff193e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,6 +1023,7 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
+ L.last_inactive_time,
L.conflicting,
L.invalidation_reason,
L.failover,
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index cdf0c450c5..0f48d6dc7c 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -409,6 +409,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->candidate_restart_valid = InvalidXLogRecPtr;
slot->candidate_restart_lsn = InvalidXLogRecPtr;
slot->last_saved_confirmed_flush = InvalidXLogRecPtr;
+ slot->last_inactive_time = 0;
/*
* Create the slot on disk. We haven't actually marked the slot allocated
@@ -622,6 +623,11 @@ retry:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
+ /* The slot is active by now, so reset the last inactive time */
+ SpinLockAcquire(&s->mutex);
+ s->last_inactive_time = 0;
+ SpinLockRelease(&s->mutex);
+
if (am_walsender)
{
ereport(log_replication_commands ? LOG : DEBUG1,
@@ -645,6 +651,7 @@ ReplicationSlotRelease(void)
ReplicationSlot *slot = MyReplicationSlot;
char *slotname = NULL; /* keep compiler quiet */
bool is_logical = false; /* keep compiler quiet */
+ TimestampTz now;
Assert(slot != NULL && slot->active_pid != 0);
@@ -679,6 +686,12 @@ ReplicationSlotRelease(void)
ReplicationSlotsComputeRequiredXmin(false);
}
+ /*
+ * Set the last inactive time after marking slot inactive. We get current
+ * time beforehand to avoid system call while holding the lock.
+ */
+ now = GetCurrentTimestamp();
+
if (slot->data.persistency == RS_PERSISTENT)
{
/*
@@ -687,9 +700,16 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
+ slot->last_inactive_time = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
+ else
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->last_inactive_time = now;
+ SpinLockRelease(&slot->mutex);
+ }
MyReplicationSlot = NULL;
@@ -2342,6 +2362,13 @@ RestoreSlotFromDisk(const char *name)
slot->in_use = true;
slot->active_pid = 0;
+ /*
+ * We set last inactive time after loading the slot from the disk into
+ * memory. Whoever acquires the slot i.e. makes the slot active will
+ * anyway reset it.
+ */
+ slot->last_inactive_time = GetCurrentTimestamp();
+
restored = true;
break;
}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 4232c1e52e..24f5e6d90a 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 18
+#define PG_GET_REPLICATION_SLOTS_COLS 19
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -410,6 +410,11 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
+ if (slot_contents.last_inactive_time > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.last_inactive_time);
+ else
+ nulls[i++] = true;
+
cause = slot_contents.data.invalidated;
if (SlotIsPhysical(&slot_contents))
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 71c74350a0..0d26e5b422 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11133,9 +11133,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,invalidation_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,timestamptz,bool,text,bool,bool}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,last_inactive_time,conflicting,invalidation_reason,failover,synced}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7f25a083ee..2f18433ecc 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -201,6 +201,9 @@ typedef struct ReplicationSlot
* forcibly flushed or not.
*/
XLogRecPtr last_saved_confirmed_flush;
+
+ /* The time at which this slot become inactive */
+ TimestampTz last_inactive_time;
} ReplicationSlot;
#define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid)
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index fe00370c3e..a14b6283ee 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -410,4 +410,126 @@ kill 'CONT', $receiverpid;
$node_primary3->stop;
$node_standby3->stop;
+# =============================================================================
+# Testcase start: Check last_inactive_time property of streaming standby's slot
+#
+
+# Initialize primary node
+my $primary4 = PostgreSQL::Test::Cluster->new('primary4');
+$primary4->init(allows_streaming => 'logical');
+$primary4->start;
+
+# Take backup
+$backup_name = 'my_backup4';
+$primary4->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby4 = PostgreSQL::Test::Cluster->new('standby4');
+$standby4->init_from_backup($primary4, $backup_name, has_streaming => 1);
+
+my $sb4_slot = 'sb4_slot';
+$standby4->append_conf('postgresql.conf', "primary_slot_name = '$sb4_slot'");
+
+$primary4->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := '$sb4_slot');
+]);
+
+# Get last_inactive_time value after slot's creation. Note that the slot is still
+# inactive unless it's used by the standby below.
+my $last_inactive_time = $primary4->safe_psql('postgres',
+ qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND last_inactive_time IS NOT NULL;)
+);
+
+$standby4->start;
+
+# Wait until standby has replayed enough data
+$primary4->wait_for_catchup($standby4);
+
+# Now the slot is active so last_inactive_time value must be NULL
+is( $primary4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$sb4_slot';]
+ ),
+ 't',
+ 'last inactive time for an active physical slot is NULL');
+
+# Stop the standby to check its last_inactive_time value is updated
+$standby4->stop;
+
+# Let's also restart the primary so that the last_inactive_time is set upon
+# loading the slot from disk.
+$primary4->restart;
+
+is( $primary4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND last_inactive_time IS NOT NULL;]
+ ),
+ 't',
+ 'last inactive time for an inactive physical slot is updated correctly');
+
+$standby4->stop;
+
+# Testcase end: Check last_inactive_time property of streaming standby's slot
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Check last_inactive_time property of logical subscriber's slot
+my $publisher4 = $primary4;
+
+# Create subscriber node
+my $subscriber4 = PostgreSQL::Test::Cluster->new('subscriber4');
+$subscriber4->init;
+
+# Setup logical replication
+my $publisher4_connstr = $publisher4->connstr . ' dbname=postgres';
+$publisher4->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+
+my $lsub4_slot = 'lsub4_slot';
+$publisher4->safe_psql('postgres',
+ "SELECT pg_create_logical_replication_slot(slot_name := '$lsub4_slot', plugin := 'pgoutput');"
+);
+
+# Get last_inactive_time value after slot's creation. Note that the slot is still
+# inactive unless it's used by the subscriber below.
+$last_inactive_time = $publisher4->safe_psql('postgres',
+ qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND last_inactive_time IS NOT NULL;)
+);
+
+$subscriber4->start;
+$subscriber4->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher4_connstr' PUBLICATION pub WITH (slot_name = '$lsub4_slot', create_slot = false)"
+);
+
+# Wait until subscriber has caught up
+$subscriber4->wait_for_subscription_sync($publisher4, 'sub');
+
+# Now the slot is active so last_inactive_time value must be NULL
+is( $publisher4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub4_slot';]
+ ),
+ 't',
+ 'last inactive time for an active logical slot is NULL');
+
+# Stop the subscriber to check its last_inactive_time value is updated
+$subscriber4->stop;
+
+# Let's also restart the publisher so that the last_inactive_time is set upon
+# loading the slot from disk.
+$publisher4->restart;
+
+is( $publisher4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND last_inactive_time IS NOT NULL;]
+ ),
+ 't',
+ 'last inactive time for an inactive logical slot is updated correctly');
+
+# Testcase end: Check last_inactive_time property of logical subscriber's slot
+# =============================================================================
+
+$publisher4->stop;
+$subscriber4->stop;
+
done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 18829ea586..dfcbaec387 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,11 +1473,12 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
+ l.last_inactive_time,
l.conflicting,
l.invalidation_reason,
l.failover,
l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, invalidation_reason, failover, synced)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, last_inactive_time, conflicting, invalidation_reason, failover, synced)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
On Sat, Mar 23, 2024 at 1:12 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Sat, Mar 23, 2024 at 11:27 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
2.
+# Get last_inactive_time value after slot's creation. Note that the slot is still
+# inactive unless it's used by the standby below.
+my $last_inactive_time_1 = $primary->safe_psql('postgres',
+ qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;)
+);

We should check that $last_inactive_time_1 is a valid value and add a
similar check for logical slots.

That's taken care of by the type cast we do, right? Isn't that enough?
is( $primary->safe_psql(
'postgres',
qq[SELECT last_inactive_time >
'$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE
slot_name = '$sb_slot' AND last_inactive_time IS NOT NULL;]
),
't',
'last inactive time for an inactive physical slot is updated correctly');

For instance, setting last_inactive_time_1 to an invalid value fails
with the following error:

error running SQL: 'psql:<stdin>:1: ERROR: invalid input syntax for
type timestamp with time zone: "foo"
LINE 1: SELECT last_inactive_time > 'foo'::timestamptz FROM pg_repli...
It would only be found at a later point. It would probably be better to
verify it immediately after the test that fetches the last_inactive_time
value.
--
With Regards,
Amit Kapila.
On Sun, Mar 24, 2024 at 10:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
For instance, setting last_inactive_time_1 to an invalid value fails
with the following error:

error running SQL: 'psql:<stdin>:1: ERROR: invalid input syntax for
type timestamp with time zone: "foo"
LINE 1: SELECT last_inactive_time > 'foo'::timestamptz FROM pg_repli...

It would only be found at a later point. It would probably be better to
verify it immediately after the test that fetches the last_inactive_time
value.
Agree. I've added a few more explicit checks to verify that
last_inactive_time is sane, using the following:
qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0)
AND '$last_inactive_time'::timestamptz >
'$slot_creation_time'::timestamptz;]
I've attached the v18 patch set here. It also addresses the earlier
review comments from Amit and Ajin Cherian. Note that I've added the new
invalidation mechanism tests in a separate TAP test file, just because
I don't want to clutter or bloat any of the existing files, or spread
the tests for physical and logical slots across separate existing TAP
files.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v18-0001-Track-last_inactive_time-in-pg_replication_slots.patch
From 79c3967c0dc25ec2741f7fe979b2b97939e2eeb1 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sun, 24 Mar 2024 07:12:55 +0000
Subject: [PATCH v18 1/5] Track last_inactive_time in pg_replication_slots.
Till now, the time at which the replication slot became inactive
is not tracked directly in pg_replication_slots. This commit adds
a new property called last_inactive_time for this. It is set to 0
whenever a slot is made active/acquired and set to current
timestamp whenever the slot is inactive/released or restored from
the disk.
The new property will be useful on production servers to debug and
analyze inactive replication slots. It will also help to know the
lifetime of a replication slot - one can know how long a streaming
standby, logical subscriber, or replication slot consumer is down.
The new property will be useful to implement inactive timeout based
replication slot invalidation in a future commit.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 11 ++
src/backend/catalog/system_views.sql | 1 +
src/backend/replication/slot.c | 27 ++++
src/backend/replication/slotfuncs.c | 7 +-
src/include/catalog/pg_proc.dat | 6 +-
src/include/replication/slot.h | 3 +
src/test/recovery/t/019_replslot_limit.pl | 148 ++++++++++++++++++++++
src/test/regress/expected/rules.out | 3 +-
8 files changed, 201 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index b5da476c20..2b36b5fef1 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2523,6 +2523,17 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_time</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently actively being
+ used.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>conflicting</structfield> <type>bool</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f69b7f5580..bc70ff193e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,6 +1023,7 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
+ L.last_inactive_time,
L.conflicting,
L.invalidation_reason,
L.failover,
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index cdf0c450c5..0f48d6dc7c 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -409,6 +409,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->candidate_restart_valid = InvalidXLogRecPtr;
slot->candidate_restart_lsn = InvalidXLogRecPtr;
slot->last_saved_confirmed_flush = InvalidXLogRecPtr;
+ slot->last_inactive_time = 0;
/*
* Create the slot on disk. We haven't actually marked the slot allocated
@@ -622,6 +623,11 @@ retry:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
+ /* The slot is active by now, so reset the last inactive time. */
+ SpinLockAcquire(&s->mutex);
+ s->last_inactive_time = 0;
+ SpinLockRelease(&s->mutex);
+
if (am_walsender)
{
ereport(log_replication_commands ? LOG : DEBUG1,
@@ -645,6 +651,7 @@ ReplicationSlotRelease(void)
ReplicationSlot *slot = MyReplicationSlot;
char *slotname = NULL; /* keep compiler quiet */
bool is_logical = false; /* keep compiler quiet */
+ TimestampTz now;
Assert(slot != NULL && slot->active_pid != 0);
@@ -679,6 +686,12 @@ ReplicationSlotRelease(void)
ReplicationSlotsComputeRequiredXmin(false);
}
+ /*
+ * Set the last inactive time after marking slot inactive. We get current
+ * time beforehand to avoid system call while holding the lock.
+ */
+ now = GetCurrentTimestamp();
+
if (slot->data.persistency == RS_PERSISTENT)
{
/*
@@ -687,9 +700,16 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
+ slot->last_inactive_time = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
+ else
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->last_inactive_time = now;
+ SpinLockRelease(&slot->mutex);
+ }
MyReplicationSlot = NULL;
@@ -2342,6 +2362,13 @@ RestoreSlotFromDisk(const char *name)
slot->in_use = true;
slot->active_pid = 0;
+ /*
+ * We set last inactive time after loading the slot from the disk into
+ * memory. Whoever acquires the slot i.e. makes the slot active will
+ * anyway reset it.
+ */
+ slot->last_inactive_time = GetCurrentTimestamp();
+
restored = true;
break;
}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 4232c1e52e..24f5e6d90a 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 18
+#define PG_GET_REPLICATION_SLOTS_COLS 19
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -410,6 +410,11 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
+ if (slot_contents.last_inactive_time > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.last_inactive_time);
+ else
+ nulls[i++] = true;
+
cause = slot_contents.data.invalidated;
if (SlotIsPhysical(&slot_contents))
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 71c74350a0..0d26e5b422 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11133,9 +11133,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,invalidation_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,timestamptz,bool,text,bool,bool}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,last_inactive_time,conflicting,invalidation_reason,failover,synced}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7f25a083ee..2f18433ecc 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -201,6 +201,9 @@ typedef struct ReplicationSlot
* forcibly flushed or not.
*/
XLogRecPtr last_saved_confirmed_flush;
+
+ /* The time at which this slot became inactive */
+ TimestampTz last_inactive_time;
} ReplicationSlot;
#define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid)
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index fe00370c3e..bff84cc9c4 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -410,4 +410,152 @@ kill 'CONT', $receiverpid;
$node_primary3->stop;
$node_standby3->stop;
+# =============================================================================
+# Testcase start: Check last_inactive_time property of streaming standby's slot
+#
+
+# Initialize primary node
+my $primary4 = PostgreSQL::Test::Cluster->new('primary4');
+$primary4->init(allows_streaming => 'logical');
+$primary4->start;
+
+# Take backup
+$backup_name = 'my_backup4';
+$primary4->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby4 = PostgreSQL::Test::Cluster->new('standby4');
+$standby4->init_from_backup($primary4, $backup_name, has_streaming => 1);
+
+my $sb4_slot = 'sb4_slot';
+$standby4->append_conf('postgresql.conf', "primary_slot_name = '$sb4_slot'");
+
+my $slot_creation_time = $primary4->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
+$primary4->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := '$sb4_slot');
+]);
+
+# Get last_inactive_time value after slot's creation. Note that the slot is still
+# inactive until it's used by the standby below.
+my $last_inactive_time = $primary4->safe_psql('postgres',
+ qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND last_inactive_time IS NOT NULL;)
+);
+
+# Check that the captured time is sane
+is( $primary4->safe_psql(
+ 'postgres',
+ qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0) AND '$last_inactive_time'::timestamptz > '$slot_creation_time'::timestamptz;]
+ ),
+ 't',
+ 'last inactive time for an inactive physical slot is sane');
+
+$standby4->start;
+
+# Wait until standby has replayed enough data
+$primary4->wait_for_catchup($standby4);
+
+# Now the slot is active so last_inactive_time value must be NULL
+is( $primary4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$sb4_slot';]
+ ),
+ 't',
+ 'last inactive time for an active physical slot is NULL');
+
+# Stop the standby to check its last_inactive_time value is updated
+$standby4->stop;
+
+# Let's also restart the primary so that the last_inactive_time is set upon
+# loading the slot from disk.
+$primary4->restart;
+
+is( $primary4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND last_inactive_time IS NOT NULL;]
+ ),
+ 't',
+ 'last inactive time for an inactive physical slot is updated correctly');
+
+$standby4->stop;
+
+# Testcase end: Check last_inactive_time property of streaming standby's slot
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Check last_inactive_time property of logical subscriber's slot
+my $publisher4 = $primary4;
+
+# Create subscriber node
+my $subscriber4 = PostgreSQL::Test::Cluster->new('subscriber4');
+$subscriber4->init;
+
+# Setup logical replication
+my $publisher4_connstr = $publisher4->connstr . ' dbname=postgres';
+$publisher4->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+
+$slot_creation_time = $publisher4->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
+my $lsub4_slot = 'lsub4_slot';
+$publisher4->safe_psql('postgres',
+ "SELECT pg_create_logical_replication_slot(slot_name := '$lsub4_slot', plugin := 'pgoutput');"
+);
+
+# Get last_inactive_time value after slot's creation. Note that the slot is still
+# inactive until it's used by the subscriber below.
+$last_inactive_time = $publisher4->safe_psql('postgres',
+ qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND last_inactive_time IS NOT NULL;)
+);
+
+# Check that the captured time is sane
+is( $publisher4->safe_psql(
+ 'postgres',
+ qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0) AND '$last_inactive_time'::timestamptz > '$slot_creation_time'::timestamptz;]
+ ),
+ 't',
+ 'last inactive time for an inactive logical slot is sane');
+
+$subscriber4->start;
+$subscriber4->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher4_connstr' PUBLICATION pub WITH (slot_name = '$lsub4_slot', create_slot = false)"
+);
+
+# Wait until subscriber has caught up
+$subscriber4->wait_for_subscription_sync($publisher4, 'sub');
+
+# Now the slot is active so last_inactive_time value must be NULL
+is( $publisher4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub4_slot';]
+ ),
+ 't',
+ 'last inactive time for an active logical slot is NULL');
+
+# Stop the subscriber to check its last_inactive_time value is updated
+$subscriber4->stop;
+
+# Let's also restart the publisher so that the last_inactive_time is set upon
+# loading the slot from disk.
+$publisher4->restart;
+
+is( $publisher4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND last_inactive_time IS NOT NULL;]
+ ),
+ 't',
+ 'last inactive time for an inactive logical slot is updated correctly');
+
+# Testcase end: Check last_inactive_time property of logical subscriber's slot
+# =============================================================================
+
+$publisher4->stop;
+$subscriber4->stop;
+
done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 18829ea586..dfcbaec387 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,11 +1473,12 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
+ l.last_inactive_time,
l.conflicting,
l.invalidation_reason,
l.failover,
l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, invalidation_reason, failover, synced)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, last_inactive_time, conflicting, invalidation_reason, failover, synced)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
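For quick reference, the new column added by the patch above can be inspected directly from the view; a minimal sketch:

  SELECT slot_name, active, last_inactive_time FROM pg_replication_slots;

Per the documentation change in this patch, the column is NULL while the slot is in use and reports the time the slot was last released otherwise.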
v18-0002-Allow-setting-inactive_timeout-for-replication-s.patch
From 0d80cfad9658c7303d036833636b22293fd25109 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sun, 24 Mar 2024 07:14:51 +0000
Subject: [PATCH v18 2/5] Allow setting inactive_timeout for replication slots
via SQL API.
This commit adds a new replication slot property called
inactive_timeout, specifying the amount of time in seconds the slot
is allowed to be inactive. It is added to the slot's persistent data
structure so that it survives server restarts. It is synced to
failover slots on the standby, and is also carried over to the new
cluster as part of pg_upgrade.
In particular, this commit lets one specify inactive_timeout for a
slot via the SQL functions pg_create_physical_replication_slot and
pg_create_logical_replication_slot.
The new property will be used to implement inactive timeout based
replication slot invalidation in a future commit.
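A minimal usage sketch, mirroring the calls exercised in the regression tests below (slot names are arbitrary examples and the logical example assumes the test_decoding module is available):

  SELECT 'init' FROM pg_create_physical_replication_slot(
      slot_name := 'phys_slot', immediately_reserve := true, inactive_timeout := 300);
  SELECT 'init' FROM pg_create_logical_replication_slot(
      slot_name := 'log_slot', plugin := 'test_decoding', inactive_timeout := 600);
  SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots;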
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
contrib/test_decoding/expected/slot.out | 102 ++++++++++++++++++
contrib/test_decoding/sql/slot.sql | 34 ++++++
doc/src/sgml/func.sgml | 18 ++--
doc/src/sgml/system-views.sgml | 9 ++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 1 +
src/backend/replication/logical/slotsync.c | 17 ++-
src/backend/replication/slot.c | 20 +++-
src/backend/replication/slotfuncs.c | 31 +++++-
src/backend/replication/walsender.c | 4 +-
src/bin/pg_upgrade/info.c | 6 +-
src/bin/pg_upgrade/pg_upgrade.c | 5 +-
src/bin/pg_upgrade/pg_upgrade.h | 2 +
src/bin/pg_upgrade/t/003_logical_slots.pl | 11 +-
src/include/catalog/pg_proc.dat | 22 ++--
src/include/replication/slot.h | 5 +-
.../t/040_standby_failover_slots_sync.pl | 13 ++-
src/test/regress/expected/rules.out | 3 +-
18 files changed, 264 insertions(+), 41 deletions(-)
diff --git a/contrib/test_decoding/expected/slot.out b/contrib/test_decoding/expected/slot.out
index 349ab2d380..6771520afb 100644
--- a/contrib/test_decoding/expected/slot.out
+++ b/contrib/test_decoding/expected/slot.out
@@ -466,3 +466,105 @@ SELECT pg_drop_replication_slot('physical_slot');
(1 row)
+-- Test negative value for inactive_timeout option for slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', inactive_timeout := -300); -- error
+ERROR: "inactive_timeout" must not be negative
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', inactive_timeout := -600); -- error
+ERROR: "inactive_timeout" must not be negative
+-- Test inactive_timeout option for temporary slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', temporary := true, inactive_timeout := 300); -- error
+ERROR: cannot set inactive_timeout for a temporary replication slot
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', temporary := true, inactive_timeout := 600); -- error
+ERROR: cannot set inactive_timeout for a temporary replication slot
+-- Test inactive_timeout option of physical slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot1', immediately_reserve := true, inactive_timeout := 300);
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot2');
+ ?column?
+----------
+ init
+(1 row)
+
+-- Copy physical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_slot1', dst_slot_name := 'it_phy_slot3');
+ ?column?
+----------
+ copy
+(1 row)
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+ slot_name | slot_type | inactive_timeout
+--------------+-----------+------------------
+ it_phy_slot1 | physical | 300
+ it_phy_slot2 | physical | 0
+ it_phy_slot3 | physical | 300
+(3 rows)
+
+SELECT pg_drop_replication_slot('it_phy_slot1');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_phy_slot2');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_phy_slot3');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+-- Test inactive_timeout option of logical slots.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot2', plugin := 'test_decoding');
+ ?column?
+----------
+ init
+(1 row)
+
+-- Copy logical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slot1', dst_slot_name := 'it_log_slot3');
+ ?column?
+----------
+ copy
+(1 row)
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+ slot_name | slot_type | inactive_timeout
+--------------+-----------+------------------
+ it_log_slot1 | logical | 600
+ it_log_slot2 | logical | 0
+ it_log_slot3 | logical | 600
+(3 rows)
+
+SELECT pg_drop_replication_slot('it_log_slot1');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_log_slot2');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_log_slot3');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
diff --git a/contrib/test_decoding/sql/slot.sql b/contrib/test_decoding/sql/slot.sql
index 580e3ae3be..443e91da07 100644
--- a/contrib/test_decoding/sql/slot.sql
+++ b/contrib/test_decoding/sql/slot.sql
@@ -190,3 +190,37 @@ SELECT pg_drop_replication_slot('failover_true_slot');
SELECT pg_drop_replication_slot('failover_false_slot');
SELECT pg_drop_replication_slot('failover_default_slot');
SELECT pg_drop_replication_slot('physical_slot');
+
+-- Test negative value for inactive_timeout option for slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', inactive_timeout := -300); -- error
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', inactive_timeout := -600); -- error
+
+-- Test inactive_timeout option for temporary slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', temporary := true, inactive_timeout := 300); -- error
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', temporary := true, inactive_timeout := 600); -- error
+
+-- Test inactive_timeout option of physical slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot1', immediately_reserve := true, inactive_timeout := 300);
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot2');
+
+-- Copy physical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_slot1', dst_slot_name := 'it_phy_slot3');
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+
+SELECT pg_drop_replication_slot('it_phy_slot1');
+SELECT pg_drop_replication_slot('it_phy_slot2');
+SELECT pg_drop_replication_slot('it_phy_slot3');
+
+-- Test inactive_timeout option of logical slots.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot2', plugin := 'test_decoding');
+
+-- Copy logical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slot1', dst_slot_name := 'it_log_slot3');
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+
+SELECT pg_drop_replication_slot('it_log_slot1');
+SELECT pg_drop_replication_slot('it_log_slot2');
+SELECT pg_drop_replication_slot('it_log_slot3');
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 8ecc02f2b9..afaafa35ad 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28373,7 +28373,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<indexterm>
<primary>pg_create_physical_replication_slot</primary>
</indexterm>
- <function>pg_create_physical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type> <optional>, <parameter>immediately_reserve</parameter> <type>boolean</type>, <parameter>temporary</parameter> <type>boolean</type> </optional> )
+ <function>pg_create_physical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type> <optional>, <parameter>immediately_reserve</parameter> <type>boolean</type>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>inactive_timeout</parameter> <type>integer</type> </optional>)
<returnvalue>record</returnvalue>
( <parameter>slot_name</parameter> <type>name</type>,
<parameter>lsn</parameter> <type>pg_lsn</type> )
@@ -28390,9 +28390,12 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
parameter, <parameter>temporary</parameter>, when set to true, specifies that
the slot should not be permanently stored to disk and is only meant
for use by the current session. Temporary slots are also
- released upon any error. This function corresponds
- to the replication protocol command <literal>CREATE_REPLICATION_SLOT
- ... PHYSICAL</literal>.
+ released upon any error. The optional fourth
+ parameter, <parameter>inactive_timeout</parameter>, when set to a
+ non-zero value, specifies the amount of time in seconds the slot is
+ allowed to be inactive. This function corresponds to the replication
+ protocol command
+ <literal>CREATE_REPLICATION_SLOT ... PHYSICAL</literal>.
</para></entry>
</row>
@@ -28417,7 +28420,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<indexterm>
<primary>pg_create_logical_replication_slot</primary>
</indexterm>
- <function>pg_create_logical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type>, <parameter>plugin</parameter> <type>name</type> <optional>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>twophase</parameter> <type>boolean</type>, <parameter>failover</parameter> <type>boolean</type> </optional> )
+ <function>pg_create_logical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type>, <parameter>plugin</parameter> <type>name</type> <optional>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>twophase</parameter> <type>boolean</type>, <parameter>failover</parameter> <type>boolean</type>, <parameter>inactive_timeout</parameter> <type>integer</type> </optional> )
<returnvalue>record</returnvalue>
( <parameter>slot_name</parameter> <type>name</type>,
<parameter>lsn</parameter> <type>pg_lsn</type> )
@@ -28436,7 +28439,10 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<parameter>failover</parameter>, when set to true,
specifies that this slot is enabled to be synced to the
standbys so that logical replication can be resumed after
- failover. A call to this function has the same effect as
+ failover. The optional sixth parameter,
+ <parameter>inactive_timeout</parameter>, when set to a
+ non-zero value, specifies the amount of time in seconds the slot is
+ allowed to be inactive. A call to this function has the same effect as
the replication protocol command
<literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
</para></entry>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 2b36b5fef1..dddbaa070f 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2534,6 +2534,15 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_timeout</structfield> <type>integer</type>
+ </para>
+ <para>
+ The amount of time in seconds the slot is allowed to be inactive.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>conflicting</structfield> <type>bool</type>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index fe2bb50f46..af27616657 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -469,6 +469,7 @@ AS 'pg_logical_emit_message_bytea';
CREATE OR REPLACE FUNCTION pg_create_physical_replication_slot(
IN slot_name name, IN immediately_reserve boolean DEFAULT false,
IN temporary boolean DEFAULT false,
+ IN inactive_timeout int DEFAULT 0,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
@@ -480,6 +481,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
IN temporary boolean DEFAULT false,
IN twophase boolean DEFAULT false,
IN failover boolean DEFAULT false,
+ IN inactive_timeout int DEFAULT 0,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index bc70ff193e..40d7ad469d 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1024,6 +1024,7 @@ CREATE VIEW pg_replication_slots AS
L.safe_wal_size,
L.two_phase,
L.last_inactive_time,
+ L.inactive_timeout,
L.conflicting,
L.invalidation_reason,
L.failover,
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..c01876ceeb 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -131,6 +131,7 @@ typedef struct RemoteSlot
char *database;
bool two_phase;
bool failover;
+ int inactive_timeout;
XLogRecPtr restart_lsn;
XLogRecPtr confirmed_lsn;
TransactionId catalog_xmin;
@@ -167,7 +168,8 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
remote_slot->two_phase == slot->data.two_phase &&
remote_slot->failover == slot->data.failover &&
remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
- strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
+ strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 &&
+ remote_slot->inactive_timeout == slot->data.inactive_timeout)
return false;
/* Avoid expensive operations while holding a spinlock. */
@@ -182,6 +184,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
slot->data.confirmed_flush = remote_slot->confirmed_lsn;
slot->data.catalog_xmin = remote_slot->catalog_xmin;
slot->effective_catalog_xmin = remote_slot->catalog_xmin;
+ slot->data.inactive_timeout = remote_slot->inactive_timeout;
SpinLockRelease(&slot->mutex);
if (xmin_changed)
@@ -607,7 +610,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
- true);
+ true, 0);
/* For shorter lines. */
slot = MyReplicationSlot;
@@ -627,6 +630,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
SpinLockAcquire(&slot->mutex);
slot->effective_catalog_xmin = xmin_horizon;
slot->data.catalog_xmin = xmin_horizon;
+ slot->data.inactive_timeout = remote_slot->inactive_timeout;
SpinLockRelease(&slot->mutex);
ReplicationSlotsComputeRequiredXmin(true);
LWLockRelease(ProcArrayLock);
@@ -652,9 +656,9 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
static bool
synchronize_slots(WalReceiverConn *wrconn)
{
-#define SLOTSYNC_COLUMN_COUNT 9
+#define SLOTSYNC_COLUMN_COUNT 10
Oid slotRow[SLOTSYNC_COLUMN_COUNT] = {TEXTOID, TEXTOID, LSNOID,
- LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID};
+ LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID, INT4OID};
WalRcvExecResult *res;
TupleTableSlot *tupslot;
@@ -663,7 +667,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, invalidation_reason"
+ " database, invalidation_reason, inactive_timeout"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
@@ -743,6 +747,9 @@ synchronize_slots(WalReceiverConn *wrconn)
remote_slot->invalidated = isnull ? RS_INVAL_NONE :
GetSlotInvalidationCause(TextDatumGetCString(d));
+ remote_slot->inactive_timeout = DatumGetInt32(slot_getattr(tupslot, ++col,
+ &isnull));
+
/* Sanity check */
Assert(col == SLOTSYNC_COLUMN_COUNT);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 0f48d6dc7c..852a657e97 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -129,7 +129,7 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 5 /* version for new files */
+#define SLOT_VERSION 6 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -304,11 +304,14 @@ ReplicationSlotValidateName(const char *name, int elevel)
* failover: If enabled, allows the slot to be synced to standbys so
* that logical replication can be resumed after failover.
* synced: True if the slot is synchronized from the primary server.
+ * inactive_timeout: The amount of time in seconds the slot is allowed to be
+ * inactive.
*/
void
ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover, bool synced)
+ bool two_phase, bool failover, bool synced,
+ int inactive_timeout)
{
ReplicationSlot *slot = NULL;
int i;
@@ -345,6 +348,18 @@ ReplicationSlotCreate(const char *name, bool db_specific,
errmsg("cannot enable failover for a temporary replication slot"));
}
+ if (inactive_timeout > 0)
+ {
+ /*
+ * Do not allow users to set inactive_timeout for temporary slots,
+ * because temporary slots will not be saved to the disk.
+ */
+ if (persistency == RS_TEMPORARY)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot set inactive_timeout for a temporary replication slot"));
+ }
+
/*
* If some other backend ran this code concurrently with us, we'd likely
* both allocate the same slot, and that would be bad. We'd also be at
@@ -398,6 +413,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
slot->data.synced = synced;
+ slot->data.inactive_timeout = inactive_timeout;
/* and then data only present in shared memory */
slot->just_dirtied = false;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 24f5e6d90a..fb79401c50 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -38,14 +38,15 @@
*/
static void
create_physical_replication_slot(char *name, bool immediately_reserve,
- bool temporary, XLogRecPtr restart_lsn)
+ bool temporary, int inactive_timeout,
+ XLogRecPtr restart_lsn)
{
Assert(!MyReplicationSlot);
/* acquire replication slot, this will check for conflicting names */
ReplicationSlotCreate(name, false,
temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
- false, false);
+ false, false, inactive_timeout);
if (immediately_reserve)
{
@@ -71,6 +72,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
Name name = PG_GETARG_NAME(0);
bool immediately_reserve = PG_GETARG_BOOL(1);
bool temporary = PG_GETARG_BOOL(2);
+ int inactive_timeout = PG_GETARG_INT32(3);
Datum values[2];
bool nulls[2];
TupleDesc tupdesc;
@@ -84,9 +86,15 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
CheckSlotRequirements();
+ if (inactive_timeout < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"inactive_timeout\" must not be negative")));
+
create_physical_replication_slot(NameStr(*name),
immediately_reserve,
temporary,
+ inactive_timeout,
InvalidXLogRecPtr);
values[0] = NameGetDatum(&MyReplicationSlot->data.name);
@@ -120,7 +128,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
static void
create_logical_replication_slot(char *name, char *plugin,
bool temporary, bool two_phase,
- bool failover,
+ bool failover, int inactive_timeout,
XLogRecPtr restart_lsn,
bool find_startpoint)
{
@@ -138,7 +146,7 @@ create_logical_replication_slot(char *name, char *plugin,
*/
ReplicationSlotCreate(name, true,
temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
- failover, false);
+ failover, false, inactive_timeout);
/*
* Create logical decoding context to find start point or, if we don't
@@ -177,6 +185,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
bool temporary = PG_GETARG_BOOL(2);
bool two_phase = PG_GETARG_BOOL(3);
bool failover = PG_GETARG_BOOL(4);
+ int inactive_timeout = PG_GETARG_INT32(5);
Datum result;
TupleDesc tupdesc;
HeapTuple tuple;
@@ -190,11 +199,17 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
CheckLogicalDecodingRequirements();
+ if (inactive_timeout < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"inactive_timeout\" must not be negative")));
+
create_logical_replication_slot(NameStr(*name),
NameStr(*plugin),
temporary,
two_phase,
failover,
+ inactive_timeout,
InvalidXLogRecPtr,
true);
@@ -239,7 +254,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 19
+#define PG_GET_REPLICATION_SLOTS_COLS 20
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -415,6 +430,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
else
nulls[i++] = true;
+ values[i++] = Int32GetDatum(slot_contents.data.inactive_timeout);
+
cause = slot_contents.data.invalidated;
if (SlotIsPhysical(&slot_contents))
@@ -720,6 +737,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
XLogRecPtr src_restart_lsn;
bool src_islogical;
bool temporary;
+ int inactive_timeout;
char *plugin;
Datum values[2];
bool nulls[2];
@@ -776,6 +794,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
src_restart_lsn = first_slot_contents.data.restart_lsn;
temporary = (first_slot_contents.data.persistency == RS_TEMPORARY);
plugin = logical_slot ? NameStr(first_slot_contents.data.plugin) : NULL;
+ inactive_timeout = first_slot_contents.data.inactive_timeout;
/* Check type of replication slot */
if (src_islogical != logical_slot)
@@ -823,6 +842,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
temporary,
false,
false,
+ inactive_timeout,
src_restart_lsn,
false);
}
@@ -830,6 +850,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
create_physical_replication_slot(NameStr(*dst_name),
true,
temporary,
+ inactive_timeout,
src_restart_lsn);
/*
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bc40c454de..5315c08650 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1221,7 +1221,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false);
+ false, false, false, 0);
if (reserve_wal)
{
@@ -1252,7 +1252,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false);
+ two_phase, failover, false, 0);
/*
* Do options check early so that we can bail before calling the
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 95c22a7200..12626987f0 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -676,7 +676,8 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid, "
+ "inactive_timeout "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
@@ -696,6 +697,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
int i_failover;
int i_caught_up;
int i_invalid;
+ int i_inactive_timeout;
slotinfos = (LogicalSlotInfo *) pg_malloc(sizeof(LogicalSlotInfo) * num_slots);
@@ -705,6 +707,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
i_failover = PQfnumber(res, "failover");
i_caught_up = PQfnumber(res, "caught_up");
i_invalid = PQfnumber(res, "invalid");
+ i_inactive_timeout = PQfnumber(res, "inactive_timeout");
for (int slotnum = 0; slotnum < num_slots; slotnum++)
{
@@ -716,6 +719,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
curr->failover = (strcmp(PQgetvalue(res, slotnum, i_failover), "t") == 0);
curr->caught_up = (strcmp(PQgetvalue(res, slotnum, i_caught_up), "t") == 0);
curr->invalid = (strcmp(PQgetvalue(res, slotnum, i_invalid), "t") == 0);
+ curr->inactive_timeout = atoi(PQgetvalue(res, slotnum, i_inactive_timeout));
}
}
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index f6143b6bc4..2656056103 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -931,9 +931,10 @@ create_logical_replication_slots(void)
appendPQExpBuffer(query, ", ");
appendStringLiteralConn(query, slot_info->plugin, conn);
- appendPQExpBuffer(query, ", false, %s, %s);",
+ appendPQExpBuffer(query, ", false, %s, %s, %d);",
slot_info->two_phase ? "true" : "false",
- slot_info->failover ? "true" : "false");
+ slot_info->failover ? "true" : "false",
+ slot_info->inactive_timeout);
PQclear(executeQueryOrDie(conn, "%s", query->data));
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 92bcb693fb..eb86d000b1 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -162,6 +162,8 @@ typedef struct
bool invalid; /* if true, the slot is unusable */
bool failover; /* is the slot designated to be synced to the
* physical standby? */
+ int inactive_timeout; /* The amount of time in seconds the slot
+ * is allowed to be inactive. */
} LogicalSlotInfo;
typedef struct
diff --git a/src/bin/pg_upgrade/t/003_logical_slots.pl b/src/bin/pg_upgrade/t/003_logical_slots.pl
index 83d71c3084..6e82d2cb7b 100644
--- a/src/bin/pg_upgrade/t/003_logical_slots.pl
+++ b/src/bin/pg_upgrade/t/003_logical_slots.pl
@@ -153,14 +153,17 @@ like(
# TEST: Successful upgrade
# Preparations for the subsequent test:
-# 1. Setup logical replication (first, cleanup slots from the previous tests)
+# 1. Setup logical replication (first, cleanup slots from the previous tests,
+# and then create slot for this test with inactive_timeout set).
my $old_connstr = $oldpub->connstr . ' dbname=postgres';
+my $inactive_timeout = 3600;
$oldpub->start;
$oldpub->safe_psql(
'postgres', qq[
SELECT * FROM pg_drop_replication_slot('test_slot1');
SELECT * FROM pg_drop_replication_slot('test_slot2');
+ SELECT pg_create_logical_replication_slot(slot_name := 'regress_sub', plugin := 'pgoutput', inactive_timeout := $inactive_timeout);
CREATE PUBLICATION regress_pub FOR ALL TABLES;
]);
@@ -172,7 +175,7 @@ $sub->start;
$sub->safe_psql(
'postgres', qq[
CREATE TABLE tbl (a int);
- CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (two_phase = 'true', failover = 'true')
+ CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (slot_name = 'regress_sub', create_slot = false, two_phase = 'true', failover = 'true')
]);
$sub->wait_for_subscription_sync($oldpub, 'regress_sub');
@@ -192,8 +195,8 @@ command_ok([@pg_upgrade_cmd], 'run of pg_upgrade of old cluster');
# Check that the slot 'regress_sub' has migrated to the new cluster
$newpub->start;
my $result = $newpub->safe_psql('postgres',
- "SELECT slot_name, two_phase, failover FROM pg_replication_slots");
-is($result, qq(regress_sub|t|t), 'check the slot exists on new cluster');
+ "SELECT slot_name, two_phase, failover, inactive_timeout = $inactive_timeout FROM pg_replication_slots");
+is($result, qq(regress_sub|t|t|t), 'check the slot exists on new cluster');
# Update the connection
my $new_connstr = $newpub->connstr . ' dbname=postgres';
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 0d26e5b422..a09da44b6a 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11105,10 +11105,10 @@
# replication slots
{ oid => '3779', descr => 'create a physical replication slot',
proname => 'pg_create_physical_replication_slot', provolatile => 'v',
- proparallel => 'u', prorettype => 'record', proargtypes => 'name bool bool',
- proallargtypes => '{name,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,o,o}',
- proargnames => '{slot_name,immediately_reserve,temporary,slot_name,lsn}',
+ proparallel => 'u', prorettype => 'record', proargtypes => 'name bool bool int4',
+ proallargtypes => '{name,bool,bool,int4,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,o,o}',
+ proargnames => '{slot_name,immediately_reserve,temporary,inactive_timeout,slot_name,lsn}',
prosrc => 'pg_create_physical_replication_slot' },
{ oid => '4220',
descr => 'copy a physical replication slot, changing temporality',
@@ -11133,17 +11133,17 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,timestamptz,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,last_inactive_time,conflicting,invalidation_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,timestamptz,int4,bool,text,bool,bool}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,last_inactive_time,inactive_timeout,conflicting,invalidation_reason,failover,synced}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u', prorettype => 'record',
- proargtypes => 'name name bool bool bool',
- proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,i,i,o,o}',
- proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
+ proargtypes => 'name name bool bool bool int4',
+ proallargtypes => '{name,name,bool,bool,bool,int4,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,i,i,o,o}',
+ proargnames => '{slot_name,plugin,temporary,twophase,failover,inactive_timeout,slot_name,lsn}',
prosrc => 'pg_create_logical_replication_slot' },
{ oid => '4222',
descr => 'copy a logical replication slot, changing temporality and plugin',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 2f18433ecc..24623cfdc1 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -127,6 +127,9 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* The amount of time in seconds the slot is allowed to be inactive */
+ int inactive_timeout;
} ReplicationSlotPersistentData;
/*
@@ -239,7 +242,7 @@ extern void ReplicationSlotsShmemInit(void);
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
bool two_phase, bool failover,
- bool synced);
+ bool synced, int inactive_timeout);
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index f47bfd78eb..3dd780beab 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -152,8 +152,9 @@ log_min_messages = 'debug2'
$primary->append_conf('postgresql.conf', "log_min_messages = 'debug2'");
$primary->reload;
+my $inactive_timeout = 3600;
$primary->psql('postgres',
- q{SELECT pg_create_logical_replication_slot('lsub2_slot', 'test_decoding', false, false, true);}
+ "SELECT pg_create_logical_replication_slot('lsub2_slot', 'test_decoding', false, false, true, $inactive_timeout);"
);
$primary->psql('postgres',
@@ -190,6 +191,16 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Confirm that the synced slot on the standby has got inactive_timeout from the
+# primary.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT inactive_timeout = $inactive_timeout FROM pg_replication_slots
+ WHERE slot_name = 'lsub2_slot' AND synced AND NOT temporary;"
+ ),
+ "t",
+ 'synced logical slot has got inactive_timeout on standby');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index dfcbaec387..d532e23176 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1474,11 +1474,12 @@ pg_replication_slots| SELECT l.slot_name,
l.safe_wal_size,
l.two_phase,
l.last_inactive_time,
+ l.inactive_timeout,
l.conflicting,
l.invalidation_reason,
l.failover,
l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, last_inactive_time, conflicting, invalidation_reason, failover, synced)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, last_inactive_time, inactive_timeout, conflicting, invalidation_reason, failover, synced)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v18-0003-Introduce-new-SQL-funtion-pg_alter_replication_s.patch
From 65be663680fbde9812392a4fa739633060625f82 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sun, 24 Mar 2024 07:18:17 +0000
Subject: [PATCH v18 3/5] Introduce new SQL function pg_alter_replication_slot
This commit adds a new function pg_alter_replication_slot to alter
a given property of a replication slot. It is similar to the
replication protocol command ALTER_REPLICATION_SLOT, except that
for now it allows only the inactive_timeout property to be set. The
failover property is deliberately not alterable via this function,
to avoid inconsistency with the pg_subscription catalog on the
logical subscriber: the subscriber would not know about the altered
value of its replication slot on the publisher.
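As a quick illustration, mirroring the regression test added below (the slot name is an arbitrary example):

  SELECT pg_alter_replication_slot(slot_name := 'log_slot', inactive_timeout := 900);
  SELECT slot_name, inactive_timeout FROM pg_replication_slots WHERE slot_name = 'log_slot';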
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
contrib/test_decoding/expected/slot.out | 44 ++++++++++++++-
contrib/test_decoding/sql/slot.sql | 10 ++++
doc/src/sgml/func.sgml | 21 ++++++++
src/backend/replication/slot.c | 22 ++++----
src/backend/replication/slotfuncs.c | 66 ++++++++++++++++++++++-
src/bin/pg_upgrade/t/003_logical_slots.pl | 14 +++--
src/include/catalog/pg_proc.dat | 5 ++
src/include/replication/slot.h | 2 +
8 files changed, 167 insertions(+), 17 deletions(-)
diff --git a/contrib/test_decoding/expected/slot.out b/contrib/test_decoding/expected/slot.out
index 6771520afb..5b8dbf6f52 100644
--- a/contrib/test_decoding/expected/slot.out
+++ b/contrib/test_decoding/expected/slot.out
@@ -496,13 +496,27 @@ SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_sl
copy
(1 row)
+-- Test alter physical slot with inactive_timeout option set.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot4');
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'alter' FROM pg_alter_replication_slot(slot_name := 'it_phy_slot4', inactive_timeout := 900);
+ ?column?
+----------
+ alter
+(1 row)
+
SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
slot_name | slot_type | inactive_timeout
--------------+-----------+------------------
it_phy_slot1 | physical | 300
it_phy_slot2 | physical | 0
it_phy_slot3 | physical | 300
-(3 rows)
+ it_phy_slot4 | physical | 900
+(4 rows)
SELECT pg_drop_replication_slot('it_phy_slot1');
pg_drop_replication_slot
@@ -522,6 +536,12 @@ SELECT pg_drop_replication_slot('it_phy_slot3');
(1 row)
+SELECT pg_drop_replication_slot('it_phy_slot4');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
-- Test inactive_timeout option of logical slots.
SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
?column?
@@ -542,13 +562,27 @@ SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slo
copy
(1 row)
+-- Test alter logical slot with inactive_timeout option set.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot4', plugin := 'test_decoding');
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'alter' FROM pg_alter_replication_slot(slot_name := 'it_log_slot4', inactive_timeout := 900);
+ ?column?
+----------
+ alter
+(1 row)
+
SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
slot_name | slot_type | inactive_timeout
--------------+-----------+------------------
it_log_slot1 | logical | 600
it_log_slot2 | logical | 0
it_log_slot3 | logical | 600
-(3 rows)
+ it_log_slot4 | logical | 900
+(4 rows)
SELECT pg_drop_replication_slot('it_log_slot1');
pg_drop_replication_slot
@@ -568,3 +602,9 @@ SELECT pg_drop_replication_slot('it_log_slot3');
(1 row)
+SELECT pg_drop_replication_slot('it_log_slot4');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
diff --git a/contrib/test_decoding/sql/slot.sql b/contrib/test_decoding/sql/slot.sql
index 443e91da07..6785714cc7 100644
--- a/contrib/test_decoding/sql/slot.sql
+++ b/contrib/test_decoding/sql/slot.sql
@@ -206,11 +206,16 @@ SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot
-- Copy physical slot with inactive_timeout option set.
SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_slot1', dst_slot_name := 'it_phy_slot3');
+-- Test alter physical slot with inactive_timeout option set.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot4');
+SELECT 'alter' FROM pg_alter_replication_slot(slot_name := 'it_phy_slot4', inactive_timeout := 900);
+
SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
SELECT pg_drop_replication_slot('it_phy_slot1');
SELECT pg_drop_replication_slot('it_phy_slot2');
SELECT pg_drop_replication_slot('it_phy_slot3');
+SELECT pg_drop_replication_slot('it_phy_slot4');
-- Test inactive_timeout option of logical slots.
SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
@@ -219,8 +224,13 @@ SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot2
-- Copy logical slot with inactive_timeout option set.
SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slot1', dst_slot_name := 'it_log_slot3');
+-- Test alter logical slot with inactive_timeout option set.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot4', plugin := 'test_decoding');
+SELECT 'alter' FROM pg_alter_replication_slot(slot_name := 'it_log_slot4', inactive_timeout := 900);
+
SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
SELECT pg_drop_replication_slot('it_log_slot1');
SELECT pg_drop_replication_slot('it_log_slot2');
SELECT pg_drop_replication_slot('it_log_slot3');
+SELECT pg_drop_replication_slot('it_log_slot4');
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index afaafa35ad..22c8e0d39c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28829,6 +28829,27 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
</entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_alter_replication_slot</primary>
+ </indexterm>
+ <function>pg_alter_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type>, <parameter>inactive_timeout</parameter> <type>integer</type> )
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Alters the given property of a replication slot
+ named <parameter>slot_name</parameter>. This is similar to the replication
+ protocol command <literal>ALTER_REPLICATION_SLOT</literal>, except that it
+ allows only the <parameter>inactive_timeout</parameter> property to be set.
+ The <parameter>failover</parameter> property cannot be altered via this
+ function, to avoid inconsistency with the catalog
+ <structname>pg_subscription</structname> on the logical subscriber,
+ which would not know about the altered value of its replication slot
+ on the publisher.
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 852a657e97..3287aa2860 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -162,7 +162,6 @@ static void ReplicationSlotDropPtr(ReplicationSlot *slot);
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
-static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
/*
* Report shared-memory space needed by ReplicationSlotsShmemInit.
@@ -870,6 +869,7 @@ ReplicationSlotAlter(const char *name, bool failover)
ReplicationSlotRelease();
}
+
/*
* Permanently drop the currently acquired replication slot.
*/
@@ -1005,7 +1005,7 @@ ReplicationSlotSave(void)
Assert(MyReplicationSlot != NULL);
sprintf(path, "pg_replslot/%s", NameStr(MyReplicationSlot->data.name));
- SaveSlotToPath(MyReplicationSlot, path, ERROR);
+ ReplicationSlotSaveToPath(MyReplicationSlot, path, ERROR);
}
/*
@@ -1868,7 +1868,10 @@ CheckPointReplicationSlots(bool is_shutdown)
if (!s->in_use)
continue;
- /* save the slot to disk, locking is handled in SaveSlotToPath() */
+ /*
+ * Save the slot to disk; locking is handled in
+ * ReplicationSlotSaveToPath().
+ */
sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
/*
@@ -1894,7 +1897,7 @@ CheckPointReplicationSlots(bool is_shutdown)
SpinLockRelease(&s->mutex);
}
- SaveSlotToPath(s, path, LOG);
+ ReplicationSlotSaveToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
}
@@ -1973,8 +1976,9 @@ CreateSlotOnDisk(ReplicationSlot *slot)
/*
* No need to take out the io_in_progress_lock, nobody else can see this
- * slot yet, so nobody else will write. We're reusing SaveSlotToPath which
- * takes out the lock, if we'd take the lock here, we'd deadlock.
+ * slot yet, so nobody else will write. We're reusing
+ * ReplicationSlotSaveToPath which takes out the lock, if we'd take the
+ * lock here, we'd deadlock.
*/
sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
@@ -2000,7 +2004,7 @@ CreateSlotOnDisk(ReplicationSlot *slot)
/* Write the actual state file. */
slot->dirty = true; /* signal that we really need to write */
- SaveSlotToPath(slot, tmppath, ERROR);
+ ReplicationSlotSaveToPath(slot, tmppath, ERROR);
/* Rename the directory into place. */
if (rename(tmppath, path) != 0)
@@ -2025,8 +2029,8 @@ CreateSlotOnDisk(ReplicationSlot *slot)
/*
* Shared functionality between saving and creating a replication slot.
*/
-static void
-SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel)
+void
+ReplicationSlotSaveToPath(ReplicationSlot *slot, const char *dir, int elevel)
{
char tmppath[MAXPGPATH];
char path[MAXPGPATH];
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index fb79401c50..dba80ac1bb 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -229,7 +229,6 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(result);
}
-
/*
* SQL function for dropping a replication slot.
*/
@@ -1038,3 +1037,68 @@ pg_sync_replication_slots(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+
+/*
+ * SQL function for altering given properties of a replication slot.
+ */
+Datum
+pg_alter_replication_slot(PG_FUNCTION_ARGS)
+{
+ Name name = PG_GETARG_NAME(0);
+ int inactive_timeout = PG_GETARG_INT32(1);
+ ReplicationSlot *slot;
+ char path[MAXPGPATH];
+
+ CheckSlotPermissions();
+
+ CheckSlotRequirements();
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ /* Check if a slot with the given name exists. */
+ slot = SearchNamedReplicationSlot(NameStr(*name), false);
+
+ if (!slot)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("replication slot \"%s\" does not exist",
+ NameStr(*name))));
+
+ /*
+ * Do not allow users to set inactive_timeout for temporary slots,
+ * because temporary slots are not saved to disk.
+ */
+ if (slot->data.persistency == RS_TEMPORARY)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot set inactive_timeout for a temporary replication slot"));
+
+ LWLockRelease(ReplicationSlotControlLock);
+
+ if (inactive_timeout < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"inactive_timeout\" must not be negative")));
+
+ /*
+ * We need to briefly prevent any other backend from acquiring the slot
+ * while we set the property. Without holding the ControlLock exclusively,
+ * a concurrent ReplicationSlotAcquire() could acquire the slot in the meantime.
+ */
+ LWLockAcquire(ReplicationSlotControlLock, LW_EXCLUSIVE);
+
+ SpinLockAcquire(&slot->mutex);
+ slot->data.inactive_timeout = inactive_timeout;
+
+ /* Make sure the altered inactive_timeout persists across server restart */
+ slot->just_dirtied = true;
+ slot->dirty = true;
+ SpinLockRelease(&slot->mutex);
+
+ sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+ ReplicationSlotSaveToPath(slot, path, ERROR);
+
+ LWLockRelease(ReplicationSlotControlLock);
+
+ PG_RETURN_VOID();
+}
diff --git a/src/bin/pg_upgrade/t/003_logical_slots.pl b/src/bin/pg_upgrade/t/003_logical_slots.pl
index 6e82d2cb7b..b79db24f42 100644
--- a/src/bin/pg_upgrade/t/003_logical_slots.pl
+++ b/src/bin/pg_upgrade/t/003_logical_slots.pl
@@ -153,17 +153,14 @@ like(
# TEST: Successful upgrade
# Preparations for the subsequent test:
-# 1. Setup logical replication (first, cleanup slots from the previous tests,
-# and then create slot for this test with inactive_timeout set).
+# 1. Setup logical replication (first, cleanup slots from the previous tests)
my $old_connstr = $oldpub->connstr . ' dbname=postgres';
-my $inactive_timeout = 3600;
$oldpub->start;
$oldpub->safe_psql(
'postgres', qq[
SELECT * FROM pg_drop_replication_slot('test_slot1');
SELECT * FROM pg_drop_replication_slot('test_slot2');
- SELECT pg_create_logical_replication_slot(slot_name := 'regress_sub', plugin := 'pgoutput', inactive_timeout := $inactive_timeout);
CREATE PUBLICATION regress_pub FOR ALL TABLES;
]);
@@ -175,7 +172,7 @@ $sub->start;
$sub->safe_psql(
'postgres', qq[
CREATE TABLE tbl (a int);
- CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (slot_name = 'regress_sub', create_slot = false, two_phase = 'true', failover = 'true')
+ CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (two_phase = 'true', failover = 'true')
]);
$sub->wait_for_subscription_sync($oldpub, 'regress_sub');
@@ -185,6 +182,13 @@ my $twophase_query =
$sub->poll_query_until('postgres', $twophase_query)
or die "Timed out while waiting for subscriber to enable twophase";
+# Alter slot to set inactive_timeout
+my $inactive_timeout = 3600;
+$oldpub->safe_psql(
+ 'postgres', qq[
+ SELECT pg_alter_replication_slot(slot_name := 'regress_sub', inactive_timeout := $inactive_timeout);
+]);
+
# 2. Temporarily disable the subscription
$sub->safe_psql('postgres', "ALTER SUBSCRIPTION regress_sub DISABLE");
$oldpub->stop;
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index a09da44b6a..9a8134aa46 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11222,6 +11222,11 @@
proname => 'pg_sync_replication_slots', provolatile => 'v', proparallel => 'u',
prorettype => 'void', proargtypes => '',
prosrc => 'pg_sync_replication_slots' },
+{ oid => '9039', descr => 'alter given properties of a replication slot',
+ proname => 'pg_alter_replication_slot', provolatile => 'v', proparallel => 'u',
+ prorettype => 'void', proargtypes => 'name int4',
+ proargnames => '{slot_name,inactive_timeout}',
+ prosrc => 'pg_alter_replication_slot' },
# event triggers
{ oid => '3566', descr => 'list objects dropped by the current command',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 24623cfdc1..915edf7617 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -252,6 +252,8 @@ extern void ReplicationSlotAcquire(const char *name, bool nowait);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(void);
extern void ReplicationSlotSave(void);
+extern void ReplicationSlotSaveToPath(ReplicationSlot *slot, const char *dir,
+ int elevel);
extern void ReplicationSlotMarkDirty(void);
/* misc stuff */
--
2.34.1
v18-0004-Allow-setting-inactive_timeout-in-the-replicatio.patchapplication/octet-stream; name=v18-0004-Allow-setting-inactive_timeout-in-the-replicatio.patchDownload
From a1062c4c527693c6980dc2b63c5091eed19438e7 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sun, 24 Mar 2024 08:08:24 +0000
Subject: [PATCH v18 4/5] Allow setting inactive_timeout in the replication
commands.
This commit allows replication connections to set the
inactive_timeout property of a slot using the replication commands
CREATE_REPLICATION_SLOT and ALTER_REPLICATION_SLOT.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Ajin Cherian
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/protocol.sgml | 20 +++++++++++
src/backend/replication/slot.c | 31 +++++++++++++++--
src/backend/replication/walsender.c | 38 ++++++++++++++++----
src/include/replication/slot.h | 3 +-
src/test/recovery/t/001_stream_rep.pl | 50 +++++++++++++++++++++++++++
5 files changed, 132 insertions(+), 10 deletions(-)
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index a5cb19357f..2ffa1b470a 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2068,6 +2068,16 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>INACTIVE_TIMEOUT [ <replaceable class="parameter">integer</replaceable> ]</literal></term>
+ <listitem>
+ <para>
+ If set to a non-zero value, specifies the amount of time in seconds
+ the slot is allowed to be inactive. The default is zero.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
<para>
@@ -2168,6 +2178,16 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>INACTIVE_TIMEOUT [ <replaceable class="parameter">integer</replaceable> ]</literal></term>
+ <listitem>
+ <para>
+ If set to a non-zero value, specifies the amount of time in seconds
+ the slot is allowed to be inactive. The default is zero.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</listitem>
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 3287aa2860..baf0b9aa72 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -812,8 +812,10 @@ ReplicationSlotDrop(const char *name, bool nowait)
* Change the definition of the slot identified by the specified name.
*/
void
-ReplicationSlotAlter(const char *name, bool failover)
+ReplicationSlotAlter(const char *name, bool failover, int inactive_timeout)
{
+ bool lock_acquired;
+
Assert(MyReplicationSlot == NULL);
ReplicationSlotAcquire(name, false);
@@ -856,10 +858,35 @@ ReplicationSlotAlter(const char *name, bool failover)
errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot enable failover for a temporary replication slot"));
- if (MyReplicationSlot->data.failover != failover)
+ /*
+ * Do not allow users to set inactive_timeout for temporary slots because
+ * temporary slots will not be saved to disk.
+ */
+ if (inactive_timeout > 0 && MyReplicationSlot->data.persistency == RS_TEMPORARY)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot set inactive_timeout for a temporary replication slot"));
+
+ /*
+ * If we are going to change any of the slot's properties, acquire the
+ * lock once for all of them.
+ */
+ lock_acquired = false;
+ if (MyReplicationSlot->data.failover != failover ||
+ MyReplicationSlot->data.inactive_timeout != inactive_timeout)
{
SpinLockAcquire(&MyReplicationSlot->mutex);
+ lock_acquired = true;
+ }
+
+ if (MyReplicationSlot->data.failover != failover)
MyReplicationSlot->data.failover = failover;
+
+ if (MyReplicationSlot->data.inactive_timeout != inactive_timeout)
+ MyReplicationSlot->data.inactive_timeout = inactive_timeout;
+
+ if (lock_acquired)
+ {
SpinLockRelease(&MyReplicationSlot->mutex);
ReplicationSlotMarkDirty();
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 5315c08650..0420274247 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1123,13 +1123,15 @@ static void
parseCreateReplSlotOptions(CreateReplicationSlotCmd *cmd,
bool *reserve_wal,
CRSSnapshotAction *snapshot_action,
- bool *two_phase, bool *failover)
+ bool *two_phase, bool *failover,
+ int *inactive_timeout)
{
ListCell *lc;
bool snapshot_action_given = false;
bool reserve_wal_given = false;
bool two_phase_given = false;
bool failover_given = false;
+ bool inactive_timeout_given = false;
/* Parse options */
foreach(lc, cmd->options)
@@ -1188,6 +1190,15 @@ parseCreateReplSlotOptions(CreateReplicationSlotCmd *cmd,
failover_given = true;
*failover = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "inactive_timeout") == 0)
+ {
+ if (inactive_timeout_given)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
+ inactive_timeout_given = true;
+ *inactive_timeout = defGetInt32(defel);
+ }
else
elog(ERROR, "unrecognized option: %s", defel->defname);
}
@@ -1205,6 +1216,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
bool reserve_wal = false;
bool two_phase = false;
bool failover = false;
+ int inactive_timeout = 0;
CRSSnapshotAction snapshot_action = CRS_EXPORT_SNAPSHOT;
DestReceiver *dest;
TupOutputState *tstate;
@@ -1215,13 +1227,13 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
Assert(!MyReplicationSlot);
parseCreateReplSlotOptions(cmd, &reserve_wal, &snapshot_action, &two_phase,
- &failover);
+ &failover, &inactive_timeout);
if (cmd->kind == REPLICATION_KIND_PHYSICAL)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false, 0);
+ false, false, false, inactive_timeout);
if (reserve_wal)
{
@@ -1252,7 +1264,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false, 0);
+ two_phase, failover, false, inactive_timeout);
/*
* Do options check early so that we can bail before calling the
@@ -1411,9 +1423,11 @@ DropReplicationSlot(DropReplicationSlotCmd *cmd)
* Process extra options given to ALTER_REPLICATION_SLOT.
*/
static void
-ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover)
+ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover,
+ int *inactive_timeout)
{
bool failover_given = false;
+ bool inactive_timeout_given = false;
/* Parse options */
foreach_ptr(DefElem, defel, cmd->options)
@@ -1427,6 +1441,15 @@ ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover)
failover_given = true;
*failover = defGetBoolean(defel);
}
+ else if (strcmp(defel->defname, "inactive_timeout") == 0)
+ {
+ if (inactive_timeout_given)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
+ inactive_timeout_given = true;
+ *inactive_timeout = defGetInt32(defel);
+ }
else
elog(ERROR, "unrecognized option: %s", defel->defname);
}
@@ -1439,9 +1462,10 @@ static void
AlterReplicationSlot(AlterReplicationSlotCmd *cmd)
{
bool failover = false;
+ int inactive_timeout = 0;
- ParseAlterReplSlotOptions(cmd, &failover);
- ReplicationSlotAlter(cmd->slotname, failover);
+ ParseAlterReplSlotOptions(cmd, &failover, &inactive_timeout);
+ ReplicationSlotAlter(cmd->slotname, failover, inactive_timeout);
}
/*
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 915edf7617..ee9b385cf9 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -246,7 +246,8 @@ extern void ReplicationSlotCreate(const char *name, bool db_specific,
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
-extern void ReplicationSlotAlter(const char *name, bool failover);
+extern void ReplicationSlotAlter(const char *name, bool failover,
+ int inactive_timeout);
extern void ReplicationSlotAcquire(const char *name, bool nowait);
extern void ReplicationSlotRelease(void);
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 5311ade509..db00b6aa24 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -604,4 +604,54 @@ ok( pump_until(
'base backup cleanly canceled');
$sigchld_bb->finish();
+# Drop any existing slots on the primary, for the follow-up tests.
+$node_primary->safe_psql('postgres',
+ "SELECT pg_drop_replication_slot(slot_name) FROM pg_replication_slots;");
+
+# Test setting inactive_timeout option via replication commands.
+$node_primary->append_conf(
+ 'postgresql.conf', qq(
+wal_level = logical
+));
+$node_primary->restart;
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_phy_slot1 PHYSICAL (RESERVE_WAL, INACTIVE_TIMEOUT 100);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_phy_slot2 PHYSICAL (RESERVE_WAL);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "ALTER_REPLICATION_SLOT it_phy_slot2 (INACTIVE_TIMEOUT 200);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_log_slot1 LOGICAL pgoutput (TWO_PHASE, INACTIVE_TIMEOUT 300);",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "CREATE_REPLICATION_SLOT it_log_slot2 LOGICAL pgoutput;",
+ extra_params => [ '-d', $connstr_db ]);
+
+$node_primary->psql(
+ 'postgres',
+ "ALTER_REPLICATION_SLOT it_log_slot2 (INACTIVE_TIMEOUT 400);",
+ extra_params => [ '-d', $connstr_db ]);
+
+my $slot_info_expected = 'it_log_slot1|logical|300
+it_log_slot2|logical|400
+it_phy_slot1|physical|100
+it_phy_slot2|physical|0';
+
+my $slot_info = $node_primary->safe_psql('postgres',
+ qq[SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;]);
+is($slot_info, $slot_info_expected, "replication slots with inactive_timeout on primary exist");
+
done_testing();
--
2.34.1
v18-0005-Add-inactive_timeout-based-replication-slot-inva.patchapplication/octet-stream; name=v18-0005-Add-inactive_timeout-based-replication-slot-inva.patchDownload
From 2e1a8cba688291cf9150f0abbce89273832a644d Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sun, 24 Mar 2024 08:48:41 +0000
Subject: [PATCH v18 5/5] Add inactive_timeout based replication slot
invalidation.
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set a timeout of, say,
1, 2, or 3 days at the slot level, after which inactive slots get
invalidated.
To achieve the above, postgres uses the replication slot property
last_inactive_time (the time at which the slot became inactive)
together with a new slot-level parameter inactive_timeout, and
invalidates the slot once it has been inactive for longer than the
timeout. The invalidation check happens at various locations so
that it kicks in as early as possible; these locations include the
following:
- Whenever the slot is acquired; if the slot gets invalidated
due to this new mechanism, an error is emitted.
- During checkpoint.
- Whenever pg_get_replication_slots() is called.
Note that this new invalidation mechanism won't kick in for
slots that are currently being synced from the primary to the
standby.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/func.sgml | 12 +-
doc/src/sgml/system-views.sgml | 10 +-
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 184 +++++++++++++++++-
src/backend/replication/slotfuncs.c | 19 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 9 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 170 ++++++++++++++++
11 files changed, 395 insertions(+), 22 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 22c8e0d39c..4826e45c7d 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28393,8 +28393,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
released upon any error. The optional fourth
parameter, <parameter>inactive_timeout</parameter>, when set to a
non-zero value, specifies the amount of time in seconds the slot is
- allowed to be inactive. This function corresponds to the replication
- protocol command
+ allowed to be inactive before getting invalidated.
+ This function corresponds to the replication protocol command
<literal>CREATE_REPLICATION_SLOT ... PHYSICAL</literal>.
</para></entry>
</row>
@@ -28439,12 +28439,12 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<parameter>failover</parameter>, when set to true,
specifies that this slot is enabled to be synced to the
standbys so that logical replication can be resumed after
- failover. The optional sixth parameter,
+ failover. The optional sixth parameter,
<parameter>inactive_timeout</parameter>, when set to a
non-zero value, specifies the amount of time in seconds the slot is
- allowed to be inactive. A call to this function has the same effect as
- the replication protocol command
- <literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
+ allowed to be inactive before getting invalidated.
+ A call to this function has the same effect as the replication protocol
+ command <literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
</para></entry>
</row>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index dddbaa070f..1722609d39 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2539,7 +2539,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<structfield>inactive_timeout</structfield> <type>integer</type>
</para>
<para>
- The amount of time in seconds the slot is allowed to be inactive.
+ The amount of time in seconds the slot is allowed to be inactive
+ before getting invalidated.
</para></entry>
</row>
@@ -2583,6 +2584,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for the duration specified by the slot's
+ <literal>inactive_timeout</literal> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index c01876ceeb..7f1ffab23c 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -319,7 +319,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -529,7 +529,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
* if the slot is not acquired by other processes.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index baf0b9aa72..fae61020c4 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -158,6 +159,9 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidateSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_control_lock,
+ bool need_mutex);
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
@@ -550,9 +554,14 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * If check_for_invalidation is true, the slot is checked for invalidation
+ * based on its inactive_timeout parameter; if it has been invalidated, an
+ * error is raised after making the slot ours.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation)
{
ReplicationSlot *s;
int active_pid;
@@ -630,6 +639,42 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * Check if the given slot can be invalidated based on its
+ * inactive_timeout parameter. If yes, persist the invalidated state to
+ * disk and then error out. We do this only after making the slot ours to
+ * avoid anyone else acquiring it while we check for its invalidation.
+ */
+ if (check_for_invalidation)
+ {
+ /* The slot is ours by now */
+ Assert(s->active_pid == MyProcPid);
+
+ /*
+ * Temporarily mark the slot as not acquired again, since the
+ * invalidation check below expects an inactive slot.
+ */
+ s->active_pid = 0;
+ if (InvalidateReplicationSlotForInactiveTimeout(s, true, true, true))
+ {
+ /*
+ * If the slot has been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+
+ /* Might need it for slot clean up on error, so restore it */
+ s->active_pid = MyProcPid;
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot acquire invalidated replication slot \"%s\"",
+ NameStr(MyReplicationSlot->data.name)),
+ errdetail("This slot has been invalidated because of its inactive_timeout parameter.")));
+ }
+ s->active_pid = MyProcPid;
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -793,7 +838,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -818,7 +863,7 @@ ReplicationSlotAlter(const char *name, bool failover, int inactive_timeout)
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1546,6 +1591,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by slot's inactive_timeout parameter."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1659,6 +1707,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (InvalidateReplicationSlotForInactiveTimeout(s, false, false, false))
+ invalidation_cause = cause;
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1812,6 +1864,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1863,6 +1916,109 @@ restart:
return invalidated;
}
+/*
+ * Invalidate the given slot based on its inactive_timeout parameter.
+ *
+ * Returns true if the slot got invalidated.
+ *
+ * NB - this function also runs as part of checkpoint, so avoid raising errors
+ * if possible.
+ */
+bool
+InvalidateReplicationSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_control_lock,
+ bool need_mutex,
+ bool persist_state)
+{
+ if (!InvalidateSlotForInactiveTimeout(slot, need_control_lock, need_mutex))
+ return false;
+
+ Assert(slot->active_pid == 0);
+
+ SpinLockAcquire(&slot->mutex);
+ slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT;
+
+ /* Make sure the invalidated state persists across server restart */
+ slot->just_dirtied = true;
+ slot->dirty = true;
+ SpinLockRelease(&slot->mutex);
+
+ if (persist_state)
+ {
+ char path[MAXPGPATH];
+
+ sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+ ReplicationSlotSaveToPath(slot, path, ERROR);
+ }
+
+ ReportSlotInvalidation(RS_INVAL_INACTIVE_TIMEOUT, false, 0,
+ slot->data.name, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, InvalidTransactionId);
+
+ return true;
+}
+
+/*
+ * Helper for InvalidateReplicationSlotForInactiveTimeout
+ */
+static bool
+InvalidateSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_control_lock,
+ bool need_mutex)
+{
+ ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+
+ if (slot->last_inactive_time == 0 ||
+ slot->data.inactive_timeout == 0)
+ return false;
+
+ /* inactive_timeout is only tracked for permanent slots */
+ if (slot->data.persistency != RS_PERSISTENT)
+ return false;
+
+ /*
+ * Do not invalidate the slots which are currently being synced from the
+ * primary to the standby.
+ */
+ if (RecoveryInProgress() && slot->data.synced)
+ return false;
+
+ if (need_control_lock)
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
+
+ /*
+ * Check if the slot needs to be invalidated due to inactive_timeout. We
+ * do this with the spinlock held to avoid race conditions -- for example
+ * the restart_lsn could move forward, or the slot could be dropped.
+ */
+ if (need_mutex)
+ SpinLockAcquire(&slot->mutex);
+
+ if (slot->last_inactive_time > 0 &&
+ slot->data.inactive_timeout > 0)
+ {
+ TimestampTz now;
+
+ /* last_inactive_time is only tracked for inactive slots */
+ Assert(slot->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(slot->last_inactive_time, now,
+ slot->data.inactive_timeout * 1000))
+ invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
+ }
+
+ if (need_mutex)
+ SpinLockRelease(&slot->mutex);
+
+ if (need_control_lock)
+ LWLockRelease(ReplicationSlotControlLock);
+
+ return (invalidation_cause == RS_INVAL_INACTIVE_TIMEOUT);
+}
+
/*
* Flush all replication slots to disk.
*
@@ -1875,6 +2031,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated = false;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -1896,10 +2053,11 @@ CheckPointReplicationSlots(bool is_shutdown)
continue;
/*
- * Save the slot to disk, locking is handled in
- * ReplicationSlotSaveToPath.
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
*/
- sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
+ if (InvalidateReplicationSlotForInactiveTimeout(s, true, true, false))
+ invalidated = true;
/*
* Slot's data is not flushed each time the confirmed_flush LSN is
@@ -1924,9 +2082,21 @@ CheckPointReplicationSlots(bool is_shutdown)
SpinLockRelease(&s->mutex);
}
+ /*
+ * Save the slot to disk, locking is handled in
+ * ReplicationSlotSaveToPath.
+ */
+ sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
ReplicationSlotSaveToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ /* If any slot has been invalidated, recalculate the resource limits */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index dba80ac1bb..aadba68c11 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -257,6 +257,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
+ bool invalidated = false;
/*
* We don't require any special permission to see this function's data
@@ -287,6 +288,13 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
slot_contents = *slot;
SpinLockRelease(&slot->mutex);
+ /*
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
+ */
+ if (InvalidateReplicationSlotForInactiveTimeout(slot, false, true, true))
+ invalidated = true;
+
memset(values, 0, sizeof(values));
memset(nulls, 0, sizeof(nulls));
@@ -465,6 +473,15 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
LWLockRelease(ReplicationSlotControlLock);
+ /*
+ * If any slot has been invalidated, recalculate the resource limits.
+ */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
+
return (Datum) 0;
}
@@ -667,7 +684,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 0420274247..aa886412a5 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -846,7 +846,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1483,7 +1483,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index ee9b385cf9..00ff8e5ef5 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -249,7 +251,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, bool failover,
int inactive_timeout);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(void);
extern void ReplicationSlotSave(void);
@@ -270,6 +273,10 @@ extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
+extern bool InvalidateReplicationSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_control_lock,
+ bool need_mutex,
+ bool persist_state);
extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock);
extern int ReplicationSlotIndex(ReplicationSlot *slot);
extern bool ReplicationSlotName(int index, Name name);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..77499dde07
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,170 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Check for invalidation of slot in server log.
+sub check_slots_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"", $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated, "check that slot $slot_name invalidation has been logged");
+}
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot due to inactive_timeout
+#
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoints during the test, otherwise the test can become unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+$standby1->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+});
+
+# Set the timeout so that the slot, once inactive, gets invalidated after
+# the timeout elapses.
+my $inactive_timeout = 1;
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot', inactive_timeout := $inactive_timeout);
+]);
+
+$standby1->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Check inactive_timeout is what we've set above
+my $result = $primary->safe_psql(
+ 'postgres', qq[
+ SELECT inactive_timeout = $inactive_timeout
+ FROM pg_replication_slots WHERE slot_name = 'sb1_slot';
+]);
+is($result, "t",
+ 'check the inactive replication slot info for an active slot');
+
+my $logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby1->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_time IS NOT NULL
+ AND slot_name = 'sb1_slot'
+ AND inactive_timeout = $inactive_timeout;
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+check_slots_invalidation_in_server_log($primary, 'sb1_slot', $logstart);
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for inactive replication slot sb1_slot to be invalidated";
+
+# Testcase end: Invalidate streaming standby's slot due to inactive_timeout
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to inactive_timeout
+my $publisher = $primary;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$subscriber->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot')"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+$result = $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+# Alter slot to set inactive_timeout
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_alter_replication_slot(slot_name := 'lsub1_slot', inactive_timeout := $inactive_timeout);
+]);
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the inactive replication slot info to be updated
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE last_inactive_time IS NOT NULL
+ AND slot_name = 'lsub1_slot'
+ AND inactive_timeout = $inactive_timeout;
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+check_slots_invalidation_in_server_log($publisher, 'lsub1_slot', $logstart);
+
+# Testcase end: Invalidate logical subscriber's slot due to inactive_timeout
+# =============================================================================
+
+done_testing();
--
2.34.1
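For reference, a minimal usage sketch of the interfaces added by the above
patch set (function and column names are as defined in the patches; the
slot name and timeout values are just examples):

-- create a physical slot that may be invalidated after 120 seconds of inactivity
SELECT pg_create_physical_replication_slot(slot_name := 'demo_slot',
                                           inactive_timeout := 120);

-- change the timeout later on the existing slot
SELECT pg_alter_replication_slot(slot_name := 'demo_slot',
                                 inactive_timeout := 300);

-- once the slot has been idle longer than inactive_timeout, a checkpoint
-- (or a call to pg_get_replication_slots()) marks it invalidated
CHECKPOINT;
SELECT slot_name, inactive_timeout, last_inactive_time, invalidation_reason
FROM pg_replication_slots WHERE slot_name = 'demo_slot';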
On Sun, Mar 24, 2024 at 3:05 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Sun, Mar 24, 2024 at 10:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
For instance, setting last_inactive_time_1 to an invalid value fails
with the following error:

error running SQL: 'psql:<stdin>:1: ERROR:  invalid input syntax for
type timestamp with time zone: "foo"
LINE 1: SELECT last_inactive_time > 'foo'::timestamptz FROM pg_repli...

It would be found at a later point. It would probably be better to
verify immediately after the test that fetches the last_inactive_time
value.

Agree. I've added a few more checks explicitly to verify that
last_inactive_time is sane, with the following:

qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0)
    AND '$last_inactive_time'::timestamptz >
        '$slot_creation_time'::timestamptz;]
Such a test looks reasonable but shall we add equal to in the second
part of the test (like '$last_inactive_time'::timestamptz >=
'$slot_creation_time'::timestamptz;). This is just to be sure that even if
the test ran fast enough to produce the same timestamp, the test shouldn't
fail. I think it won't matter for correctness either.
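For example, the adjusted check could read (a sketch reusing the variables
from the quoted test, with only the comparison operator changed):

qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0)
    AND '$last_inactive_time'::timestamptz >=
        '$slot_creation_time'::timestamptz;]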
--
With Regards,
Amit Kapila.
On Mon, Mar 25, 2024 at 9:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Such a test looks reasonable but shall we add equal to in the second
part of the test (like '$last_inactive_time'::timestamptz >=
'$slot_creation_time'::timestamptz;). This is just to be sure that even if
the test ran fast enough to produce the same timestamp, the test shouldn't
fail. I think it won't matter for correctness either.
Apart from this, I have made minor changes in the comments. See the
attached and let me know what you think.
--
With Regards,
Amit Kapila.
Attachments:
v18_0001_diff_amit.patch.txttext/plain; charset=US-ASCII; name=v18_0001_diff_amit.patch.txtDownload
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 2b36b5fef1..5f4165a945 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2529,8 +2529,7 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time at which the slot became inactive.
- <literal>NULL</literal> if the slot is currently actively being
- used.
+ <literal>NULL</literal> if the slot is currently being used.
</para></entry>
</row>
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 0f48d6dc7c..77cb633812 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -623,7 +623,7 @@ retry:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
- /* The slot is active by now, so reset the last inactive time. */
+ /* Reset the last inactive time as the slot is active now. */
SpinLockAcquire(&s->mutex);
s->last_inactive_time = 0;
SpinLockRelease(&s->mutex);
@@ -687,8 +687,8 @@ ReplicationSlotRelease(void)
}
/*
- * Set the last inactive time after marking slot inactive. We get current
- * time beforehand to avoid system call while holding the lock.
+ * Set the last inactive time after marking the slot inactive. We get the
+ * current time beforehand to avoid a system call while holding the lock.
*/
now = GetCurrentTimestamp();
@@ -2363,9 +2363,9 @@ RestoreSlotFromDisk(const char *name)
slot->active_pid = 0;
/*
- * We set last inactive time after loading the slot from the disk into
- * memory. Whoever acquires the slot i.e. makes the slot active will
- * anyway reset it.
+ * We set the last inactive time after loading the slot from the disk
+ * into memory. Whoever acquires the slot i.e. makes the slot active
+ * will reset it.
*/
slot->last_inactive_time = GetCurrentTimestamp();
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 2f18433ecc..eefd7abd39 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -202,7 +202,7 @@ typedef struct ReplicationSlot
*/
XLogRecPtr last_saved_confirmed_flush;
- /* The time at which this slot become inactive */
+ /* The time at which this slot becomes inactive */
TimestampTz last_inactive_time;
} ReplicationSlot;
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index bff84cc9c4..81bd36f5d8 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -411,7 +411,7 @@ $node_primary3->stop;
$node_standby3->stop;
# =============================================================================
-# Testcase start: Check last_inactive_time property of streaming standby's slot
+# Testcase start: Check last_inactive_time property of the streaming standby's slot
#
# Initialize primary node
@@ -440,8 +440,8 @@ $primary4->safe_psql(
SELECT pg_create_physical_replication_slot(slot_name := '$sb4_slot');
]);
-# Get last_inactive_time value after slot's creation. Note that the slot is still
-# inactive unless it's used by the standby below.
+# Get last_inactive_time value after the slot's creation. Note that the slot
+# is still inactive till it's used by the standby below.
my $last_inactive_time = $primary4->safe_psql('postgres',
qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND last_inactive_time IS NOT NULL;)
);
@@ -470,8 +470,8 @@ is( $primary4->safe_psql(
# Stop the standby to check its last_inactive_time value is updated
$standby4->stop;
-# Let's also restart the primary so that the last_inactive_time is set upon
-# loading the slot from disk.
+# Let's restart the primary so that the last_inactive_time is set upon
+# loading the slot from the disk.
$primary4->restart;
is( $primary4->safe_psql(
@@ -483,11 +483,11 @@ is( $primary4->safe_psql(
$standby4->stop;
-# Testcase end: Check last_inactive_time property of streaming standby's slot
+# Testcase end: Check last_inactive_time property of the streaming standby's slot
# =============================================================================
# =============================================================================
-# Testcase start: Check last_inactive_time property of logical subscriber's slot
+# Testcase start: Check last_inactive_time property of the logical subscriber's slot
my $publisher4 = $primary4;
# Create subscriber node
@@ -508,8 +508,8 @@ $publisher4->safe_psql('postgres',
"SELECT pg_create_logical_replication_slot(slot_name := '$lsub4_slot', plugin := 'pgoutput');"
);
-# Get last_inactive_time value after slot's creation. Note that the slot is still
-# inactive unless it's used by the subscriber below.
+# Get last_inactive_time value after the slot's creation. Note that the slot
+# is still inactive till it's used by the subscriber below.
$last_inactive_time = $publisher4->safe_psql('postgres',
qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND last_inactive_time IS NOT NULL;)
);
@@ -541,8 +541,8 @@ is( $publisher4->safe_psql(
# Stop the subscriber to check its last_inactive_time value is updated
$subscriber4->stop;
-# Let's also restart the publisher so that the last_inactive_time is set upon
-# loading the slot from disk.
+# Let's restart the publisher so that the last_inactive_time is set upon
+# loading the slot from the disk.
$publisher4->restart;
is( $publisher4->safe_psql(
@@ -552,7 +552,7 @@ is( $publisher4->safe_psql(
't',
'last inactive time for an inactive logical slot is updated correctly');
-# Testcase end: Check last_inactive_time property of logical subscriber's slot
+# Testcase end: Check last_inactive_time property of the logical subscriber's slot
# =============================================================================
$publisher4->stop;
On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
I've attached the v18 patch set here.
Thanks for the patches. Please find a few comments:
patch 001:
--------
1)
slot.h:
+ /* The time at which this slot become inactive */
+ TimestampTz last_inactive_time;
become -->became
---------
patch 002:
2)
slotsync.c:
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
- true);
+ true, 0);
+ slot->data.inactive_timeout = remote_slot->inactive_timeout;
Is there a reason we are not passing 'remote_slot->inactive_timeout'
to ReplicationSlotCreate() directly?
---------
3)
slotfuncs.c
pg_create_logical_replication_slot():
+ int inactive_timeout = PG_GETARG_INT32(5);
Can we mention here that the timeout is in seconds, either in a comment
or by renaming the variable to inactive_timeout_secs?
Please do this for create_physical_replication_slot(),
create_logical_replication_slot(),
pg_create_physical_replication_slot() as well.
---------
4)
+ int inactive_timeout; /* The amount of time in seconds the slot
+ * is allowed to be inactive. */
} LogicalSlotInfo;
Do we need to mention "before getting invalidated" here like in other
places (in the last patch)?
----------
5)
Same at these two places: "before getting invalidated" should be added in
the last patch, otherwise the info is incomplete.
+
+ /* The amount of time in seconds the slot is allowed to be inactive */
+ int inactive_timeout;
} ReplicationSlotPersistentData;
+ * inactive_timeout: The amount of time in seconds the slot is allowed to be
+ * inactive.
*/
void
ReplicationSlotCreate(const char *name, bool db_specific,
Same here. "before getting invalidated" ?
--------
Reviewing more..
thanks
Shveta
On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote:
On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:I've attached the v18 patch set here.
I have a question. When we allow creating a subscription on an
existing slot that has a non-zero 'inactive_timeout' set, shouldn't the
slot's 'inactive_timeout' be retained even after the subscription is
created?
I tried this:
===================
--On publisher, create slot with 120sec inactive_timeout:
SELECT * FROM pg_create_logical_replication_slot('logical_slot1',
'pgoutput', false, true, true, 120);
--On subscriber, create sub using logical_slot1
create subscription mysubnew1_1 connection 'dbname=newdb1
host=localhost user=shveta port=5433' publication mypubnew1_1 WITH
(failover = true, create_slot=false, slot_name='logical_slot1');
--Before creating sub, pg_replication_slots output:
   slot_name   | failover | synced | active | temp | conf |               lat                | inactive_timeout
---------------+----------+--------+--------+------+------+----------------------------------+------------------
 logical_slot1 | t        | f      | f      | f    | f    | 2024-03-25 11:11:55.375736+05:30 |              120

--After creating sub, pg_replication_slots output (inactive_timeout is 0 now):
   slot_name   | failover | synced | active | temp | conf | lat | inactive_timeout
---------------+----------+--------+--------+------+------+-----+------------------
 logical_slot1 | t        | f      | t      | f    | f    |     |                0
===================
In CreateSubscription, we call 'walrcv_alter_slot()' /
'ReplicationSlotAlter()' when create_slot is false. This call ends up
resetting inactive_timeout from 120sec to 0. Is that intentional?
thanks
Shveta
On Mon, Mar 25, 2024 at 10:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Mar 25, 2024 at 9:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Such a test looks reasonable but shall we add equal to in the second
part of the test (like '$last_inactive_time'::timestamptz >=
'$slot_creation_time'::timestamptz;). This is just to be sure that even if
the test ran fast enough to produce the same timestamp, the test shouldn't
fail. I think it won't matter for correctness either.
Agree. I added that in the v19 patch. I had the same concern in mind;
that's the reason I wasn't capturing current_timestamp with something like
the query below, worrying that it might be the same (or nearly the same)
as the slot creation time. That's why I ended up capturing
current_timestamp in a separate query rather than clubbing it up with
pg_create_physical_replication_slot.
SELECT current_timestamp FROM pg_create_physical_replication_slot('foo');
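In the attached v19 test, the equivalent sequence is roughly the following
(sketch; the slot name matches the test):

-- capture the reference time first, in its own query
SELECT current_timestamp;
-- then create the slot in a separate query
SELECT pg_create_physical_replication_slot('sb4_slot');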
Apart from this, I have made minor changes in the comments. See the
attached and let me know what you think.
LGTM. I've merged the diff into the v19 patch.
Please find the attached v19 patch.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v19-0001-Track-last_inactive_time-in-pg_replication_slots.patchapplication/octet-stream; name=v19-0001-Track-last_inactive_time-in-pg_replication_slots.patchDownload
From 16a64093fcb3c4747ebc1ac6fe9e3ddfcbeeab63 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 25 Mar 2024 06:42:19 +0000
Subject: [PATCH v19] Track last_inactive_time in pg_replication_slots.
Till now, the time at which a replication slot became inactive
was not tracked directly in pg_replication_slots. This commit adds
a new property called last_inactive_time for this. It is set to 0
whenever a slot is made active/acquired, and set to the current
timestamp whenever the slot becomes inactive/released or is
restored from disk.
The new property will be useful on production servers to debug and
analyze inactive replication slots. It will also help determine the
lifetime of a replication slot - one can tell how long a streaming
standby, logical subscriber, or replication slot consumer has been down.
The new property will be useful to implement inactive timeout based
replication slot invalidation in a future commit.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Reviewed-by: Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 10 ++
src/backend/catalog/system_views.sql | 1 +
src/backend/replication/slot.c | 27 ++++
src/backend/replication/slotfuncs.c | 7 +-
src/include/catalog/pg_proc.dat | 6 +-
src/include/replication/slot.h | 3 +
src/test/recovery/t/019_replslot_limit.pl | 148 ++++++++++++++++++++++
src/test/regress/expected/rules.out | 3 +-
8 files changed, 200 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index b5da476c20..5f4165a945 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2523,6 +2523,16 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_time</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently being used.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>conflicting</structfield> <type>bool</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f69b7f5580..bc70ff193e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,6 +1023,7 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
+ L.last_inactive_time,
L.conflicting,
L.invalidation_reason,
L.failover,
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index cdf0c450c5..77cb633812 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -409,6 +409,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->candidate_restart_valid = InvalidXLogRecPtr;
slot->candidate_restart_lsn = InvalidXLogRecPtr;
slot->last_saved_confirmed_flush = InvalidXLogRecPtr;
+ slot->last_inactive_time = 0;
/*
* Create the slot on disk. We haven't actually marked the slot allocated
@@ -622,6 +623,11 @@ retry:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
+ /* Reset the last inactive time as the slot is active now. */
+ SpinLockAcquire(&s->mutex);
+ s->last_inactive_time = 0;
+ SpinLockRelease(&s->mutex);
+
if (am_walsender)
{
ereport(log_replication_commands ? LOG : DEBUG1,
@@ -645,6 +651,7 @@ ReplicationSlotRelease(void)
ReplicationSlot *slot = MyReplicationSlot;
char *slotname = NULL; /* keep compiler quiet */
bool is_logical = false; /* keep compiler quiet */
+ TimestampTz now;
Assert(slot != NULL && slot->active_pid != 0);
@@ -679,6 +686,12 @@ ReplicationSlotRelease(void)
ReplicationSlotsComputeRequiredXmin(false);
}
+ /*
+ * Set the last inactive time after marking the slot inactive. We get the
+ * current time beforehand to avoid a system call while holding the lock.
+ */
+ now = GetCurrentTimestamp();
+
if (slot->data.persistency == RS_PERSISTENT)
{
/*
@@ -687,9 +700,16 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
+ slot->last_inactive_time = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
+ else
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->last_inactive_time = now;
+ SpinLockRelease(&slot->mutex);
+ }
MyReplicationSlot = NULL;
@@ -2342,6 +2362,13 @@ RestoreSlotFromDisk(const char *name)
slot->in_use = true;
slot->active_pid = 0;
+ /*
+ * We set the last inactive time after loading the slot from the disk
+ * into memory. Whoever acquires the slot i.e. makes the slot active
+ * will reset it.
+ */
+ slot->last_inactive_time = GetCurrentTimestamp();
+
restored = true;
break;
}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 4232c1e52e..24f5e6d90a 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 18
+#define PG_GET_REPLICATION_SLOTS_COLS 19
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -410,6 +410,11 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
+ if (slot_contents.last_inactive_time > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.last_inactive_time);
+ else
+ nulls[i++] = true;
+
cause = slot_contents.data.invalidated;
if (SlotIsPhysical(&slot_contents))
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 71c74350a0..0d26e5b422 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11133,9 +11133,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,invalidation_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,timestamptz,bool,text,bool,bool}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,last_inactive_time,conflicting,invalidation_reason,failover,synced}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7f25a083ee..eefd7abd39 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -201,6 +201,9 @@ typedef struct ReplicationSlot
* forcibly flushed or not.
*/
XLogRecPtr last_saved_confirmed_flush;
+
+ /* The time at which this slot becomes inactive */
+ TimestampTz last_inactive_time;
} ReplicationSlot;
#define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid)
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index fe00370c3e..413a291b76 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -410,4 +410,152 @@ kill 'CONT', $receiverpid;
$node_primary3->stop;
$node_standby3->stop;
+# =============================================================================
+# Testcase start: Check last_inactive_time property of the streaming standby's slot
+#
+
+# Initialize primary node
+my $primary4 = PostgreSQL::Test::Cluster->new('primary4');
+$primary4->init(allows_streaming => 'logical');
+$primary4->start;
+
+# Take backup
+$backup_name = 'my_backup4';
+$primary4->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby4 = PostgreSQL::Test::Cluster->new('standby4');
+$standby4->init_from_backup($primary4, $backup_name, has_streaming => 1);
+
+my $sb4_slot = 'sb4_slot';
+$standby4->append_conf('postgresql.conf', "primary_slot_name = '$sb4_slot'");
+
+my $slot_creation_time = $primary4->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
+$primary4->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := '$sb4_slot');
+]);
+
+# Get last_inactive_time value after the slot's creation. Note that the slot
+# is still inactive till it's used by the standby below.
+my $last_inactive_time = $primary4->safe_psql('postgres',
+ qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND last_inactive_time IS NOT NULL;)
+);
+
+# Check that the captured time is sane
+is( $primary4->safe_psql(
+ 'postgres',
+ qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0) AND '$last_inactive_time'::timestamptz >= '$slot_creation_time'::timestamptz;]
+ ),
+ 't',
+ 'last inactive time for an active physical slot is sane');
+
+$standby4->start;
+
+# Wait until standby has replayed enough data
+$primary4->wait_for_catchup($standby4);
+
+# Now the slot is active so last_inactive_time value must be NULL
+is( $primary4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$sb4_slot';]
+ ),
+ 't',
+ 'last inactive time for an active physical slot is NULL');
+
+# Stop the standby to check its last_inactive_time value is updated
+$standby4->stop;
+
+# Let's restart the primary so that the last_inactive_time is set upon
+# loading the slot from the disk.
+$primary4->restart;
+
+is( $primary4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND last_inactive_time IS NOT NULL;]
+ ),
+ 't',
+ 'last inactive time for an inactive physical slot is updated correctly');
+
+$standby4->stop;
+
+# Testcase end: Check last_inactive_time property of the streaming standby's slot
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Check last_inactive_time property of the logical subscriber's slot
+my $publisher4 = $primary4;
+
+# Create subscriber node
+my $subscriber4 = PostgreSQL::Test::Cluster->new('subscriber4');
+$subscriber4->init;
+
+# Setup logical replication
+my $publisher4_connstr = $publisher4->connstr . ' dbname=postgres';
+$publisher4->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+
+$slot_creation_time = $publisher4->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
+my $lsub4_slot = 'lsub4_slot';
+$publisher4->safe_psql('postgres',
+ "SELECT pg_create_logical_replication_slot(slot_name := '$lsub4_slot', plugin := 'pgoutput');"
+);
+
+# Get last_inactive_time value after the slot's creation. Note that the slot
+# is still inactive till it's used by the subscriber below.
+$last_inactive_time = $publisher4->safe_psql('postgres',
+ qq(SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND last_inactive_time IS NOT NULL;)
+);
+
+# Check that the captured time is sane
+is( $publisher4->safe_psql(
+ 'postgres',
+ qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0) AND '$last_inactive_time'::timestamptz >= '$slot_creation_time'::timestamptz;]
+ ),
+ 't',
+ 'last inactive time for an active logical slot is sane');
+
+$subscriber4->start;
+$subscriber4->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher4_connstr' PUBLICATION pub WITH (slot_name = '$lsub4_slot', create_slot = false)"
+);
+
+# Wait until subscriber has caught up
+$subscriber4->wait_for_subscription_sync($publisher4, 'sub');
+
+# Now the slot is active so last_inactive_time value must be NULL
+is( $publisher4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub4_slot';]
+ ),
+ 't',
+ 'last inactive time for an active logical slot is NULL');
+
+# Stop the subscriber to check its last_inactive_time value is updated
+$subscriber4->stop;
+
+# Let's restart the publisher so that the last_inactive_time is set upon
+# loading the slot from the disk.
+$publisher4->restart;
+
+is( $publisher4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND last_inactive_time IS NOT NULL;]
+ ),
+ 't',
+ 'last inactive time for an inactive logical slot is updated correctly');
+
+# Testcase end: Check last_inactive_time property of the logical subscriber's slot
+# =============================================================================
+
+$publisher4->stop;
+$subscriber4->stop;
+
done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 18829ea586..dfcbaec387 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,11 +1473,12 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
+ l.last_inactive_time,
l.conflicting,
l.invalidation_reason,
l.failover,
l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, invalidation_reason, failover, synced)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, last_inactive_time, conflicting, invalidation_reason, failover, synced)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
On Mon, Mar 25, 2024 at 11:53 AM shveta malik <shveta.malik@gmail.com> wrote:
On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote:
On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
I've attached the v18 patch set here.
I have one concern, for synced slots on standby, how do we disallow
invalidation due to inactive-timeout immediately after promotion?
For synced slots, last_inactive_time and inactive_timeout are both
set. Let's say I bring down primary for promotion of standby and then
promote standby, there are chances that it may end up invalidating
synced slots (considering standby is not brought down during promotion
and thus inactive_timeout may already be past 'last_inactive_time'). I
tried with smaller unit of inactive_timeout:
--Shutdown primary to prepare for planned promotion.
--On standby, one synced slot with last_inactive_time (lat) as 12:21
slot_name | failover | synced | active | temp | conf | res | lat | inactive_timeout
---------------+----------+--------+--------+------+------+-----+----------------------------------+------------------
logical_slot1 | t | t | f | f | f | | 2024-03-25 12:21:09.020757+05:30 | 60
--wait for some time, now the time is 12:24
postgres=# select now();
now
----------------------------------
2024-03-25 12:24:17.616716+05:30
-- promote immediately:
./pg_ctl -D ../../standbydb/ promote -w
--on promoted standby:
postgres=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
--synced slot is invalidated immediately on promotion.
slot_name | failover | synced | active | temp | conf | res | lat | inactive_timeout
---------------+----------+--------+--------+------+------+------------------+----------------------------------+--------
logical_slot1 | t | t | f | f | f | inactive_timeout | 2024-03-25 12:21:09.020757+05:30 |
thanks
Shveta
On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote:
On Mon, Mar 25, 2024 at 11:53 AM shveta malik <shveta.malik@gmail.com> wrote:
On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote:
On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
I've attached the v18 patch set here.
I have one concern, for synced slots on standby, how do we disallow
invalidation due to inactive-timeout immediately after promotion?
For synced slots, last_inactive_time and inactive_timeout are both
set. Let's say I bring down primary for promotion of standby and then
promote standby, there are chances that it may end up invalidating
synced slots (considering standby is not brought down during promotion
and thus inactive_timeout may already be past 'last_inactive_time').
This raises the question of whether we need to set
'last_inactive_time' for synced slots on the standby?
--
With Regards,
Amit Kapila.
Hi,
On Mon, Mar 25, 2024 at 12:25:21PM +0530, Bharath Rupireddy wrote:
On Mon, Mar 25, 2024 at 10:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Mar 25, 2024 at 9:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Such a test looks reasonable but shall we add equal to in the second
part of the test (like '$last_inactive_time'::timestamptz >='$slot_creation_time'::timestamptz;). This is just to be sure that even if the test ran fast enough to give the same time, the test shouldn't fail. I think it won't matter for correctness as well.
Agree. I added that in v19 patch. I was having that concern in my
mind. That's the reason I wasn't capturing current_time something like
below for the same worry that current_timestamp might be the same (or
nearly the same) as the slot creation time. That's why I ended up
capturing current_timestamp in a separate query rather than clubbing it up
with pg_create_physical_replication_slot, i.e. avoiding something like:
SELECT current_timestamp FROM pg_create_physical_replication_slot('foo');
Apart from this, I have made minor changes in the comments. See and
let me know what you think of the attached.
Thanks!
v19-0001 LGTM, just one Nit comment for 019_replslot_limit.pl:
The code for "Get last_inactive_time value after the slot's creation" and
"Check that the captured time is sane" is somehow duplicated: is it worth creating
2 functions?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Mon, Mar 25, 2024 at 12:59:52PM +0530, Amit Kapila wrote:
On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote:
On Mon, Mar 25, 2024 at 11:53 AM shveta malik <shveta.malik@gmail.com> wrote:
On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote:
On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
I've attached the v18 patch set here.
I have one concern, for synced slots on standby, how do we disallow
invalidation due to inactive-timeout immediately after promotion?
For synced slots, last_inactive_time and inactive_timeout are both
set.
Yeah, and I can see last_inactive_time is moving on the standby (while not the
case on the primary), probably due to the sync worker slot acquisition/release
which does not seem right.
Let's say I bring down primary for promotion of standby and then
promote standby, there are chances that it may end up invalidating
synced slots (considering standby is not brought down during promotion
and thus inactive_timeout may already be past 'last_inactive_time').
This raises the question of whether we need to set
'last_inactive_time' for synced slots on the standby?
Yeah, I think that last_inactive_time should stay at 0 on synced slots on the
standby because such slots are not usable anyway (until the standby gets promoted).
So, I think that last_inactive_time does not make sense if the slot never had
the chance to be active.
OTOH I think the timeout invalidation (if any) should be synced from primary.
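For instance, a quick check on the standby (a sketch only, assuming the
last_inactive_time and invalidation_reason columns from the patches in this
thread) would be to confirm that synced slots carry the invalidation synced
from the primary while their last_inactive_time stays NULL:
SELECT slot_name, synced, last_inactive_time, invalidation_reason
FROM pg_replication_slots WHERE synced;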
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hi,
On Mon, Mar 25, 2024 at 12:59:52PM +0530, Amit Kapila wrote:
On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote:
On Mon, Mar 25, 2024 at 11:53 AM shveta malik <shveta.malik@gmail.com> wrote:
On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote:
On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
I've attached the v18 patch set here.
I have one concern, for synced slots on standby, how do we disallow
invalidation due to inactive-timeout immediately after promotion?
For synced slots, last_inactive_time and inactive_timeout are both
set.
Yeah, and I can see last_inactive_time is moving on the standby (while not the
case on the primary), probably due to the sync worker slot acquisition/release
which does not seem right.
Let's say I bring down primary for promotion of standby and then
promote standby, there are chances that it may end up invalidating
synced slots (considering standby is not brought down during promotion
and thus inactive_timeout may already be past 'last_inactive_time').
This raises the question of whether we need to set
'last_inactive_time' for synced slots on the standby?
Yeah, I think that last_inactive_time should stay at 0 on synced slots on the
standby because such slots are not usable anyway (until the standby gets promoted).
So, I think that last_inactive_time does not make sense if the slot never had
the chance to be active.
OTOH I think the timeout invalidation (if any) should be synced from primary.
Yes, even I feel that last_inactive_time makes sense only when the
slot is available to be used. Synced slots are not available to be
used until standby is promoted and thus last_inactive_time can be
skipped to be set for synced_slots. But once the slot on the primary is invalidated due
to inactive-timeout, that invalidation should be synced to standby
(which is happening currently).
thanks
Shveta
Hi,
On Mon, Mar 25, 2024 at 02:07:21PM +0530, shveta malik wrote:
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:Hi,
On Mon, Mar 25, 2024 at 12:59:52PM +0530, Amit Kapila wrote:
On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote:
On Mon, Mar 25, 2024 at 11:53 AM shveta malik <shveta.malik@gmail.com> wrote:
On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote:
On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
I've attached the v18 patch set here.
I have one concern, for synced slots on standby, how do we disallow
invalidation due to inactive-timeout immediately after promotion?
For synced slots, last_inactive_time and inactive_timeout are both
set.
Yeah, and I can see last_inactive_time is moving on the standby (while not the
case on the primary), probably due to the sync worker slot acquisition/release
which does not seem right.
Let's say I bring down primary for promotion of standby and then
promote standby, there are chances that it may end up invalidating
synced slots (considering standby is not brought down during promotion
and thus inactive_timeout may already be past 'last_inactive_time').
This raises the question of whether we need to set
'last_inactive_time' for synced slots on the standby?
Yeah, I think that last_inactive_time should stay at 0 on synced slots on the
standby because such slots are not usable anyway (until the standby gets promoted).
So, I think that last_inactive_time does not make sense if the slot never had
the chance to be active.
OTOH I think the timeout invalidation (if any) should be synced from primary.
Yes, even I feel that last_inactive_time makes sense only when the
slot is available to be used. Synced slots are not available to be
used until standby is promoted and thus last_inactive_time can be
skipped to be set for synced_slots. But once the slot on the primary is invalidated due
to inactive-timeout, that invalidation should be synced to standby
(which is happening currently).
yeah, syncing the invalidation and always keeping last_inactive_time to zero
for synced slots looks right to me.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hi,
Yeah, and I can see last_inactive_time is moving on the standby (while not the
case on the primary), probably due to the sync worker slot acquisition/release
which does not seem right.
Yes, you are right, last_inactive_time keeps on moving for synced
slots on standby. Once I disabled slot-sync worker, then it is
constant. Then it only changes if I call pg_sync_replication_slots().
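(For reference, a minimal way to reproduce that observation on the standby,
with the slot-sync worker disabled and using logical_slot1 only as an example
slot name:
SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = 'logical_slot1';
SELECT pg_sync_replication_slots();
SELECT last_inactive_time FROM pg_replication_slots WHERE slot_name = 'logical_slot1';
The timestamp returned by the second lookup is newer after every such sync.)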
On a different note, I noticed that we allow altering
inactive_timeout for synced-slots on standby. And again overwrite it
with the primary's value in the next sync cycle. Steps:
====================
--Check pg_replication_slots for synced slot on standby, inactive_timeout is 120
slot_name | failover | synced | active | inactive_timeout
---------------+----------+--------+--------+------------------
logical_slot1 | t | t | f | 120
--Alter on standby
SELECT 'alter' FROM pg_alter_replication_slot('logical_slot1', 900);
--Check pg_replication_slots:
slot_name | failover | synced | active | inactive_timeout
---------------+----------+--------+--------+------------------
logical_slot1 | t | t | f | 900
--Run sync function
SELECT pg_sync_replication_slots();
--check again, inactive_timeout is set back to primary's value.
slot_name | failover | synced | active | inactive_timeout
---------------+----------+--------+--------+------------------
logical_slot1 | t | t | f | 120
====================
I feel altering synced slot's inactive_timeout should be prohibited on
standby. It should be in sync with primary always. Thoughts?
I am listing the concerns raised by me:
1) create-subscription with create_slot=false overwriting
inactive_timeout of an existing slot ([1])
2) last_inactive_time set for synced slots may result in invalidation
of the slot on promotion ([2])
3) alter replication slot to alter inactive_timeout for synced slots on
standby, should this be allowed?
[1]: /messages/by-id/CAJpy0uAqBi+GbNn2ngJ-A_Z905CD3ss896bqY2ACUjGiF1Gkng@mail.gmail.com
[2]: /messages/by-id/CAJpy0uCLu+mqAwAMum=pXE9YYsy0BE7hOSw_Wno5vjwpFY=63g@mail.gmail.com
thanks
Shveta
Hi,
On Mon, Mar 25, 2024 at 02:39:50PM +0530, shveta malik wrote:
I am listing the concerns raised by me:
3) alter replication slot to alter inactive_timeout for synced slots on
standby, should this be allowed?
I don't think it should be allowed.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
I have one concern, for synced slots on standby, how do we disallow
invalidation due to inactive-timeout immediately after promotion?
For synced slots, last_inactive_time and inactive_timeout are both
set.
Yeah, and I can see last_inactive_time is moving on the standby (while not the
case on the primary), probably due to the sync worker slot acquisition/release
which does not seem right.
Let's say I bring down primary for promotion of standby and then
promote standby, there are chances that it may end up invalidating
synced slots (considering standby is not brought down during promotion
and thus inactive_timeout may already be past 'last_inactive_time').
This raises the question of whether we need to set
'last_inactive_time' for synced slots on the standby?
Yeah, I think that last_inactive_time should stay at 0 on synced slots on the
standby because such slots are not usable anyway (until the standby gets promoted).
So, I think that last_inactive_time does not make sense if the slot never had
the chance to be active.
Right. Done that way i.e. not setting the last_inactive_time for slots
both while releasing the slot and restoring from the disk.
Also, I've added a TAP function to check if the captured times are
sane per Bertrand's review comment.
Please see the attached v20 patch.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v20-0001-Track-last_inactive_time-in-pg_replication_slots.patchapplication/octet-stream; name=v20-0001-Track-last_inactive_time-in-pg_replication_slots.patchDownload
From a9907426dfda7ceb8e28a45f8711cca676d91949 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 25 Mar 2024 09:47:05 +0000
Subject: [PATCH v20] Track last_inactive_time in pg_replication_slots.
Until now, the time at which a replication slot became inactive
was not tracked directly in pg_replication_slots. This commit adds
a new property called last_inactive_time for this. It is set to 0
whenever a slot is made active/acquired and set to current
timestamp whenever the slot is inactive/released or restored from
the disk.
The new property will be useful on production servers to debug and
analyze inactive replication slots. It will also help to know the
lifetime of a replication slot - one can know how long a streaming
standby, logical subscriber, or replication slot consumer is down.
The new property will be useful to implement inactive timeout based
replication slot invalidation in a future commit.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Reviewed-by: Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 10 ++
src/backend/catalog/system_views.sql | 1 +
src/backend/replication/slot.c | 34 +++++
src/backend/replication/slotfuncs.c | 7 +-
src/include/catalog/pg_proc.dat | 6 +-
src/include/replication/slot.h | 3 +
src/test/recovery/t/019_replslot_limit.pl | 152 ++++++++++++++++++++++
src/test/regress/expected/rules.out | 3 +-
8 files changed, 211 insertions(+), 5 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index b5da476c20..5f4165a945 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2523,6 +2523,16 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>last_inactive_time</structfield> <type>timestamptz</type>
+ </para>
+ <para>
+ The time at which the slot became inactive.
+ <literal>NULL</literal> if the slot is currently being used.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>conflicting</structfield> <type>bool</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index f69b7f5580..bc70ff193e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,6 +1023,7 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
+ L.last_inactive_time,
L.conflicting,
L.invalidation_reason,
L.failover,
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index cdf0c450c5..0919e4e56b 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -409,6 +409,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->candidate_restart_valid = InvalidXLogRecPtr;
slot->candidate_restart_lsn = InvalidXLogRecPtr;
slot->last_saved_confirmed_flush = InvalidXLogRecPtr;
+ slot->last_inactive_time = 0;
/*
* Create the slot on disk. We haven't actually marked the slot allocated
@@ -622,6 +623,11 @@ retry:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
+ /* Reset the last inactive time as the slot is active now. */
+ SpinLockAcquire(&s->mutex);
+ s->last_inactive_time = 0;
+ SpinLockRelease(&s->mutex);
+
if (am_walsender)
{
ereport(log_replication_commands ? LOG : DEBUG1,
@@ -645,6 +651,7 @@ ReplicationSlotRelease(void)
ReplicationSlot *slot = MyReplicationSlot;
char *slotname = NULL; /* keep compiler quiet */
bool is_logical = false; /* keep compiler quiet */
+ TimestampTz now = 0;
Assert(slot != NULL && slot->active_pid != 0);
@@ -679,6 +686,15 @@ ReplicationSlotRelease(void)
ReplicationSlotsComputeRequiredXmin(false);
}
+ /*
+ * Set the last inactive time after marking the slot inactive. Exempted
+ * are the slots that are currently being synced from the primary to the
+ * standby, because such slots are typically inactive in the sense that
+ * they don't have associated walsender process to replicate.
+ */
+ if (!(RecoveryInProgress() && slot->data.synced))
+ now = GetCurrentTimestamp();
+
if (slot->data.persistency == RS_PERSISTENT)
{
/*
@@ -687,9 +703,16 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
+ slot->last_inactive_time = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
+ else
+ {
+ SpinLockAcquire(&slot->mutex);
+ slot->last_inactive_time = now;
+ SpinLockRelease(&slot->mutex);
+ }
MyReplicationSlot = NULL;
@@ -2342,6 +2365,17 @@ RestoreSlotFromDisk(const char *name)
slot->in_use = true;
slot->active_pid = 0;
+ /*
+ * We set the last inactive time after loading the slot from the disk
+ * into memory. Whoever acquires the slot i.e. makes the slot active
+ * will reset it. Exempted are the slots that are currently being
+ * synced from the primary to the standby, because such slots are
+ * typically inactive in the sense that they don't have associated
+ * walsender process to replicate.
+ */
+ if (!(RecoveryInProgress() && slot->data.synced))
+ slot->last_inactive_time = GetCurrentTimestamp();
+
restored = true;
break;
}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 4232c1e52e..24f5e6d90a 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -239,7 +239,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 18
+#define PG_GET_REPLICATION_SLOTS_COLS 19
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -410,6 +410,11 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
+ if (slot_contents.last_inactive_time > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.last_inactive_time);
+ else
+ nulls[i++] = true;
+
cause = slot_contents.data.invalidated;
if (SlotIsPhysical(&slot_contents))
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 71c74350a0..0d26e5b422 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11133,9 +11133,9 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting,invalidation_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,timestamptz,bool,text,bool,bool}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,last_inactive_time,conflicting,invalidation_reason,failover,synced}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7f25a083ee..eefd7abd39 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -201,6 +201,9 @@ typedef struct ReplicationSlot
* forcibly flushed or not.
*/
XLogRecPtr last_saved_confirmed_flush;
+
+ /* The time at which this slot becomes inactive */
+ TimestampTz last_inactive_time;
} ReplicationSlot;
#define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid)
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index fe00370c3e..3409cf88cd 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -410,4 +410,156 @@ kill 'CONT', $receiverpid;
$node_primary3->stop;
$node_standby3->stop;
+# =============================================================================
+# Testcase start: Check last_inactive_time property of the streaming standby's slot
+#
+
+# Initialize primary node
+my $primary4 = PostgreSQL::Test::Cluster->new('primary4');
+$primary4->init(allows_streaming => 'logical');
+$primary4->start;
+
+# Take backup
+$backup_name = 'my_backup4';
+$primary4->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby4 = PostgreSQL::Test::Cluster->new('standby4');
+$standby4->init_from_backup($primary4, $backup_name, has_streaming => 1);
+
+my $sb4_slot = 'sb4_slot';
+$standby4->append_conf('postgresql.conf', "primary_slot_name = '$sb4_slot'");
+
+my $slot_creation_time = $primary4->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
+$primary4->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := '$sb4_slot');
+]);
+
+# Get last_inactive_time value after the slot's creation. Note that the slot
+# is still inactive till it's used by the standby below.
+my $last_inactive_time =
+ capture_and_validate_slot_last_inactive_time($primary4, $sb4_slot, $slot_creation_time);
+
+$standby4->start;
+
+# Wait until standby has replayed enough data
+$primary4->wait_for_catchup($standby4);
+
+# Now the slot is active so last_inactive_time value must be NULL
+is( $primary4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$sb4_slot';]
+ ),
+ 't',
+ 'last inactive time for an active physical slot is NULL');
+
+# Stop the standby to check its last_inactive_time value is updated
+$standby4->stop;
+
+# Let's restart the primary so that the last_inactive_time is set upon
+# loading the slot from the disk.
+$primary4->restart;
+
+is( $primary4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND last_inactive_time IS NOT NULL;]
+ ),
+ 't',
+ 'last inactive time for an inactive physical slot is updated correctly');
+
+$standby4->stop;
+
+# Testcase end: Check last_inactive_time property of the streaming standby's slot
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Check last_inactive_time property of the logical subscriber's slot
+my $publisher4 = $primary4;
+
+# Create subscriber node
+my $subscriber4 = PostgreSQL::Test::Cluster->new('subscriber4');
+$subscriber4->init;
+
+# Setup logical replication
+my $publisher4_connstr = $publisher4->connstr . ' dbname=postgres';
+$publisher4->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+
+$slot_creation_time = $publisher4->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
+my $lsub4_slot = 'lsub4_slot';
+$publisher4->safe_psql('postgres',
+ "SELECT pg_create_logical_replication_slot(slot_name := '$lsub4_slot', plugin := 'pgoutput');"
+);
+
+# Get last_inactive_time value after the slot's creation. Note that the slot
+# is still inactive till it's used by the subscriber below.
+$last_inactive_time =
+ capture_and_validate_slot_last_inactive_time($publisher4, $lsub4_slot, $slot_creation_time);
+
+$subscriber4->start;
+$subscriber4->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher4_connstr' PUBLICATION pub WITH (slot_name = '$lsub4_slot', create_slot = false)"
+);
+
+# Wait until subscriber has caught up
+$subscriber4->wait_for_subscription_sync($publisher4, 'sub');
+
+# Now the slot is active so last_inactive_time value must be NULL
+is( $publisher4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub4_slot';]
+ ),
+ 't',
+ 'last inactive time for an active logical slot is NULL');
+
+# Stop the subscriber to check its last_inactive_time value is updated
+$subscriber4->stop;
+
+# Let's restart the publisher so that the last_inactive_time is set upon
+# loading the slot from the disk.
+$publisher4->restart;
+
+is( $publisher4->safe_psql(
+ 'postgres',
+ qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND last_inactive_time IS NOT NULL;]
+ ),
+ 't',
+ 'last inactive time for an inactive logical slot is updated correctly');
+
+# Testcase end: Check last_inactive_time property of the logical subscriber's slot
+# =============================================================================
+
+$publisher4->stop;
+$subscriber4->stop;
+
+# Capture and validate last_inactive_time of a given slot.
+sub capture_and_validate_slot_last_inactive_time
+{
+ my ($node, $slot_name, $slot_creation_time) = @_;
+
+ my $last_inactive_time = $node->safe_psql('postgres',
+ qq(SELECT last_inactive_time FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND last_inactive_time IS NOT NULL;)
+ );
+
+ # Check that the captured time is sane
+ is( $node->safe_psql(
+ 'postgres',
+ qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0) AND
+ '$last_inactive_time'::timestamptz >= '$slot_creation_time'::timestamptz;]
+ ),
+ 't',
+ "last inactive time for an active slot $slot_name is sane");
+
+ return $last_inactive_time;
+}
+
done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 18829ea586..dfcbaec387 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,11 +1473,12 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
+ l.last_inactive_time,
l.conflicting,
l.invalidation_reason,
l.failover,
l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting, invalidation_reason, failover, synced)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, last_inactive_time, conflicting, invalidation_reason, failover, synced)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
On Mon, Mar 25, 2024 at 3:31 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Right. Done that way i.e. not setting the last_inactive_time for slots
both while releasing the slot and restoring from the disk.
Also, I've added a TAP function to check if the captured times are
sane per Bertrand's review comment.
Please see the attached v20 patch.
Thanks for the patch. The issue of unnecessary invalidation of synced
slots on promotion is resolved in this patch.
thanks
Shveta
On Mon, Mar 25, 2024 at 3:31 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Right. Done that way i.e. not setting the last_inactive_time for slots
both while releasing the slot and restoring from the disk.
Also, I've added a TAP function to check if the captured times are
sane per Bertrand's review comment.
Please see the attached v20 patch.
Pushed, after minor changes.
--
With Regards,
Amit Kapila.
On Mon, Mar 25, 2024 at 2:40 PM shveta malik <shveta.malik@gmail.com> wrote:
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hi,
Yeah, and I can see last_inactive_time is moving on the standby (while not the
case on the primary), probably due to the sync worker slot acquisition/release
which does not seem right.
Yes, you are right, last_inactive_time keeps on moving for synced
slots on standby. Once I disabled slot-sync worker, then it is
constant. Then it only changes if I call pg_sync_replication_slots().
On a different note, I noticed that we allow altering
inactive_timeout for synced-slots on standby. And again overwrite it
with the primary's value in the next sync cycle. Steps:
====================
--Check pg_replication_slots for synced slot on standby, inactive_timeout is 120
slot_name | failover | synced | active | inactive_timeout
---------------+----------+--------+--------+------------------
logical_slot1 | t | t | f | 120
--Alter on standby
SELECT 'alter' FROM pg_alter_replication_slot('logical_slot1', 900);
I think we should keep pg_alter_replication_slot() as the last
priority among the remaining patches for this release. Let's try to
first finish the primary functionality of inactive_timeout patch.
Otherwise, I agree that the problem reported by you should be fixed.
--
With Regards,
Amit Kapila.
On Mon, Mar 25, 2024 at 5:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think we should keep pg_alter_replication_slot() as the last
priority among the remaining patches for this release. Let's try to
first finish the primary functionality of inactive_timeout patch.
Otherwise, I agree that the problem reported by you should be fixed.
Noted. Will focus on v18-002 patch now.
I was debugging the flow and just noticed that RecoveryInProgress()
always returns 'true' during
StartupReplicationSlots()-->RestoreSlotFromDisk() (even on primary) as
'xlogctl->SharedRecoveryState' is always 'RECOVERY_STATE_CRASH' at
that time. The 'xlogctl->SharedRecoveryState' is changed to
'RECOVERY_STATE_DONE' on primary and to 'RECOVERY_STATE_ARCHIVE' on
standby at a later stage in StartupXLOG() (after we are done loading
slots).
The impact of this is, the condition in RestoreSlotFromDisk() in v20-001:
if (!(RecoveryInProgress() && slot->data.synced))
slot->last_inactive_time = GetCurrentTimestamp();
is merely equivalent to:
if (!slot->data.synced)
slot->last_inactive_time = GetCurrentTimestamp();
Thus on primary, after restart, last_inactive_at is set correctly,
while on promoted standby (new primary), last_inactive_at is always
NULL after restart for the synced slots.
thanks
Shveta
I apologize that I haven't been able to keep up with this thread for a
while, but I'm happy to see the continued interest in $SUBJECT.
On Sun, Mar 24, 2024 at 03:05:44PM +0530, Bharath Rupireddy wrote:
This commit particularly lets one specify the inactive_timeout for
a slot via SQL functions pg_create_physical_replication_slot and
pg_create_logical_replication_slot.
Off-list, Bharath brought to my attention that the current proposal was to
set the timeout at the slot level. While I think that is an entirely
reasonable thing to support, the main use-case I have in mind for this
feature is for an administrator that wants to prevent inactive slots from
causing problems (e.g., transaction ID wraparound) on a server or a number
of servers. For that use-case, I think a GUC would be much more
convenient. Perhaps there could be a default inactive slot timeout GUC
that would be used in the absence of a slot-level setting. Thoughts?
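To make that use-case concrete, here is a purely illustrative sketch (the GUC
name and value are assumptions; the posted patches only support a per-slot
setting, so no such GUC exists yet):
-- hypothetical server-wide default for invalidating inactive slots
ALTER SYSTEM SET inactive_replication_slot_timeout = '2d';
SELECT pg_reload_conf();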
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote:
I have one concern, for synced slots on standby, how do we disallow
invalidation due to inactive-timeout immediately after promotion?
For synced slots, last_inactive_time and inactive_timeout are both
set. Let's say I bring down primary for promotion of standby and then
promote standby, there are chances that it may end up invalidating
synced slots (considering standby is not brought down during promotion
and thus inactive_timeout may already be past 'last_inactive_time').
On standby, if we decide to maintain valid last_inactive_time for
synced slots, then invalidation is correctly restricted in
InvalidateSlotForInactiveTimeout() for synced slots using the check:
if (RecoveryInProgress() && slot->data.synced)
return false;
But immediately after promotion, we cannot rely on the above check,
and thus there is a possibility of synced slot invalidation. To
maintain consistent behavior regarding the setting of
last_inactive_time for synced slots, similar to user slots, one
potential solution to prevent this invalidation issue is to update the
last_inactive_time of all synced slots within the ShutDownSlotSync()
function during FinishWalRecovery(). This approach ensures that
promotion doesn't immediately invalidate slots, and henceforth, we
possess a correct last_inactive_time as a basis for invalidation going
forward. This will be equivalent to updating last_inactive_time during
restart (but without actual restart during promotion).
The plus point of maintaining last_inactive_time for synced slots
could be that it provides data to the user on when the sync was last
attempted on that particular slot by the background slot sync worker
or the SQL function. Thoughts?
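(As a quick sanity check of that behavior, a sketch one could run right after
promotion, before the timeout can kick in:
SELECT slot_name, last_inactive_time, now() - last_inactive_time AS idle_for
FROM pg_replication_slots WHERE synced;
With the proposed change, idle_for should be near zero for every synced slot
instead of reflecting the old pre-shutdown timestamps.)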
thanks
Shveta
On Tue, Mar 26, 2024 at 1:24 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Sun, Mar 24, 2024 at 03:05:44PM +0530, Bharath Rupireddy wrote:
This commit particularly lets one specify the inactive_timeout for
a slot via SQL functions pg_create_physical_replication_slot and
pg_create_logical_replication_slot.
Off-list, Bharath brought to my attention that the current proposal was to
set the timeout at the slot level. While I think that is an entirely
reasonable thing to support, the main use-case I have in mind for this
feature is for an administrator that wants to prevent inactive slots from
causing problems (e.g., transaction ID wraparound) on a server or a number
of servers. For that use-case, I think a GUC would be much more
convenient. Perhaps there could be a default inactive slot timeout GUC
that would be used in the absence of a slot-level setting. Thoughts?
Yeah, that is a valid point. One of the reasons for keeping it at the slot
level was to allow different subscribers/output plugins to have a
different inactive_timeout setting for their respective slots based
on their usage. Now, having it as a GUC also has some valid use cases,
as pointed out by you, but I am not sure having both at the slot level and
at the GUC level is required. I was a bit inclined to have it at the slot
level for now and then, based on field usage reports, we can add a
GUC later as well.
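For example, the per-slot setting would look roughly like this (a sketch only;
the inactive_timeout parameter comes from the patch set in this thread and its
exact name/signature may still change):
-- per-slot timeout; the values seen earlier in the thread (60, 120, 900)
-- suggest seconds
SELECT pg_create_logical_replication_slot('sub1_slot', 'pgoutput', inactive_timeout := 120);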
--
With Regards,
Amit Kapila.
On Tue, Mar 26, 2024 at 9:30 AM shveta malik <shveta.malik@gmail.com> wrote:
On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote:
I have one concern, for synced slots on standby, how do we disallow
invalidation due to inactive-timeout immediately after promotion?
For synced slots, last_inactive_time and inactive_timeout are both
set. Let's say I bring down primary for promotion of standby and then
promote standby, there are chances that it may end up invalidating
synced slots (considering standby is not brought down during promotion
and thus inactive_timeout may already be past 'last_inactive_time').
On standby, if we decide to maintain valid last_inactive_time for
synced slots, then invalidation is correctly restricted in
InvalidateSlotForInactiveTimeout() for synced slots using the check:
if (RecoveryInProgress() && slot->data.synced)
return false;
But immediately after promotion, we cannot rely on the above check,
and thus there is a possibility of synced slot invalidation. To
maintain consistent behavior regarding the setting of
last_inactive_time for synced slots, similar to user slots, one
potential solution to prevent this invalidation issue is to update the
last_inactive_time of all synced slots within the ShutDownSlotSync()
function during FinishWalRecovery(). This approach ensures that
promotion doesn't immediately invalidate slots, and henceforth, we
possess a correct last_inactive_time as a basis for invalidation going
forward. This will be equivalent to updating last_inactive_time during
restart (but without actual restart during promotion).
The plus point of maintaining last_inactive_time for synced slots
could be that it provides data to the user on when the sync was last
attempted on that particular slot by the background slot sync worker
or the SQL function. Thoughts?
Please find the attached v21 patch implementing the above idea. It
also has changes for renaming last_inactive_time to inactive_since.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v21-0001-Fix-review-comments-for-slot-s-last_inactive_tim.patchapplication/octet-stream; name=v21-0001-Fix-review-comments-for-slot-s-last_inactive_tim.patchDownload
From fc9e61195b19768eacc856f72c185323483d2187 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 26 Mar 2024 05:33:39 +0000
Subject: [PATCH v21] Fix review comments for slot's last_inactive_time
property
This commit addresses review comments received for the slot's
last_inactive_time property added by commit a11f330b55. It does
the following:
1. The name last_inactive_time seems confusing. With that, one
expects it to tell the last time that the slot was inactive. But,
it tells the last time that a currently-inactive slot previously
*WAS* active.
This commit uses a less confusing name inactive_since for the
property. Other names considered were released_time and
deactivated_at, but inactive_since won the race since the word
inactive is predominant as far as the replication slots are
concerned.
2. The slot's last_inactive_time isn't currently maintained for
synced slots on the standby. The commit a11f330b55 prevents
updating last_inactive_time with RecoveryInProgress() check in
RestoreSlotFromDisk(). But, the issue is that RecoveryInProgress()
always returns true in RestoreSlotFromDisk() as
'xlogctl->SharedRecoveryState' is always 'RECOVERY_STATE_CRASH' at
that time. The impact of this is that, on a promoted standby,
last_inactive_time is always NULL for all synced slots even after a
server restart.
The above issue led us to the question of why we can't maintain
last_inactive_time for synced slots on the standby. There's a
use-case for having it, as it can tell the last synced time on
the standby, apart from fixing the above issue. So, this commit does
two things: a) maintains last_inactive_time for such slots,
b) ensures the value is set to the current timestamp during
shutdown to help correctly interpret the time if the standby gets
promoted without a restart.
Reported-by: Robert Haas, Shveta Malik
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/ZgGrCBQoktdLi1Ir%40ip-10-97-1-34.eu-west-3.compute.internal
Discussion: https://www.postgresql.org/message-id/CAJpy0uB-yE%2BRiw7JQ4hW0%2BigJxvPc%2Brq%2B9c7WyTa1Jz7%2B2gAiA%40mail.gmail.com
---
doc/src/sgml/system-views.sgml | 4 +-
src/backend/catalog/system_views.sql | 2 +-
src/backend/replication/logical/slotsync.c | 43 +++++++++++++
src/backend/replication/slot.c | 38 +++++-------
src/backend/replication/slotfuncs.c | 4 +-
src/include/catalog/pg_proc.dat | 2 +-
src/include/replication/slot.h | 4 +-
src/test/recovery/t/019_replslot_limit.pl | 62 +++++++++----------
.../t/040_standby_failover_slots_sync.pl | 34 ++++++++++
src/test/regress/expected/rules.out | 4 +-
10 files changed, 135 insertions(+), 62 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 5f4165a945..19a08ca0ac 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,10 +2525,10 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>last_inactive_time</structfield> <type>timestamptz</type>
+ <structfield>inactive_since</structfield> <type>timestamptz</type>
</para>
<para>
- The time at which the slot became inactive.
+ The time since the slot has become inactive.
<literal>NULL</literal> if the slot is currently being used.
</para></entry>
</row>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index bc70ff193e..401fb35947 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.last_inactive_time,
+ L.inactive_since,
L.conflicting,
L.invalidation_reason,
L.failover,
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..cfb5affeaa 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -140,6 +140,7 @@ typedef struct RemoteSlot
} RemoteSlot;
static void slotsync_failure_callback(int code, Datum arg);
+static void reset_synced_slots_info(void);
/*
* If necessary, update the local synced slot's metadata based on the data
@@ -1296,6 +1297,45 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
Assert(false);
}
+/*
+ * Reset the synced slots info such as inactive_since after shutting
+ * down the slot sync machinery.
+ */
+static void
+reset_synced_slots_info(void)
+{
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is a synchronized slot */
+ if (s->in_use && s->data.synced)
+ {
+ TimestampTz now;
+
+ Assert(SlotIsLogical(s));
+ Assert(s->active_pid == 0);
+
+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart. We get the
+ * current time beforehand to avoid a system call while holding
+ * the lock.
+ */
+ now = GetCurrentTimestamp();
+
+ SpinLockAcquire(&s->mutex);
+ s->inactive_since = now;
+ SpinLockRelease(&s->mutex);
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
+}
+
/*
* Shut down the slot sync worker.
*/
@@ -1309,6 +1349,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ reset_synced_slots_info();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1341,6 +1382,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ reset_synced_slots_info();
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 45f7a28f7d..860c7fbeb0 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -409,7 +409,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->candidate_restart_valid = InvalidXLogRecPtr;
slot->candidate_restart_lsn = InvalidXLogRecPtr;
slot->last_saved_confirmed_flush = InvalidXLogRecPtr;
- slot->last_inactive_time = 0;
+ slot->inactive_since = 0;
/*
* Create the slot on disk. We haven't actually marked the slot allocated
@@ -623,9 +623,12 @@ retry:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
- /* Reset the last inactive time as the slot is active now. */
+ /*
+ * Reset the time since the slot has become inactive as the slot is active
+ * now.
+ */
SpinLockAcquire(&s->mutex);
- s->last_inactive_time = 0;
+ s->inactive_since = 0;
SpinLockRelease(&s->mutex);
if (am_walsender)
@@ -651,7 +654,7 @@ ReplicationSlotRelease(void)
ReplicationSlot *slot = MyReplicationSlot;
char *slotname = NULL; /* keep compiler quiet */
bool is_logical = false; /* keep compiler quiet */
- TimestampTz now = 0;
+ TimestampTz now;
Assert(slot != NULL && slot->active_pid != 0);
@@ -687,13 +690,11 @@ ReplicationSlotRelease(void)
}
/*
- * Set the last inactive time after marking the slot inactive. We don't
- * set it for the slots currently being synced from the primary to the
- * standby because such slots are typically inactive as decoding is not
- * allowed on those.
+ * Set the time since the slot has become inactive after marking it
+ * inactive. We get the current time beforehand to avoid a system call
+ * while holding the lock.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
- now = GetCurrentTimestamp();
+ now = GetCurrentTimestamp();
if (slot->data.persistency == RS_PERSISTENT)
{
@@ -703,14 +704,14 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->last_inactive_time = now;
+ slot->inactive_since = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
{
SpinLockAcquire(&slot->mutex);
- slot->last_inactive_time = now;
+ slot->inactive_since = now;
SpinLockRelease(&slot->mutex);
}
@@ -2366,16 +2367,11 @@ RestoreSlotFromDisk(const char *name)
slot->active_pid = 0;
/*
- * We set the last inactive time after loading the slot from the disk
- * into memory. Whoever acquires the slot i.e. makes the slot active
- * will reset it. We don't set it for the slots currently being synced
- * from the primary to the standby because such slots are typically
- * inactive as decoding is not allowed on those.
+ * We set the time since the slot has become inactive after loading
+ * the slot from the disk into memory. Whoever acquires the slot i.e.
+ * makes the slot active will reset it.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
- slot->last_inactive_time = GetCurrentTimestamp();
- else
- slot->last_inactive_time = 0;
+ slot->inactive_since = GetCurrentTimestamp();
restored = true;
break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 24f5e6d90a..da57177c25 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -410,8 +410,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
- if (slot_contents.last_inactive_time > 0)
- values[i++] = TimestampTzGetDatum(slot_contents.last_inactive_time);
+ if (slot_contents.inactive_since > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.inactive_since);
else
nulls[i++] = true;
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 0d26e5b422..2f7cfc02c6 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11135,7 +11135,7 @@
proargtypes => '',
proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,timestamptz,bool,text,bool,bool}',
proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,last_inactive_time,conflicting,invalidation_reason,failover,synced}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,inactive_since,conflicting,invalidation_reason,failover,synced}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index eefd7abd39..d032ce8d28 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -202,8 +202,8 @@ typedef struct ReplicationSlot
*/
XLogRecPtr last_saved_confirmed_flush;
- /* The time at which this slot becomes inactive */
- TimestampTz last_inactive_time;
+ /* The time since the slot has became inactive */
+ TimestampTz inactive_since;
} ReplicationSlot;
#define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid)
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 3409cf88cd..3b9a306a8b 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -411,7 +411,7 @@ $node_primary3->stop;
$node_standby3->stop;
# =============================================================================
-# Testcase start: Check last_inactive_time property of the streaming standby's slot
+# Testcase start: Check inactive_since property of the streaming standby's slot
#
# Initialize primary node
@@ -440,45 +440,45 @@ $primary4->safe_psql(
SELECT pg_create_physical_replication_slot(slot_name := '$sb4_slot');
]);
-# Get last_inactive_time value after the slot's creation. Note that the slot
-# is still inactive till it's used by the standby below.
-my $last_inactive_time =
- capture_and_validate_slot_last_inactive_time($primary4, $sb4_slot, $slot_creation_time);
+# Get inactive_since value after the slot's creation. Note that the slot is
+# still inactive till it's used by the standby below.
+my $inactive_since =
+ capture_and_validate_slot_inactive_since($primary4, $sb4_slot, $slot_creation_time);
$standby4->start;
# Wait until standby has replayed enough data
$primary4->wait_for_catchup($standby4);
-# Now the slot is active so last_inactive_time value must be NULL
+# Now the slot is active so inactive_since value must be NULL
is( $primary4->safe_psql(
'postgres',
- qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$sb4_slot';]
+ qq[SELECT inactive_since IS NULL FROM pg_replication_slots WHERE slot_name = '$sb4_slot';]
),
't',
'last inactive time for an active physical slot is NULL');
-# Stop the standby to check its last_inactive_time value is updated
+# Stop the standby to check its inactive_since value is updated
$standby4->stop;
-# Let's restart the primary so that the last_inactive_time is set upon
-# loading the slot from the disk.
+# Let's restart the primary so that the inactive_since is set upon loading the
+# slot from the disk.
$primary4->restart;
is( $primary4->safe_psql(
'postgres',
- qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND last_inactive_time IS NOT NULL;]
+ qq[SELECT inactive_since > '$inactive_since'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND inactive_since IS NOT NULL;]
),
't',
'last inactive time for an inactive physical slot is updated correctly');
$standby4->stop;
-# Testcase end: Check last_inactive_time property of the streaming standby's slot
+# Testcase end: Check inactive_since property of the streaming standby's slot
# =============================================================================
# =============================================================================
-# Testcase start: Check last_inactive_time property of the logical subscriber's slot
+# Testcase start: Check inactive_since property of the logical subscriber's slot
my $publisher4 = $primary4;
# Create subscriber node
@@ -499,10 +499,10 @@ $publisher4->safe_psql('postgres',
"SELECT pg_create_logical_replication_slot(slot_name := '$lsub4_slot', plugin := 'pgoutput');"
);
-# Get last_inactive_time value after the slot's creation. Note that the slot
-# is still inactive till it's used by the subscriber below.
-$last_inactive_time =
- capture_and_validate_slot_last_inactive_time($publisher4, $lsub4_slot, $slot_creation_time);
+# Get inactive_since value after the slot's creation. Note that the slot is
+# still inactive till it's used by the subscriber below.
+$inactive_since =
+ capture_and_validate_slot_inactive_since($publisher4, $lsub4_slot, $slot_creation_time);
$subscriber4->start;
$subscriber4->safe_psql('postgres',
@@ -512,54 +512,54 @@ $subscriber4->safe_psql('postgres',
# Wait until subscriber has caught up
$subscriber4->wait_for_subscription_sync($publisher4, 'sub');
-# Now the slot is active so last_inactive_time value must be NULL
+# Now the slot is active so inactive_since value must be NULL
is( $publisher4->safe_psql(
'postgres',
- qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub4_slot';]
+ qq[SELECT inactive_since IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub4_slot';]
),
't',
'last inactive time for an active logical slot is NULL');
-# Stop the subscriber to check its last_inactive_time value is updated
+# Stop the subscriber to check its inactive_since value is updated
$subscriber4->stop;
-# Let's restart the publisher so that the last_inactive_time is set upon
+# Let's restart the publisher so that the inactive_since is set upon
# loading the slot from the disk.
$publisher4->restart;
is( $publisher4->safe_psql(
'postgres',
- qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND last_inactive_time IS NOT NULL;]
+ qq[SELECT inactive_since > '$inactive_since'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND inactive_since IS NOT NULL;]
),
't',
'last inactive time for an inactive logical slot is updated correctly');
-# Testcase end: Check last_inactive_time property of the logical subscriber's slot
+# Testcase end: Check inactive_since property of the logical subscriber's slot
# =============================================================================
$publisher4->stop;
$subscriber4->stop;
-# Capture and validate last_inactive_time of a given slot.
-sub capture_and_validate_slot_last_inactive_time
+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
{
my ($node, $slot_name, $slot_creation_time) = @_;
- my $last_inactive_time = $node->safe_psql('postgres',
- qq(SELECT last_inactive_time FROM pg_replication_slots
- WHERE slot_name = '$slot_name' AND last_inactive_time IS NOT NULL;)
+ my $inactive_since = $node->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
);
# Check that the captured time is sane
is( $node->safe_psql(
'postgres',
- qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0) AND
- '$last_inactive_time'::timestamptz >= '$slot_creation_time'::timestamptz;]
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
),
't',
"last inactive time for an active slot $slot_name is sane");
- return $last_inactive_time;
+ return $inactive_since;
}
done_testing();
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index f47bfd78eb..e64308cbf1 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -178,6 +178,13 @@ $primary->poll_query_until(
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
+# Capture the time before which the logical failover slots are synced/created
+# on the standby.
+my $slots_creation_time = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Synchronize the primary server slots to the standby.
$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
@@ -190,6 +197,11 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Confirm that the logical failover slots that have synced on the standby has
+# got a valid inactive_since value representing the last slot sync time.
+capture_and_validate_slot_inactive_since($standby1, 'lsub1_slot', $slots_creation_time);
+capture_and_validate_slot_inactive_since($standby1, 'lsub2_slot', $slots_creation_time);
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
@@ -773,4 +785,26 @@ is( $subscriber1->safe_psql('postgres', q{SELECT count(*) FROM tab_int;}),
"20",
'data replicated from the new primary');
+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
+{
+ my ($node, $slot_name, $slot_creation_time) = @_;
+
+ my $inactive_since = $node->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
+ );
+
+ # Check that the captured time is sane
+ is( $node->safe_psql(
+ 'postgres',
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
+ ),
+ 't',
+ "last inactive time for a synced slot $slot_name is sane");
+
+ return $inactive_since;
+}
+
done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index dfcbaec387..f53c3036a6 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,12 +1473,12 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.last_inactive_time,
+ l.inactive_since,
l.conflicting,
l.invalidation_reason,
l.failover,
l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, last_inactive_time, conflicting, invalidation_reason, failover, synced)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, inactive_since, conflicting, invalidation_reason, failover, synced)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
Hi,
On Tue, Mar 26, 2024 at 09:30:32AM +0530, shveta malik wrote:
On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote:
I have one concern, for synced slots on standby, how do we disallow
invalidation due to inactive-timeout immediately after promotion?
For synced slots, last_inactive_time and inactive_timeout are both
set. Let's say I bring down primary for promotion of standby and then
promote standby, there are chances that it may end up invalidating
synced slots (considering standby is not brought down during promotion
and thus inactive_timeout may already be past 'last_inactive_time').
On standby, if we decide to maintain valid last_inactive_time for
synced slots, then invalidation is correctly restricted in
InvalidateSlotForInactiveTimeout() for synced slots using the check:
if (RecoveryInProgress() && slot->data.synced)
return false;
Right.
But immediately after promotion, we can not rely on the above check
and thus possibility of synced slots invalidation is there. To
maintain consistent behavior regarding the setting of
last_inactive_time for synced slots, similar to user slots, one
potential solution to prevent this invalidation issue is to update the
last_inactive_time of all synced slots within the ShutDownSlotSync()
function during FinishWalRecovery(). This approach ensures that
promotion doesn't immediately invalidate slots, and henceforth, we
possess a correct last_inactive_time as a basis for invalidation going
forward. This will be equivalent to updating last_inactive_time during
restart (but without actual restart during promotion).
The plus point of maintaining last_inactive_time for synced slots
could be, this can provide data to the user on when last time the sync
was attempted on that particular slot by background slot sync worker
or SQL function. Thoughts?
Yeah, another plus point is that if the primary is down then one could look
at the synced "inactive_since" on the standby to get an idea of it (depends on
the last sync though).
The issue that I can see with your proposal is: what if one synced the slots
manually (with pg_sync_replication_slots()) but does not use the sync worker?
Then I think ShutDownSlotSync() is not going to help in that case.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Sun, Mar 24, 2024 at 3:05 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
I've attached the v18 patch set here. I've also addressed earlier
review comments from Amit, Ajin Cherian. Note that I've added new
invalidation mechanism tests in a separate TAP test file just because
I don't want to clutter or bloat any of the existing files and spread
tests for physical slots and logical slots into separate existing TAP
files.
Review comments on v18_0002 and v18_0005
=======================================
1.
ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover, bool synced)
+ bool two_phase, bool failover, bool synced,
+ int inactive_timeout)
{
ReplicationSlot *slot = NULL;
int i;
@@ -345,6 +348,18 @@ ReplicationSlotCreate(const char *name, bool db_specific,
errmsg("cannot enable failover for a temporary replication slot"));
}
+ if (inactive_timeout > 0)
+ {
+ /*
+ * Do not allow users to set inactive_timeout for temporary slots,
+ * because temporary slots will not be saved to the disk.
+ */
+ if (persistency == RS_TEMPORARY)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot set inactive_timeout for a temporary replication slot"));
+ }
We have decided to update inactive_since for temporary slots. So,
unless there is some reason, we should allow inactive_timeout to also
be set for temporary slots.
2.
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1024,6 +1024,7 @@ CREATE VIEW pg_replication_slots AS
L.safe_wal_size,
L.two_phase,
L.last_inactive_time,
+ L.inactive_timeout,
Shall we keep inactive_timeout before
last_inactive_time/inactive_since? I don't have any strong reason to
propose that way apart from that the former is provided by the user.
3.
@@ -287,6 +288,13 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
slot_contents = *slot;
SpinLockRelease(&slot->mutex);
+ /*
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
+ */
+ if (InvalidateReplicationSlotForInactiveTimeout(slot, false, true, true))
+ invalidated = true;
I don't think we should try to invalidate the slots in
pg_get_replication_slots. This function's purpose is to get the
current information on slots and has no intention to perform any work
for slots. Any error due to invalidation won't be what the user would
be expecting here.
4.
+static bool
+InvalidateSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_control_lock,
+ bool need_mutex)
{
...
...
+ if (need_control_lock)
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
+
+ /*
+ * Check if the slot needs to be invalidated due to inactive_timeout. We
+ * do this with the spinlock held to avoid race conditions -- for example
+ * the restart_lsn could move forward, or the slot could be dropped.
+ */
+ if (need_mutex)
+ SpinLockAcquire(&slot->mutex);
...
I find this combination of parameters a bit strange. Because, say if
need_mutex is false and need_control_lock is true then that means this
function will acquire LWlock after acquiring spinlock which is
unacceptable. Now, this may not happen in practice as the callers
won't pass such a combination but still, this functionality should be
improved.
--
With Regards,
Amit Kapila.
Hi,
On Tue, Mar 26, 2024 at 05:55:11AM +0000, Bertrand Drouvot wrote:
Hi,
On Tue, Mar 26, 2024 at 09:30:32AM +0530, shveta malik wrote:
On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote:
I have one concern, for synced slots on standby, how do we disallow
invalidation due to inactive-timeout immediately after promotion?
For synced slots, last_inactive_time and inactive_timeout are both
set. Let's say I bring down primary for promotion of standby and then
promote standby, there are chances that it may end up invalidating
synced slots (considering standby is not brought down during promotion
and thus inactive_timeout may already be past 'last_inactive_time').
On standby, if we decide to maintain valid last_inactive_time for
synced slots, then invalidation is correctly restricted in
InvalidateSlotForInactiveTimeout() for synced slots using the check:
if (RecoveryInProgress() && slot->data.synced)
return false;
Right.
But immediately after promotion, we can not rely on the above check
and thus possibility of synced slots invalidation is there. To
maintain consistent behavior regarding the setting of
last_inactive_time for synced slots, similar to user slots, one
potential solution to prevent this invalidation issue is to update the
last_inactive_time of all synced slots within the ShutDownSlotSync()
function during FinishWalRecovery(). This approach ensures that
promotion doesn't immediately invalidate slots, and henceforth, we
possess a correct last_inactive_time as a basis for invalidation going
forward. This will be equivalent to updating last_inactive_time during
restart (but without actual restart during promotion).
The plus point of maintaining last_inactive_time for synced slots
could be, this can provide data to the user on when last time the sync
was attempted on that particular slot by background slot sync worker
or SQL function. Thoughts?
Yeah, another plus point is that if the primary is down then one could look
at the synced "inactive_since" on the standby to get an idea of it (depends on
the last sync though).
The issue that I can see with your proposal is: what if one synced the slots
manually (with pg_sync_replication_slots()) but does not use the sync worker?
Then I think ShutDownSlotSync() is not going to help in that case.
It looks like ShutDownSlotSync() is always called (even if sync_replication_slots = off),
so that sounds ok to me (I should have checked the code, I was under the impression
ShutDownSlotSync() was not called if sync_replication_slots = off).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Mar 26, 2024 at 11:36 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
The issue that I can see with your proposal is: what if one synced the slots
manually (with pg_sync_replication_slots()) but does not use the sync worker?
Then I think ShutDownSlotSync() is not going to help in that case.
It looks like ShutDownSlotSync() is always called (even if sync_replication_slots = off),
so that sounds ok to me (I should have checked the code, I was under the impression
ShutDownSlotSync() was not called if sync_replication_slots = off).
Right, it is called irrespective of sync_replication_slots.
thanks
Shveta
On Tue, Mar 26, 2024 at 11:08 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Tue, Mar 26, 2024 at 9:30 AM shveta malik <shveta.malik@gmail.com> wrote:
On Mon, Mar 25, 2024 at 12:43 PM shveta malik <shveta.malik@gmail.com> wrote:
I have one concern, for synced slots on standby, how do we disallow
invalidation due to inactive-timeout immediately after promotion?
For synced slots, last_inactive_time and inactive_timeout are both
set. Let's say I bring down primary for promotion of standby and then
promote standby, there are chances that it may end up invalidating
synced slots (considering standby is not brought down during promotion
and thus inactive_timeout may already be past 'last_inactive_time').
On standby, if we decide to maintain valid last_inactive_time for
synced slots, then invalidation is correctly restricted in
InvalidateSlotForInactiveTimeout() for synced slots using the check:
if (RecoveryInProgress() && slot->data.synced)
return false;
But immediately after promotion, we can not rely on the above check
and thus possibility of synced slots invalidation is there. To
maintain consistent behavior regarding the setting of
last_inactive_time for synced slots, similar to user slots, one
potential solution to prevent this invalidation issue is to update the
last_inactive_time of all synced slots within the ShutDownSlotSync()
function during FinishWalRecovery(). This approach ensures that
promotion doesn't immediately invalidate slots, and henceforth, we
possess a correct last_inactive_time as a basis for invalidation going
forward. This will be equivalent to updating last_inactive_time during
restart (but without actual restart during promotion).
The plus point of maintaining last_inactive_time for synced slots
could be, this can provide data to the user on when last time the sync
was attempted on that particular slot by background slot sync worker
or SQL function. Thoughts?
Please find the attached v21 patch implementing the above idea. It
also has changes for renaming last_inactive_time to inactive_since.
Thanks for the patch. I have tested this patch alone, and it does what
it says. One additional thing which I noticed is that now it sets
inactive_since for temp slots as well, but that idea looks fine to me.
I could not test 'invalidation on promotion bug' with this change, as
that needed rebasing of the rest of the patches.
Few trivial things:
1)
Commit msg:
ensures the value is set to current timestamp during the
shutdown to help correctly interpret the time if the standby gets
promoted without a restart.
shutdown --> shutdown of slot sync worker (as it was not clear if it
is instance shutdown or something else)
2)
'The time since the slot has became inactive'.
has became-->has become
or just became
Please check it in all the files. There are multiple places.
thanks
Shveta
Hi,
On Tue, Mar 26, 2024 at 11:07:51AM +0530, Bharath Rupireddy wrote:
On Tue, Mar 26, 2024 at 9:30 AM shveta malik <shveta.malik@gmail.com> wrote:
But immediately after promotion, we can not rely on the above check
and thus possibility of synced slots invalidation is there. To
maintain consistent behavior regarding the setting of
last_inactive_time for synced slots, similar to user slots, one
potential solution to prevent this invalidation issue is to update the
last_inactive_time of all synced slots within the ShutDownSlotSync()
function during FinishWalRecovery(). This approach ensures that
promotion doesn't immediately invalidate slots, and henceforth, we
possess a correct last_inactive_time as a basis for invalidation going
forward. This will be equivalent to updating last_inactive_time during
restart (but without actual restart during promotion).
The plus point of maintaining last_inactive_time for synced slots
could be, this can provide data to the user on when last time the sync
was attempted on that particular slot by background slot sync worker
or SQL function. Thoughts?
Please find the attached v21 patch implementing the above idea. It
also has changes for renaming last_inactive_time to inactive_since.
Thanks!
A few comments:
1 ===
One trailing whitespace:
Applying: Fix review comments for slot's last_inactive_time property
.git/rebase-apply/patch:433: trailing whitespace.
# got a valid inactive_since value representing the last slot sync time.
warning: 1 line adds whitespace errors.
2 ===
It looks like inactive_since is set to the current timestamp on the standby
each time the sync worker does a cycle:
primary:
postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't';
slot_name | inactive_since
-------------+-------------------------------
lsub27_slot | 2024-03-26 07:39:19.745517+00
lsub28_slot | 2024-03-26 07:40:24.953826+00
standby:
postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't';
slot_name | inactive_since
-------------+-------------------------------
lsub27_slot | 2024-03-26 07:43:56.387324+00
lsub28_slot | 2024-03-26 07:43:56.387338+00
I don't think that should be the case.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
2 ===
It looks like inactive_since is set to the current timestamp on the standby
each time the sync worker does a cycle:
primary:
postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't';
slot_name | inactive_since
-------------+-------------------------------
lsub27_slot | 2024-03-26 07:39:19.745517+00
lsub28_slot | 2024-03-26 07:40:24.953826+00
standby:
postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't';
slot_name | inactive_since
-------------+-------------------------------
lsub27_slot | 2024-03-26 07:43:56.387324+00
lsub28_slot | 2024-03-26 07:43:56.387338+00
I don't think that should be the case.
But why? This is exactly what we discussed in another thread where we
agreed to update inactive_since even for sync slots. In each sync
cycle, we acquire/release the slot, so the inactive_since gets
updated. See synchronize_one_slot().
--
With Regards,
Amit Kapila.
Hi,
On Tue, Mar 26, 2024 at 01:37:21PM +0530, Amit Kapila wrote:
On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
2 ===
It looks like inactive_since is set to the current timestamp on the standby
each time the sync worker does a cycle:
primary:
postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't';
slot_name | inactive_since
-------------+-------------------------------
lsub27_slot | 2024-03-26 07:39:19.745517+00
lsub28_slot | 2024-03-26 07:40:24.953826+00
standby:
postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't';
slot_name | inactive_since
-------------+-------------------------------
lsub27_slot | 2024-03-26 07:43:56.387324+00
lsub28_slot | 2024-03-26 07:43:56.387338+00
I don't think that should be the case.
But why? This is exactly what we discussed in another thread where we
agreed to update inactive_since even for sync slots.
Hum, I thought we agreed to "sync" it and to "update it to current time"
only at promotion time.
I don't think updating inactive_since to the current time during each cycle makes
sense (I mean, I understand the use case: being able to tell when slots were last
synced, but if that is what we want then we should consider an extra view or an
extra field rather than relying on the inactive_since one).
If the primary goes down, not updating inactive_since to the current time also
has a benefit: one can still see, from the standby, the inactive_since of the
primary slots as of the last sync. If we update it to the current time then this
information is lost.
In each sync
cycle, we acquire/release the slot, so the inactive_since gets
updated. See synchronize_one_slot().
Right, and I think we should put an extra condition if in recovery.
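Something along these lines on the path that refreshes inactive_since during a
sync cycle, say (just a sketch of the idea for illustration, not the actual
patch, and the exact placement is of course debatable):

/* Sketch: while in recovery, don't overwrite inactive_since of a synced
 * slot, so that the value coming from the primary is preserved until
 * promotion resets it. */
if (!(RecoveryInProgress() && slot->data.synced))
    slot->inactive_since = now;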
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Mar 26, 2024 at 11:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Review comments on v18_0002 and v18_0005
=======================================
1.
We have decided to update inactive_since for temporary slots. So,
unless there is some reason, we should allow inactive_timeout to also
be set for temporary slots.
WFM. A temporary slot that stays inactive for a long time, even while the
server keeps running (i.e. before a shutdown would remove it), can benefit
from this inactive_timeout based invalidation mechanism. And, I'd also vote
for being consistent across temporary and synced slots.
L.last_inactive_time,
+ L.inactive_timeout,
Shall we keep inactive_timeout before
last_inactive_time/inactive_since? I don't have any strong reason to
propose that way apart from that the former is provided by the user.
Done.
+ if (InvalidateReplicationSlotForInactiveTimeout(slot, false, true, true))
+ invalidated = true;
I don't think we should try to invalidate the slots in
pg_get_replication_slots. This function's purpose is to get the
current information on slots and has no intention to perform any work
for slots. Any error due to invalidation won't be what the user would
be expecting here.
Agree. Removed.
4.
+static bool
+InvalidateSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_control_lock,
+ bool need_mutex)
{
...
...
+ if (need_control_lock)
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
+
+ /*
+ * Check if the slot needs to be invalidated due to inactive_timeout. We
+ * do this with the spinlock held to avoid race conditions -- for example
+ * the restart_lsn could move forward, or the slot could be dropped.
+ */
+ if (need_mutex)
+ SpinLockAcquire(&slot->mutex);
...
I find this combination of parameters a bit strange. Because, say if
need_mutex is false and need_control_lock is true then that means this
function will acquire LWlock after acquiring spinlock which is
unacceptable. Now, this may not happen in practice as the callers
won't pass such a combination but still, this functionality should be
improved.
Right. Either we need both locks or neither. So, changed it to use just one
bool need_locks; when it is set, both the control lock and the spinlock are
acquired and released.
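In other words, the shape is roughly the following (my illustration of the
single-flag idea, not the exact v23 code):

static bool
InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks)
{
    bool        invalidated = false;

    if (need_locks)
    {
        /* LWLock first, then the spinlock, never the other way around */
        LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
        SpinLockAcquire(&slot->mutex);
    }

    /* ... the inactive_timeout check runs here with both locks held ... */

    if (need_locks)
    {
        SpinLockRelease(&slot->mutex);
        LWLockRelease(ReplicationSlotControlLock);
    }

    return invalidated;
}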
On Mon, Mar 25, 2024 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote:
patch 002:
2)
slotsync.c:
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
- true);
+ true, 0);
+ slot->data.inactive_timeout = remote_slot->inactive_timeout;
Is there a reason we are not passing 'remote_slot->inactive_timeout'
to ReplicationSlotCreate() directly?
The slot there gets created as a temporary slot, for which we were not
supporting setting inactive_timeout. But the latest v22 patch supports it,
so remote_slot->inactive_timeout is now passed directly.
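i.e. the call site becomes something like this (a sketch based on the hunk
quoted above):

ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
                      remote_slot->two_phase, remote_slot->failover,
                      true, remote_slot->inactive_timeout);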
3)
slotfuncs.c
pg_create_logical_replication_slot():
+ int inactive_timeout = PG_GETARG_INT32(5);
Can we mention here that timeout is in seconds either in comment or
rename variable to inactive_timeout_secs?
Please do this for create_physical_replication_slot(),
create_logical_replication_slot(),
pg_create_physical_replication_slot() as well.
Added /* in seconds */ next to the variable declaration.
---------
4)
+ int inactive_timeout; /* The amount of time in seconds the slot
+ * is allowed to be inactive. */
} LogicalSlotInfo;
Do we need to mention "before getting invalided" like other places
(in last patch)?
Done.
5)
Same at these two places. "before getting invalided" to be added in
the last patch otherwise the info is incompleted.
+
+ /* The amount of time in seconds the slot is allowed to be inactive */
+ int inactive_timeout;
} ReplicationSlotPersistentData;
+ * inactive_timeout: The amount of time in seconds the slot is allowed to be
+ * inactive. */
void
ReplicationSlotCreate(const char *name, bool db_specific,
Same here. "before getting invalidated" ?
Done.
On Tue, Mar 26, 2024 at 12:04 PM shveta malik <shveta.malik@gmail.com> wrote:
Please find the attached v21 patch implementing the above idea. It
also has changes for renaming last_inactive_time to inactive_since.
Thanks for the patch. I have tested this patch alone, and it does what
it says. One additional thing which I noticed is that now it sets
inactive_since for temp slots as well, but that idea looks fine to me.
Right. Let's be consistent by treating all slots the same.
I could not test 'invalidation on promotion bug' with this change, as
that needed rebasing of the rest of the patches.
Please use the v22 patch set.
Few trivial things:
1)
Commit msg:
ensures the value is set to current timestamp during the
shutdown to help correctly interpret the time if the standby gets
promoted without a restart.
shutdown --> shutdown of slot sync worker (as it was not clear if it
is instance shutdown or something else)
Changed it to "shutdown of slot sync machinery" to be consistent with
the comments.
2)
'The time since the slot has became inactive'.
has became-->has become
or just became
Please check it in all the files. There are multiple places.
Fixed.
Please see the attached v23 patches. I've addressed all the review
comments received so far from Amit and Shveta.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v22-0001-Fix-review-comments-for-slot-s-last_inactive_tim.patchapplication/x-patch; name=v22-0001-Fix-review-comments-for-slot-s-last_inactive_tim.patchDownload
From a19a324c057994025f0486f5016dd67ca39b731b Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 26 Mar 2024 07:56:25 +0000
Subject: [PATCH v22 1/3] Fix review comments for slot's last_inactive_time
property
This commit addresses review comments received for the slot's
last_inactive_time property added by commit a11f330b55. It does
the following:
1. Name last_inactive_time seems confusing. With that, one
expects it to tell the last time that the slot was inactive. But,
it tells the last time that a currently-inactive slot previously
*WAS* active.
This commit uses a less confusing name inactive_since for the
property. Other names considered were released_time,
deactivated_at but inactive_since won the race since the word
inactive is predominant as far as the replication slots are
concerned.
2. The slot's last_inactive_time isn't currently maintained for
synced slots on the standby. The commit a11f330b55 prevents
updating last_inactive_time with RecoveryInProgress() check in
RestoreSlotFromDisk(). But, the issue is that RecoveryInProgress()
always returns true in RestoreSlotFromDisk() as
'xlogctl->SharedRecoveryState' is always 'RECOVERY_STATE_CRASH' at
that time. The impact of this on a promoted standby
last_inactive_at is always NULL for all synced slots even after
server restart.
Above issue led us to a question as to why we can't maintain
last_inactive_time for synced slots on the standby. There's a
use-case for having it that as it can tell the last synced time on
the standby apart from fixing the above issue. So, this commit does
two things a) maintains last_inactive_time for such slots,
b) ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.
Reported-by: Robert Haas, Shveta Malik
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/ZgGrCBQoktdLi1Ir%40ip-10-97-1-34.eu-west-3.compute.internal
Discussion: https://www.postgresql.org/message-id/CAJpy0uB-yE%2BRiw7JQ4hW0%2BigJxvPc%2Brq%2B9c7WyTa1Jz7%2B2gAiA%40mail.gmail.com
---
doc/src/sgml/system-views.sgml | 4 +-
src/backend/catalog/system_views.sql | 2 +-
src/backend/replication/logical/slotsync.c | 43 +++++++++++++
src/backend/replication/slot.c | 38 +++++-------
src/backend/replication/slotfuncs.c | 4 +-
src/include/catalog/pg_proc.dat | 2 +-
src/include/replication/slot.h | 4 +-
src/test/recovery/t/019_replslot_limit.pl | 62 +++++++++----------
.../t/040_standby_failover_slots_sync.pl | 34 ++++++++++
src/test/regress/expected/rules.out | 4 +-
10 files changed, 135 insertions(+), 62 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 5f4165a945..3c8dca8ca3 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,10 +2525,10 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>last_inactive_time</structfield> <type>timestamptz</type>
+ <structfield>inactive_since</structfield> <type>timestamptz</type>
</para>
<para>
- The time at which the slot became inactive.
+ The time since the slot has become inactive.
<literal>NULL</literal> if the slot is currently being used.
</para></entry>
</row>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index bc70ff193e..401fb35947 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.last_inactive_time,
+ L.inactive_since,
L.conflicting,
L.invalidation_reason,
L.failover,
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..bbf9a2c485 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -140,6 +140,7 @@ typedef struct RemoteSlot
} RemoteSlot;
static void slotsync_failure_callback(int code, Datum arg);
+static void reset_synced_slots_info(void);
/*
* If necessary, update the local synced slot's metadata based on the data
@@ -1296,6 +1297,45 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
Assert(false);
}
+/*
+ * Reset the synced slots info such as inactive_since after shutting
+ * down the slot sync machinery.
+ */
+static void
+reset_synced_slots_info(void)
+{
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is a synchronized slot */
+ if (s->in_use && s->data.synced)
+ {
+ TimestampTz now;
+
+ Assert(SlotIsLogical(s));
+ Assert(s->active_pid == 0);
+
+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart. We get the
+ * current time beforehand to avoid a system call while holding
+ * the lock.
+ */
+ now = GetCurrentTimestamp();
+
+ SpinLockAcquire(&s->mutex);
+ s->inactive_since = now;
+ SpinLockRelease(&s->mutex);
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
+}
+
/*
* Shut down the slot sync worker.
*/
@@ -1309,6 +1349,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ reset_synced_slots_info();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1341,6 +1382,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ reset_synced_slots_info();
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 45f7a28f7d..d0a2f440ef 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -409,7 +409,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->candidate_restart_valid = InvalidXLogRecPtr;
slot->candidate_restart_lsn = InvalidXLogRecPtr;
slot->last_saved_confirmed_flush = InvalidXLogRecPtr;
- slot->last_inactive_time = 0;
+ slot->inactive_since = 0;
/*
* Create the slot on disk. We haven't actually marked the slot allocated
@@ -623,9 +623,12 @@ retry:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
- /* Reset the last inactive time as the slot is active now. */
+ /*
+ * Reset the time since the slot has become inactive as the slot is active
+ * now.
+ */
SpinLockAcquire(&s->mutex);
- s->last_inactive_time = 0;
+ s->inactive_since = 0;
SpinLockRelease(&s->mutex);
if (am_walsender)
@@ -651,7 +654,7 @@ ReplicationSlotRelease(void)
ReplicationSlot *slot = MyReplicationSlot;
char *slotname = NULL; /* keep compiler quiet */
bool is_logical = false; /* keep compiler quiet */
- TimestampTz now = 0;
+ TimestampTz now;
Assert(slot != NULL && slot->active_pid != 0);
@@ -687,13 +690,11 @@ ReplicationSlotRelease(void)
}
/*
- * Set the last inactive time after marking the slot inactive. We don't
- * set it for the slots currently being synced from the primary to the
- * standby because such slots are typically inactive as decoding is not
- * allowed on those.
+ * Set the time since the slot has become inactive after marking it
+ * inactive. We get the current time beforehand to avoid a system call
+ * while holding the lock.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
- now = GetCurrentTimestamp();
+ now = GetCurrentTimestamp();
if (slot->data.persistency == RS_PERSISTENT)
{
@@ -703,14 +704,14 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->last_inactive_time = now;
+ slot->inactive_since = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
{
SpinLockAcquire(&slot->mutex);
- slot->last_inactive_time = now;
+ slot->inactive_since = now;
SpinLockRelease(&slot->mutex);
}
@@ -2366,16 +2367,11 @@ RestoreSlotFromDisk(const char *name)
slot->active_pid = 0;
/*
- * We set the last inactive time after loading the slot from the disk
- * into memory. Whoever acquires the slot i.e. makes the slot active
- * will reset it. We don't set it for the slots currently being synced
- * from the primary to the standby because such slots are typically
- * inactive as decoding is not allowed on those.
+ * We set the time since the slot has become inactive after loading
+ * the slot from the disk into memory. Whoever acquires the slot i.e.
+ * makes the slot active will reset it.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
- slot->last_inactive_time = GetCurrentTimestamp();
- else
- slot->last_inactive_time = 0;
+ slot->inactive_since = GetCurrentTimestamp();
restored = true;
break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 24f5e6d90a..da57177c25 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -410,8 +410,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
- if (slot_contents.last_inactive_time > 0)
- values[i++] = TimestampTzGetDatum(slot_contents.last_inactive_time);
+ if (slot_contents.inactive_since > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.inactive_since);
else
nulls[i++] = true;
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 0d26e5b422..2f7cfc02c6 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11135,7 +11135,7 @@
proargtypes => '',
proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,timestamptz,bool,text,bool,bool}',
proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,last_inactive_time,conflicting,invalidation_reason,failover,synced}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,inactive_since,conflicting,invalidation_reason,failover,synced}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index eefd7abd39..7b937d1a0c 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -202,8 +202,8 @@ typedef struct ReplicationSlot
*/
XLogRecPtr last_saved_confirmed_flush;
- /* The time at which this slot becomes inactive */
- TimestampTz last_inactive_time;
+ /* The time since the slot has become inactive */
+ TimestampTz inactive_since;
} ReplicationSlot;
#define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid)
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 3409cf88cd..3b9a306a8b 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -411,7 +411,7 @@ $node_primary3->stop;
$node_standby3->stop;
# =============================================================================
-# Testcase start: Check last_inactive_time property of the streaming standby's slot
+# Testcase start: Check inactive_since property of the streaming standby's slot
#
# Initialize primary node
@@ -440,45 +440,45 @@ $primary4->safe_psql(
SELECT pg_create_physical_replication_slot(slot_name := '$sb4_slot');
]);
-# Get last_inactive_time value after the slot's creation. Note that the slot
-# is still inactive till it's used by the standby below.
-my $last_inactive_time =
- capture_and_validate_slot_last_inactive_time($primary4, $sb4_slot, $slot_creation_time);
+# Get inactive_since value after the slot's creation. Note that the slot is
+# still inactive till it's used by the standby below.
+my $inactive_since =
+ capture_and_validate_slot_inactive_since($primary4, $sb4_slot, $slot_creation_time);
$standby4->start;
# Wait until standby has replayed enough data
$primary4->wait_for_catchup($standby4);
-# Now the slot is active so last_inactive_time value must be NULL
+# Now the slot is active so inactive_since value must be NULL
is( $primary4->safe_psql(
'postgres',
- qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$sb4_slot';]
+ qq[SELECT inactive_since IS NULL FROM pg_replication_slots WHERE slot_name = '$sb4_slot';]
),
't',
'last inactive time for an active physical slot is NULL');
-# Stop the standby to check its last_inactive_time value is updated
+# Stop the standby to check its inactive_since value is updated
$standby4->stop;
-# Let's restart the primary so that the last_inactive_time is set upon
-# loading the slot from the disk.
+# Let's restart the primary so that the inactive_since is set upon loading the
+# slot from the disk.
$primary4->restart;
is( $primary4->safe_psql(
'postgres',
- qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND last_inactive_time IS NOT NULL;]
+ qq[SELECT inactive_since > '$inactive_since'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND inactive_since IS NOT NULL;]
),
't',
'last inactive time for an inactive physical slot is updated correctly');
$standby4->stop;
-# Testcase end: Check last_inactive_time property of the streaming standby's slot
+# Testcase end: Check inactive_since property of the streaming standby's slot
# =============================================================================
# =============================================================================
-# Testcase start: Check last_inactive_time property of the logical subscriber's slot
+# Testcase start: Check inactive_since property of the logical subscriber's slot
my $publisher4 = $primary4;
# Create subscriber node
@@ -499,10 +499,10 @@ $publisher4->safe_psql('postgres',
"SELECT pg_create_logical_replication_slot(slot_name := '$lsub4_slot', plugin := 'pgoutput');"
);
-# Get last_inactive_time value after the slot's creation. Note that the slot
-# is still inactive till it's used by the subscriber below.
-$last_inactive_time =
- capture_and_validate_slot_last_inactive_time($publisher4, $lsub4_slot, $slot_creation_time);
+# Get inactive_since value after the slot's creation. Note that the slot is
+# still inactive till it's used by the subscriber below.
+$inactive_since =
+ capture_and_validate_slot_inactive_since($publisher4, $lsub4_slot, $slot_creation_time);
$subscriber4->start;
$subscriber4->safe_psql('postgres',
@@ -512,54 +512,54 @@ $subscriber4->safe_psql('postgres',
# Wait until subscriber has caught up
$subscriber4->wait_for_subscription_sync($publisher4, 'sub');
-# Now the slot is active so last_inactive_time value must be NULL
+# Now the slot is active so inactive_since value must be NULL
is( $publisher4->safe_psql(
'postgres',
- qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub4_slot';]
+ qq[SELECT inactive_since IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub4_slot';]
),
't',
'last inactive time for an active logical slot is NULL');
-# Stop the subscriber to check its last_inactive_time value is updated
+# Stop the subscriber to check its inactive_since value is updated
$subscriber4->stop;
-# Let's restart the publisher so that the last_inactive_time is set upon
+# Let's restart the publisher so that the inactive_since is set upon
# loading the slot from the disk.
$publisher4->restart;
is( $publisher4->safe_psql(
'postgres',
- qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND last_inactive_time IS NOT NULL;]
+ qq[SELECT inactive_since > '$inactive_since'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND inactive_since IS NOT NULL;]
),
't',
'last inactive time for an inactive logical slot is updated correctly');
-# Testcase end: Check last_inactive_time property of the logical subscriber's slot
+# Testcase end: Check inactive_since property of the logical subscriber's slot
# =============================================================================
$publisher4->stop;
$subscriber4->stop;
-# Capture and validate last_inactive_time of a given slot.
-sub capture_and_validate_slot_last_inactive_time
+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
{
my ($node, $slot_name, $slot_creation_time) = @_;
- my $last_inactive_time = $node->safe_psql('postgres',
- qq(SELECT last_inactive_time FROM pg_replication_slots
- WHERE slot_name = '$slot_name' AND last_inactive_time IS NOT NULL;)
+ my $inactive_since = $node->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
);
# Check that the captured time is sane
is( $node->safe_psql(
'postgres',
- qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0) AND
- '$last_inactive_time'::timestamptz >= '$slot_creation_time'::timestamptz;]
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
),
't',
"last inactive time for an active slot $slot_name is sane");
- return $last_inactive_time;
+ return $inactive_since;
}
done_testing();
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index f47bfd78eb..e7c33c0066 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -178,6 +178,13 @@ $primary->poll_query_until(
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
+# Capture the time before which the logical failover slots are synced/created
+# on the standby.
+my $slots_creation_time = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Synchronize the primary server slots to the standby.
$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
@@ -190,6 +197,11 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Confirm that the logical failover slots that have synced on the standby has
+# got a valid inactive_since value representing the last slot sync time.
+capture_and_validate_slot_inactive_since($standby1, 'lsub1_slot', $slots_creation_time);
+capture_and_validate_slot_inactive_since($standby1, 'lsub2_slot', $slots_creation_time);
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
@@ -773,4 +785,26 @@ is( $subscriber1->safe_psql('postgres', q{SELECT count(*) FROM tab_int;}),
"20",
'data replicated from the new primary');
+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
+{
+ my ($node, $slot_name, $slot_creation_time) = @_;
+
+ my $inactive_since = $node->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
+ );
+
+ # Check that the captured time is sane
+ is( $node->safe_psql(
+ 'postgres',
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
+ ),
+ 't',
+ "last inactive time for a synced slot $slot_name is sane");
+
+ return $inactive_since;
+}
+
done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index dfcbaec387..f53c3036a6 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,12 +1473,12 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.last_inactive_time,
+ l.inactive_since,
l.conflicting,
l.invalidation_reason,
l.failover,
l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, last_inactive_time, conflicting, invalidation_reason, failover, synced)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, inactive_since, conflicting, invalidation_reason, failover, synced)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v22-0002-Allow-setting-inactive_timeout-for-replication-s.patchapplication/x-patch; name=v22-0002-Allow-setting-inactive_timeout-for-replication-s.patchDownload
From bfb5031edf2efefc9461ab649342099925b671e8 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 26 Mar 2024 08:45:01 +0000
Subject: [PATCH v22 2/3] Allow setting inactive_timeout for replication slots
via SQL API.
This commit adds a new replication slot property called
inactive_timeout specifying the amount of time in seconds the slot
is allowed to be inactive. It is added to slot's persistent data
structure to survive during server restarts. It will be synced to
failover slots on the standby, and also will be carried over to
the new cluster as part of pg_upgrade.
This commit particularly lets one specify the inactive_timeout for
a slot via SQL functions pg_create_physical_replication_slot and
pg_create_logical_replication_slot.
The new property will be useful to implement inactive timeout based
replication slot invalidation in a future commit.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
contrib/test_decoding/expected/slot.out | 97 +++++++++++++++++++
contrib/test_decoding/sql/slot.sql | 30 ++++++
doc/src/sgml/func.sgml | 18 ++--
doc/src/sgml/system-views.sgml | 9 ++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 1 +
src/backend/replication/logical/slotsync.c | 17 +++-
src/backend/replication/slot.c | 8 +-
src/backend/replication/slotfuncs.c | 31 +++++-
src/backend/replication/walsender.c | 4 +-
src/bin/pg_upgrade/info.c | 6 +-
src/bin/pg_upgrade/pg_upgrade.c | 5 +-
src/bin/pg_upgrade/pg_upgrade.h | 2 +
src/bin/pg_upgrade/t/003_logical_slots.pl | 11 ++-
src/include/catalog/pg_proc.dat | 22 ++---
src/include/replication/slot.h | 5 +-
.../t/040_standby_failover_slots_sync.pl | 13 ++-
src/test/regress/expected/rules.out | 3 +-
18 files changed, 243 insertions(+), 41 deletions(-)
diff --git a/contrib/test_decoding/expected/slot.out b/contrib/test_decoding/expected/slot.out
index 349ab2d380..c318eceefd 100644
--- a/contrib/test_decoding/expected/slot.out
+++ b/contrib/test_decoding/expected/slot.out
@@ -466,3 +466,100 @@ SELECT pg_drop_replication_slot('physical_slot');
(1 row)
+-- Test negative value for inactive_timeout option for slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', inactive_timeout := -300); -- error
+ERROR: "inactive_timeout" must not be negative
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', inactive_timeout := -600); -- error
+ERROR: "inactive_timeout" must not be negative
+-- Test inactive_timeout option of physical slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot1', immediately_reserve := true, inactive_timeout := 300);
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot2');
+ ?column?
+----------
+ init
+(1 row)
+
+-- Copy physical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_slot1', dst_slot_name := 'it_phy_slot3');
+ ?column?
+----------
+ copy
+(1 row)
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+ slot_name | slot_type | inactive_timeout
+--------------+-----------+------------------
+ it_phy_slot1 | physical | 300
+ it_phy_slot2 | physical | 0
+ it_phy_slot3 | physical | 300
+(3 rows)
+
+SELECT pg_drop_replication_slot('it_phy_slot1');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_phy_slot2');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_phy_slot3');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+-- Test inactive_timeout option of logical slots.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot2', plugin := 'test_decoding');
+ ?column?
+----------
+ init
+(1 row)
+
+-- Copy logical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slot1', dst_slot_name := 'it_log_slot3');
+ ?column?
+----------
+ copy
+(1 row)
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+ slot_name | slot_type | inactive_timeout
+--------------+-----------+------------------
+ it_log_slot1 | logical | 600
+ it_log_slot2 | logical | 0
+ it_log_slot3 | logical | 600
+(3 rows)
+
+SELECT pg_drop_replication_slot('it_log_slot1');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_log_slot2');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_log_slot3');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
diff --git a/contrib/test_decoding/sql/slot.sql b/contrib/test_decoding/sql/slot.sql
index 580e3ae3be..e5c7b3d359 100644
--- a/contrib/test_decoding/sql/slot.sql
+++ b/contrib/test_decoding/sql/slot.sql
@@ -190,3 +190,33 @@ SELECT pg_drop_replication_slot('failover_true_slot');
SELECT pg_drop_replication_slot('failover_false_slot');
SELECT pg_drop_replication_slot('failover_default_slot');
SELECT pg_drop_replication_slot('physical_slot');
+
+-- Test negative value for inactive_timeout option for slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', inactive_timeout := -300); -- error
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', inactive_timeout := -600); -- error
+
+-- Test inactive_timeout option of physical slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot1', immediately_reserve := true, inactive_timeout := 300);
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot2');
+
+-- Copy physical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_slot1', dst_slot_name := 'it_phy_slot3');
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+
+SELECT pg_drop_replication_slot('it_phy_slot1');
+SELECT pg_drop_replication_slot('it_phy_slot2');
+SELECT pg_drop_replication_slot('it_phy_slot3');
+
+-- Test inactive_timeout option of logical slots.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot2', plugin := 'test_decoding');
+
+-- Copy logical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slot1', dst_slot_name := 'it_log_slot3');
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+
+SELECT pg_drop_replication_slot('it_log_slot1');
+SELECT pg_drop_replication_slot('it_log_slot2');
+SELECT pg_drop_replication_slot('it_log_slot3');
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 8ecc02f2b9..2cc26e927a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28373,7 +28373,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<indexterm>
<primary>pg_create_physical_replication_slot</primary>
</indexterm>
- <function>pg_create_physical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type> <optional>, <parameter>immediately_reserve</parameter> <type>boolean</type>, <parameter>temporary</parameter> <type>boolean</type> </optional> )
+ <function>pg_create_physical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type> <optional>, <parameter>immediately_reserve</parameter> <type>boolean</type>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>inactive_timeout</parameter> <type>integer</type> </optional>)
<returnvalue>record</returnvalue>
( <parameter>slot_name</parameter> <type>name</type>,
<parameter>lsn</parameter> <type>pg_lsn</type> )
@@ -28390,9 +28390,12 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
parameter, <parameter>temporary</parameter>, when set to true, specifies that
the slot should not be permanently stored to disk and is only meant
for use by the current session. Temporary slots are also
- released upon any error. This function corresponds
- to the replication protocol command <literal>CREATE_REPLICATION_SLOT
- ... PHYSICAL</literal>.
+ released upon any error. The optional fourth
+ parameter, <parameter>inactive_timeout</parameter>, when set to a
+ non-zero value, specifies the amount of time in seconds the slot is
+ allowed to be inactive. This function corresponds to the replication
+ protocol command
+ <literal>CREATE_REPLICATION_SLOT ... PHYSICAL</literal>.
</para></entry>
</row>
@@ -28417,7 +28420,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<indexterm>
<primary>pg_create_logical_replication_slot</primary>
</indexterm>
- <function>pg_create_logical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type>, <parameter>plugin</parameter> <type>name</type> <optional>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>twophase</parameter> <type>boolean</type>, <parameter>failover</parameter> <type>boolean</type> </optional> )
+ <function>pg_create_logical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type>, <parameter>plugin</parameter> <type>name</type> <optional>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>twophase</parameter> <type>boolean</type>, <parameter>failover</parameter> <type>boolean</type>, <parameter>inactive_timeout</parameter> <type>integer</type> </optional> )
<returnvalue>record</returnvalue>
( <parameter>slot_name</parameter> <type>name</type>,
<parameter>lsn</parameter> <type>pg_lsn</type> )
@@ -28436,7 +28439,10 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<parameter>failover</parameter>, when set to true,
specifies that this slot is enabled to be synced to the
standbys so that logical replication can be resumed after
- failover. A call to this function has the same effect as
+ failover. The optional sixth parameter,
+ <parameter>inactive_timeout</parameter>, when set to a
+ non-zero value, specifies the amount of time in seconds the slot is
+ allowed to be inactive. A call to this function has the same effect as
the replication protocol command
<literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
</para></entry>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..a6cb13fd9d 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2523,6 +2523,15 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_timeout</structfield> <type>integer</type>
+ </para>
+ <para>
+ The amount of time in seconds the slot is allowed to be inactive.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>inactive_since</structfield> <type>timestamptz</type>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index fe2bb50f46..af27616657 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -469,6 +469,7 @@ AS 'pg_logical_emit_message_bytea';
CREATE OR REPLACE FUNCTION pg_create_physical_replication_slot(
IN slot_name name, IN immediately_reserve boolean DEFAULT false,
IN temporary boolean DEFAULT false,
+ IN inactive_timeout int DEFAULT 0,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
@@ -480,6 +481,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
IN temporary boolean DEFAULT false,
IN twophase boolean DEFAULT false,
IN failover boolean DEFAULT false,
+ IN inactive_timeout int DEFAULT 0,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 401fb35947..7d9d743dd5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,6 +1023,7 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
+ L.inactive_timeout,
L.inactive_since,
L.conflicting,
L.invalidation_reason,
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index bbf9a2c485..79a968373c 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -131,6 +131,7 @@ typedef struct RemoteSlot
char *database;
bool two_phase;
bool failover;
+ int inactive_timeout; /* in seconds */
XLogRecPtr restart_lsn;
XLogRecPtr confirmed_lsn;
TransactionId catalog_xmin;
@@ -168,7 +169,8 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
remote_slot->two_phase == slot->data.two_phase &&
remote_slot->failover == slot->data.failover &&
remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
- strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
+ strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 &&
+ remote_slot->inactive_timeout == slot->data.inactive_timeout)
return false;
/* Avoid expensive operations while holding a spinlock. */
@@ -183,6 +185,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
slot->data.confirmed_flush = remote_slot->confirmed_lsn;
slot->data.catalog_xmin = remote_slot->catalog_xmin;
slot->effective_catalog_xmin = remote_slot->catalog_xmin;
+ slot->data.inactive_timeout = remote_slot->inactive_timeout;
SpinLockRelease(&slot->mutex);
if (xmin_changed)
@@ -608,7 +611,8 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
- true);
+ true,
+ remote_slot->inactive_timeout);
/* For shorter lines. */
slot = MyReplicationSlot;
@@ -653,9 +657,9 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
static bool
synchronize_slots(WalReceiverConn *wrconn)
{
-#define SLOTSYNC_COLUMN_COUNT 9
+#define SLOTSYNC_COLUMN_COUNT 10
Oid slotRow[SLOTSYNC_COLUMN_COUNT] = {TEXTOID, TEXTOID, LSNOID,
- LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID};
+ LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID, INT4OID};
WalRcvExecResult *res;
TupleTableSlot *tupslot;
@@ -664,7 +668,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, invalidation_reason"
+ " database, invalidation_reason, inactive_timeout"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
@@ -744,6 +748,9 @@ synchronize_slots(WalReceiverConn *wrconn)
remote_slot->invalidated = isnull ? RS_INVAL_NONE :
GetSlotInvalidationCause(TextDatumGetCString(d));
+ remote_slot->inactive_timeout = DatumGetInt32(slot_getattr(tupslot, ++col,
+ &isnull));
+
/* Sanity check */
Assert(col == SLOTSYNC_COLUMN_COUNT);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d0a2f440ef..bc7424bac3 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -129,7 +129,7 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 5 /* version for new files */
+#define SLOT_VERSION 6 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -304,11 +304,14 @@ ReplicationSlotValidateName(const char *name, int elevel)
* failover: If enabled, allows the slot to be synced to standbys so
* that logical replication can be resumed after failover.
* synced: True if the slot is synchronized from the primary server.
+ * inactive_timeout: The amount of time in seconds the slot is allowed to be
+ * inactive.
*/
void
ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover, bool synced)
+ bool two_phase, bool failover, bool synced,
+ int inactive_timeout)
{
ReplicationSlot *slot = NULL;
int i;
@@ -398,6 +401,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
slot->data.synced = synced;
+ slot->data.inactive_timeout = inactive_timeout;
/* and then data only present in shared memory */
slot->just_dirtied = false;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index da57177c25..6e1d8d1f9a 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -38,14 +38,15 @@
*/
static void
create_physical_replication_slot(char *name, bool immediately_reserve,
- bool temporary, XLogRecPtr restart_lsn)
+ bool temporary, int inactive_timeout,
+ XLogRecPtr restart_lsn)
{
Assert(!MyReplicationSlot);
/* acquire replication slot, this will check for conflicting names */
ReplicationSlotCreate(name, false,
temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
- false, false);
+ false, false, inactive_timeout);
if (immediately_reserve)
{
@@ -71,6 +72,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
Name name = PG_GETARG_NAME(0);
bool immediately_reserve = PG_GETARG_BOOL(1);
bool temporary = PG_GETARG_BOOL(2);
+ int inactive_timeout = PG_GETARG_INT32(3); /* in seconds */
Datum values[2];
bool nulls[2];
TupleDesc tupdesc;
@@ -84,9 +86,15 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
CheckSlotRequirements();
+ if (inactive_timeout < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"inactive_timeout\" must not be negative")));
+
create_physical_replication_slot(NameStr(*name),
immediately_reserve,
temporary,
+ inactive_timeout,
InvalidXLogRecPtr);
values[0] = NameGetDatum(&MyReplicationSlot->data.name);
@@ -120,7 +128,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
static void
create_logical_replication_slot(char *name, char *plugin,
bool temporary, bool two_phase,
- bool failover,
+ bool failover, int inactive_timeout,
XLogRecPtr restart_lsn,
bool find_startpoint)
{
@@ -138,7 +146,7 @@ create_logical_replication_slot(char *name, char *plugin,
*/
ReplicationSlotCreate(name, true,
temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
- failover, false);
+ failover, false, inactive_timeout);
/*
* Create logical decoding context to find start point or, if we don't
@@ -177,6 +185,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
bool temporary = PG_GETARG_BOOL(2);
bool two_phase = PG_GETARG_BOOL(3);
bool failover = PG_GETARG_BOOL(4);
+ int inactive_timeout = PG_GETARG_INT32(5); /* in seconds */
Datum result;
TupleDesc tupdesc;
HeapTuple tuple;
@@ -190,11 +199,17 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
CheckLogicalDecodingRequirements();
+ if (inactive_timeout < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"inactive_timeout\" must not be negative")));
+
create_logical_replication_slot(NameStr(*name),
NameStr(*plugin),
temporary,
two_phase,
failover,
+ inactive_timeout,
InvalidXLogRecPtr,
true);
@@ -239,7 +254,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 19
+#define PG_GET_REPLICATION_SLOTS_COLS 20
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -410,6 +425,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
+ values[i++] = Int32GetDatum(slot_contents.data.inactive_timeout);
+
if (slot_contents.inactive_since > 0)
values[i++] = TimestampTzGetDatum(slot_contents.inactive_since);
else
@@ -720,6 +737,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
XLogRecPtr src_restart_lsn;
bool src_islogical;
bool temporary;
+ int inactive_timeout; /* in seconds */
char *plugin;
Datum values[2];
bool nulls[2];
@@ -776,6 +794,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
src_restart_lsn = first_slot_contents.data.restart_lsn;
temporary = (first_slot_contents.data.persistency == RS_TEMPORARY);
plugin = logical_slot ? NameStr(first_slot_contents.data.plugin) : NULL;
+ inactive_timeout = first_slot_contents.data.inactive_timeout;
/* Check type of replication slot */
if (src_islogical != logical_slot)
@@ -823,6 +842,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
temporary,
false,
false,
+ inactive_timeout,
src_restart_lsn,
false);
}
@@ -830,6 +850,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
create_physical_replication_slot(NameStr(*dst_name),
true,
temporary,
+ inactive_timeout,
src_restart_lsn);
/*
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bc40c454de..5315c08650 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1221,7 +1221,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false);
+ false, false, false, 0);
if (reserve_wal)
{
@@ -1252,7 +1252,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false);
+ two_phase, failover, false, 0);
/*
* Do options check early so that we can bail before calling the
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 95c22a7200..12626987f0 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -676,7 +676,8 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid, "
+ "inactive_timeout "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
@@ -696,6 +697,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
int i_failover;
int i_caught_up;
int i_invalid;
+ int i_inactive_timeout;
slotinfos = (LogicalSlotInfo *) pg_malloc(sizeof(LogicalSlotInfo) * num_slots);
@@ -705,6 +707,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
i_failover = PQfnumber(res, "failover");
i_caught_up = PQfnumber(res, "caught_up");
i_invalid = PQfnumber(res, "invalid");
+ i_inactive_timeout = PQfnumber(res, "inactive_timeout");
for (int slotnum = 0; slotnum < num_slots; slotnum++)
{
@@ -716,6 +719,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
curr->failover = (strcmp(PQgetvalue(res, slotnum, i_failover), "t") == 0);
curr->caught_up = (strcmp(PQgetvalue(res, slotnum, i_caught_up), "t") == 0);
curr->invalid = (strcmp(PQgetvalue(res, slotnum, i_invalid), "t") == 0);
+ curr->inactive_timeout = atooid(PQgetvalue(res, slotnum, i_inactive_timeout));
}
}
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index f6143b6bc4..2656056103 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -931,9 +931,10 @@ create_logical_replication_slots(void)
appendPQExpBuffer(query, ", ");
appendStringLiteralConn(query, slot_info->plugin, conn);
- appendPQExpBuffer(query, ", false, %s, %s);",
+ appendPQExpBuffer(query, ", false, %s, %s, %d);",
slot_info->two_phase ? "true" : "false",
- slot_info->failover ? "true" : "false");
+ slot_info->failover ? "true" : "false",
+ slot_info->inactive_timeout);
PQclear(executeQueryOrDie(conn, "%s", query->data));
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 92bcb693fb..eb86d000b1 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -162,6 +162,8 @@ typedef struct
bool invalid; /* if true, the slot is unusable */
bool failover; /* is the slot designated to be synced to the
* physical standby? */
+ int inactive_timeout; /* The amount of time in seconds the slot
+ * is allowed to be inactive. */
} LogicalSlotInfo;
typedef struct
diff --git a/src/bin/pg_upgrade/t/003_logical_slots.pl b/src/bin/pg_upgrade/t/003_logical_slots.pl
index 83d71c3084..6e82d2cb7b 100644
--- a/src/bin/pg_upgrade/t/003_logical_slots.pl
+++ b/src/bin/pg_upgrade/t/003_logical_slots.pl
@@ -153,14 +153,17 @@ like(
# TEST: Successful upgrade
# Preparations for the subsequent test:
-# 1. Setup logical replication (first, cleanup slots from the previous tests)
+# 1. Setup logical replication (first, cleanup slots from the previous tests,
+# and then create slot for this test with inactive_timeout set).
my $old_connstr = $oldpub->connstr . ' dbname=postgres';
+my $inactive_timeout = 3600;
$oldpub->start;
$oldpub->safe_psql(
'postgres', qq[
SELECT * FROM pg_drop_replication_slot('test_slot1');
SELECT * FROM pg_drop_replication_slot('test_slot2');
+ SELECT pg_create_logical_replication_slot(slot_name := 'regress_sub', plugin := 'pgoutput', inactive_timeout := $inactive_timeout);
CREATE PUBLICATION regress_pub FOR ALL TABLES;
]);
@@ -172,7 +175,7 @@ $sub->start;
$sub->safe_psql(
'postgres', qq[
CREATE TABLE tbl (a int);
- CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (two_phase = 'true', failover = 'true')
+ CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (slot_name = 'regress_sub', create_slot = false, two_phase = 'true', failover = 'true')
]);
$sub->wait_for_subscription_sync($oldpub, 'regress_sub');
@@ -192,8 +195,8 @@ command_ok([@pg_upgrade_cmd], 'run of pg_upgrade of old cluster');
# Check that the slot 'regress_sub' has migrated to the new cluster
$newpub->start;
my $result = $newpub->safe_psql('postgres',
- "SELECT slot_name, two_phase, failover FROM pg_replication_slots");
-is($result, qq(regress_sub|t|t), 'check the slot exists on new cluster');
+ "SELECT slot_name, two_phase, failover, inactive_timeout = $inactive_timeout FROM pg_replication_slots");
+is($result, qq(regress_sub|t|t|t), 'check the slot exists on new cluster');
# Update the connection
my $new_connstr = $newpub->connstr . ' dbname=postgres';
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2f7cfc02c6..ea4ffb509a 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11105,10 +11105,10 @@
# replication slots
{ oid => '3779', descr => 'create a physical replication slot',
proname => 'pg_create_physical_replication_slot', provolatile => 'v',
- proparallel => 'u', prorettype => 'record', proargtypes => 'name bool bool',
- proallargtypes => '{name,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,o,o}',
- proargnames => '{slot_name,immediately_reserve,temporary,slot_name,lsn}',
+ proparallel => 'u', prorettype => 'record', proargtypes => 'name bool bool int4',
+ proallargtypes => '{name,bool,bool,int4,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,o,o}',
+ proargnames => '{slot_name,immediately_reserve,temporary,inactive_timeout,slot_name,lsn}',
prosrc => 'pg_create_physical_replication_slot' },
{ oid => '4220',
descr => 'copy a physical replication slot, changing temporality',
@@ -11133,17 +11133,17 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,timestamptz,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,inactive_since,conflicting,invalidation_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,int4,timestamptz,bool,text,bool,bool}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,inactive_timeout,inactive_since,conflicting,invalidation_reason,failover,synced}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u', prorettype => 'record',
- proargtypes => 'name name bool bool bool',
- proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,i,i,o,o}',
- proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
+ proargtypes => 'name name bool bool bool int4',
+ proallargtypes => '{name,name,bool,bool,bool,int4,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,i,i,o,o}',
+ proargnames => '{slot_name,plugin,temporary,twophase,failover,inactive_timeout,slot_name,lsn}',
prosrc => 'pg_create_logical_replication_slot' },
{ oid => '4222',
descr => 'copy a logical replication slot, changing temporality and plugin',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7b937d1a0c..5a812ef528 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -127,6 +127,9 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* The amount of time in seconds the slot is allowed to be inactive */
+ int inactive_timeout;
} ReplicationSlotPersistentData;
/*
@@ -239,7 +242,7 @@ extern void ReplicationSlotsShmemInit(void);
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
bool two_phase, bool failover,
- bool synced);
+ bool synced, int inactive_timeout);
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index e7c33c0066..a6fb0f1040 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -152,8 +152,9 @@ log_min_messages = 'debug2'
$primary->append_conf('postgresql.conf', "log_min_messages = 'debug2'");
$primary->reload;
+my $inactive_timeout = 3600;
$primary->psql('postgres',
- q{SELECT pg_create_logical_replication_slot('lsub2_slot', 'test_decoding', false, false, true);}
+ "SELECT pg_create_logical_replication_slot('lsub2_slot', 'test_decoding', false, false, true, $inactive_timeout);"
);
$primary->psql('postgres',
@@ -202,6 +203,16 @@ is( $standby1->safe_psql(
capture_and_validate_slot_inactive_since($standby1, 'lsub1_slot', $slots_creation_time);
capture_and_validate_slot_inactive_since($standby1, 'lsub2_slot', $slots_creation_time);
+# Confirm that the synced slot on the standby has got inactive_timeout from the
+# primary.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT inactive_timeout = $inactive_timeout FROM pg_replication_slots
+ WHERE slot_name = 'lsub2_slot' AND synced AND NOT temporary;"
+ ),
+ "t",
+ 'synced logical slot has got inactive_timeout on standby');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f53c3036a6..7f3b70f598 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,12 +1473,13 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
+ l.inactive_timeout,
l.inactive_since,
l.conflicting,
l.invalidation_reason,
l.failover,
l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, inactive_since, conflicting, invalidation_reason, failover, synced)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, inactive_timeout, inactive_since, conflicting, invalidation_reason, failover, synced)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
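For illustration, a minimal usage sketch of the SQL API added by this
patch (slot names are placeholders; it assumes the v22-0002 patch is
applied, and the inactive_timeout argument is in seconds):

-- Physical slot that may stay inactive for at most one hour
SELECT pg_create_physical_replication_slot(
    slot_name := 'my_phy_slot',
    immediately_reserve := true,
    inactive_timeout := 3600);

-- Logical slot with the same timeout
SELECT pg_create_logical_replication_slot(
    slot_name := 'my_log_slot',
    plugin := 'test_decoding',
    inactive_timeout := 3600);

-- The per-slot setting is exposed via pg_replication_slots
SELECT slot_name, slot_type, inactive_timeout
FROM pg_replication_slots ORDER BY slot_name;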
v22-0003-Add-inactive_timeout-based-replication-slot-inva.patch
From 0701c1bc0d61a812e052697d81f49cb9ca23f194 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 26 Mar 2024 08:54:13 +0000
Subject: [PATCH v22 3/3] Add inactive_timeout based replication slot
invalidation.
Till now, postgres has had the ability to invalidate inactive
replication slots based on the amount of WAL (set via the
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky: the amount of WAL a customer
generates and their allocated storage vary greatly in production,
making it difficult to pin down a one-size-fits-all value. It is
often easier for developers to set a timeout of, say, 1, 2 or 3
days at the slot level, after which the inactive slots get
invalidated.
To achieve this, postgres uses the replication slot property
inactive_since (the time at which the slot became inactive) and
the new slot-level parameter inactive_timeout, and invalidates
the slot based on this new mechanism.
The invalidation check happens at multiple locations so that the
invalidation is detected as early as possible:
- Whenever the slot is acquired; if the slot gets invalidated due
to this new mechanism, an error is emitted.
- During checkpoint.
Note that this new invalidation mechanism won't kick in for slots
that are currently being synced from the primary to the standby.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/func.sgml | 8 +-
doc/src/sgml/system-views.sgml | 10 +-
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 176 +++++++++++++++++-
src/backend/replication/slotfuncs.c | 12 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/bin/pg_upgrade/pg_upgrade.h | 3 +-
src/include/replication/slot.h | 8 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 169 +++++++++++++++++
12 files changed, 380 insertions(+), 19 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2cc26e927a..fb1640ae12 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28393,8 +28393,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
released upon any error. The optional fourth
parameter, <parameter>inactive_timeout</parameter>, when set to a
non-zero value, specifies the amount of time in seconds the slot is
- allowed to be inactive. This function corresponds to the replication
- protocol command
+ allowed to be inactive before getting invalidated.
+ This function corresponds to the replication protocol command
<literal>CREATE_REPLICATION_SLOT ... PHYSICAL</literal>.
</para></entry>
</row>
@@ -28442,8 +28442,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
failover. The optional sixth parameter,
<parameter>inactive_timeout</parameter>, when set to a
non-zero value, specifies the amount of time in seconds the slot is
- allowed to be inactive. A call to this function has the same effect as
- the replication protocol command
+ allowed to be inactive before getting invalidated. A call to this
+ function has the same effect as the replication protocol command
<literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
</para></entry>
</row>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index a6cb13fd9d..3b09838a0b 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2528,7 +2528,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<structfield>inactive_timeout</structfield> <type>integer</type>
</para>
<para>
- The amount of time in seconds the slot is allowed to be inactive.
+ The amount of time in seconds the slot is allowed to be inactive
+ before getting invalidated.
</para></entry>
</row>
@@ -2582,6 +2583,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+        inactive for the duration specified by the slot's
+ <literal>inactive_timeout</literal> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 79a968373c..e94ac0f13f 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -320,7 +320,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -530,7 +530,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
* if the slot is not acquired by other processes.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index bc7424bac3..0d7f2c0f50 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -158,6 +159,8 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidateSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_locks);
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
@@ -305,7 +308,7 @@ ReplicationSlotValidateName(const char *name, int elevel)
* that logical replication can be resumed after failover.
* synced: True if the slot is synchronized from the primary server.
* inactive_timeout: The amount of time in seconds the slot is allowed to be
- * inactive.
+ * inactive before getting invalidated.
*/
void
ReplicationSlotCreate(const char *name, bool db_specific,
@@ -539,9 +542,14 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * If check_for_invalidation is true, the slot is checked for invalidation
+ * based on its inactive_timeout parameter and an error is raised after making
+ * the slot ours.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation)
{
ReplicationSlot *s;
int active_pid;
@@ -619,6 +627,42 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * Check if the given slot can be invalidated based on its
+ * inactive_timeout parameter. If yes, persist the invalidated state to
+ * disk and then error out. We do this only after making the slot ours to
+ * avoid anyone else acquiring it while we check for its invalidation.
+ */
+ if (check_for_invalidation)
+ {
+ /* The slot is ours by now */
+ Assert(s->active_pid == MyProcPid);
+
+ /*
+ * Well, the slot is not yet ours really unless we check for the
+ * invalidation below.
+ */
+ s->active_pid = 0;
+ if (InvalidateReplicationSlotForInactiveTimeout(s, true, true))
+ {
+ /*
+ * If the slot has been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+
+ /* Might need it for slot clean up on error, so restore it */
+ s->active_pid = MyProcPid;
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot acquire invalidated replication slot \"%s\"",
+ NameStr(MyReplicationSlot->data.name)),
+ errdetail("This slot has been invalidated because of its inactive_timeout parameter.")));
+ }
+ s->active_pid = MyProcPid;
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -786,7 +830,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -809,7 +853,7 @@ ReplicationSlotAlter(const char *name, bool failover)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1511,6 +1555,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+			appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by the slot's inactive_timeout parameter."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1624,6 +1671,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (InvalidateReplicationSlotForInactiveTimeout(s, false, false))
+ invalidation_cause = cause;
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1777,6 +1828,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1828,6 +1880,105 @@ restart:
return invalidated;
}
+/*
+ * Invalidate given slot based on its inactive_timeout parameter.
+ *
+ * Returns true if the slot has got invalidated.
+ *
+ * NB - this function also runs as part of checkpoint, so avoid raising errors
+ * if possible.
+ */
+bool
+InvalidateReplicationSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_locks,
+ bool persist_state)
+{
+ if (!InvalidateSlotForInactiveTimeout(slot, need_locks))
+ return false;
+
+ Assert(slot->active_pid == 0);
+
+ SpinLockAcquire(&slot->mutex);
+ slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT;
+
+ /* Make sure the invalidated state persists across server restart */
+ slot->just_dirtied = true;
+ slot->dirty = true;
+ SpinLockRelease(&slot->mutex);
+
+ if (persist_state)
+ {
+ char path[MAXPGPATH];
+
+ sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+ SaveSlotToPath(slot, path, ERROR);
+ }
+
+ ReportSlotInvalidation(RS_INVAL_INACTIVE_TIMEOUT, false, 0,
+ slot->data.name, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, InvalidTransactionId);
+
+ return true;
+}
+
+/*
+ * Helper for InvalidateReplicationSlotForInactiveTimeout
+ */
+static bool
+InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks)
+{
+	ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+
+ if (slot->inactive_since == 0 ||
+ slot->data.inactive_timeout == 0)
+ return false;
+
+ /*
+ * Do not invalidate the slots which are currently being synced from the
+ * primary to the standby.
+ */
+ if (RecoveryInProgress() && slot->data.synced)
+ return false;
+
+ if (need_locks)
+ {
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ /*
+ * Check if the slot needs to be invalidated due to inactive_timeout.
+ * We do this with the spinlock held to avoid race conditions -- for
+ * example the restart_lsn could move forward, or the slot could be
+ * dropped.
+ */
+
+ SpinLockAcquire(&slot->mutex);
+ }
+
+ Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
+
+ if (slot->inactive_since > 0 &&
+ slot->data.inactive_timeout > 0)
+ {
+ TimestampTz now;
+
+ /* inactive_since is only tracked for inactive slots */
+ Assert(slot->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(slot->inactive_since, now,
+ slot->data.inactive_timeout * 1000))
+			invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
+ }
+
+ if (need_locks)
+ {
+ SpinLockRelease(&slot->mutex);
+ LWLockRelease(ReplicationSlotControlLock);
+ }
+
+	return (invalidation_cause == RS_INVAL_INACTIVE_TIMEOUT);
+}
+
/*
* Flush all replication slots to disk.
*
@@ -1840,6 +1991,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated = false;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -1863,6 +2015,13 @@ CheckPointReplicationSlots(bool is_shutdown)
/* save the slot to disk, locking is handled in SaveSlotToPath() */
sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
+ /*
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
+ */
+ if (InvalidateReplicationSlotForInactiveTimeout(s, true, false))
+ invalidated = true;
+
/*
* Slot's data is not flushed each time the confirmed_flush LSN is
* updated as that could lead to frequent writes. However, we decide
@@ -1889,6 +2048,13 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ /* If the slot has been invalidated, recalculate the resource limits */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 6e1d8d1f9a..4ea4db0f87 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -258,6 +258,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
+ bool invalidated = false;
/*
* We don't require any special permission to see this function's data
@@ -466,6 +467,15 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
LWLockRelease(ReplicationSlotControlLock);
+ /*
+ * If the slot has been invalidated, recalculate the resource limits
+ */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
+
return (Datum) 0;
}
@@ -668,7 +678,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 5315c08650..7dda2f5a66 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -846,7 +846,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1459,7 +1459,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index eb86d000b1..38d105c5d6 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -163,7 +163,8 @@ typedef struct
bool failover; /* is the slot designated to be synced to the
* physical standby? */
int inactive_timeout; /* The amount of time in seconds the slot
- * is allowed to be inactive. */
+ * is allowed to be inactive before
+ * getting invalidated. */
} LogicalSlotInfo;
typedef struct
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 5a812ef528..75b0bad083 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -248,7 +250,8 @@ extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, bool failover);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(void);
extern void ReplicationSlotSave(void);
@@ -267,6 +270,9 @@ extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
+extern bool InvalidateReplicationSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_locks,
+ bool persist_state);
extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock);
extern int ReplicationSlotIndex(ReplicationSlot *slot);
extern bool ReplicationSlotName(int index, Name name);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..b906d240a5
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,169 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# Check for invalidation of slot in server log.
+sub check_slots_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"", $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated, "check that slot $slot_name invalidation has been logged");
+}
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot due to inactive_timeout
+#
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoint during the test, otherwise, the test can get unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+$standby1->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+});
+
+# Set timeout so that the slot when inactive will get invalidated after the
+# timeout.
+my $inactive_timeout = 5;
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot', inactive_timeout := $inactive_timeout);
+]);
+
+$standby1->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Check inactive_timeout is what we've set above
+my $result = $primary->safe_psql(
+ 'postgres', qq[
+ SELECT inactive_timeout = $inactive_timeout
+ FROM pg_replication_slots WHERE slot_name = 'sb1_slot';
+]);
+is($result, "t",
+ 'check the inactive replication slot info for an active slot');
+
+my $logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby1->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = 'sb1_slot'
+ AND inactive_timeout = $inactive_timeout;
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+check_slots_invalidation_in_server_log($primary, 'sb1_slot', $logstart);
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for inactive replication slot sb1_slot to be invalidated";
+
+# Testcase end: Invalidate streaming standby's slot due to inactive_timeout
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to inactive_timeout
+my $publisher = $primary;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$subscriber->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput', inactive_timeout := $inactive_timeout);
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+$result = $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the inactive replication slot info to be updated
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = 'lsub1_slot'
+ AND inactive_timeout = $inactive_timeout;
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+check_slots_invalidation_in_server_log($publisher, 'lsub1_slot', $logstart);
+
+# Testcase end: Invalidate logical subscriber's slot due to inactive_timeout
+# =============================================================================
+
+done_testing();
--
2.34.1
On Tue, Mar 26, 2024 at 2:27 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
1)
Commit msg: "ensures the value is set to current timestamp during the
shutdown to help correctly interpret the time if the standby gets
promoted without a restart."
shutdown --> shutdown of slot sync worker (as it was not clear if it
is instance shutdown or something else)
Changed it to "shutdown of slot sync machinery" to be consistent with
the comments.
Thanks for addressing the comments. Just to give more clarity here (so
that you take an informed decision), I am not sure if we actually shut
down the slot-sync machinery; we only shut down the slot sync worker.
The slot-sync machinery can still be used via the
'pg_sync_replication_slots' SQL function. I can easily reproduce a
scenario where the SQL function and reset_synced_slots_info() run
in parallel and the latter hits 'Assert(s->active_pid == 0)' because
a parallel SQL sync function call is active on that slot.
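For illustration, a rough sketch (not from the posted patches) of how the
shutdown-time update could simply skip a slot that a concurrent
pg_sync_replication_slots() call currently holds, instead of asserting that
active_pid is always 0:

static void
reset_synced_slots_info(void)
{
    TimestampTz now = GetCurrentTimestamp();

    LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);

    for (int i = 0; i < max_replication_slots; i++)
    {
        ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];

        if (!s->in_use || !s->data.synced)
            continue;

        Assert(SlotIsLogical(s));

        SpinLockAcquire(&s->mutex);

        /* Only stamp slots that nobody (e.g. the SQL sync function) holds */
        if (s->active_pid == 0)
            s->inactive_since = now;

        SpinLockRelease(&s->mutex);
    }

    LWLockRelease(ReplicationSlotControlLock);
}

Whether skipping such a slot is the right behaviour, versus synchronizing
with the SQL path, is a separate question; the sketch only avoids the
assertion failure.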
thanks
Shveta
Hi,
On Tue, Mar 26, 2024 at 02:27:17PM +0530, Bharath Rupireddy wrote:
Please use the v22 patch set.
Thanks!
1 ===
+reset_synced_slots_info(void)
I'm not sure "reset" is the right word, what about slot_sync_shutdown_update()?
2 ===
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is a synchronized slot */
+ if (s->in_use && s->data.synced)
+ {
+ TimestampTz now;
+
+ Assert(SlotIsLogical(s));
+ Assert(s->active_pid == 0);
+
+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart. We get the
+ * current time beforehand to avoid a system call while holding
+ * the lock.
+ */
+ now = GetCurrentTimestamp();
What about moving "now = GetCurrentTimestamp()" outside of the for loop? (it
would be less costly and probably good enough).
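For reference, a minimal sketch of that restructuring, assuming the loop body
otherwise stays as quoted above:

    TimestampTz now = GetCurrentTimestamp();    /* once, before the loop */

    for (int i = 0; i < max_replication_slots; i++)
    {
        ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];

        /* Check if it is a synchronized slot */
        if (s->in_use && s->data.synced)
        {
            Assert(SlotIsLogical(s));
            Assert(s->active_pid == 0);

            /* No system call while holding the spinlock */
            SpinLockAcquire(&s->mutex);
            s->inactive_since = now;
            SpinLockRelease(&s->mutex);
        }
    }

An alternative is to initialize now lazily inside the loop, which avoids the
call entirely when there are no synced slots.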
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Mar 26, 2024 at 1:54 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hi,
On Tue, Mar 26, 2024 at 01:37:21PM +0530, Amit Kapila wrote:
On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
2 ===
It looks like inactive_since is set to the current timestamp on the standby
each time the sync worker does a cycle:
primary:
postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't';
  slot_name  |        inactive_since
-------------+-------------------------------
 lsub27_slot | 2024-03-26 07:39:19.745517+00
 lsub28_slot | 2024-03-26 07:40:24.953826+00
standby:
postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't';
  slot_name  |        inactive_since
-------------+-------------------------------
 lsub27_slot | 2024-03-26 07:43:56.387324+00
 lsub28_slot | 2024-03-26 07:43:56.387338+00
I don't think that should be the case.
But why? This is exactly what we discussed in another thread where we
agreed to update inactive_since even for sync slots.
Hum, I thought we agreed to "sync" it and to "update it to current time"
only at promotion time.
I think there may have been some misunderstanding here. But now if I
rethink this, I am fine with 'inactive_since' getting synced from
primary to standby. But if we do that, we need to add docs stating
"inactive_since" represents primary's inactivity and not standby's
slots inactivity for synced slots. The reason for this clarification
is that the synced slot might be generated much later, yet
'inactive_since' is synced from the primary, potentially indicating a
time considerably earlier than when the synced slot was actually
created.
Another approach could be that "inactive_since" for synced slot
actually gives its own inactivity data rather than giving primary's
slot data. We update inactive_since on standby only at 3 occasions:
1) at the time of creation of the synced slot.
2) during standby restart.
3) during promotion of standby.
I have attached a sample patch for this idea as a .txt file.
I am fine with any of these approaches. One gives data synced from
primary for synced slots, while another gives actual inactivity data
of synced slots.
thanks
Shveta
Attachments:
v1-0001-inactive_since-for-synced-slots.patch.txttext/plain; charset=US-ASCII; name=v1-0001-inactive_since-for-synced-slots.patch.txtDownload
From 7dcd0e95299263187eb1f03812f8321b2612ee5c Mon Sep 17 00:00:00 2001
From: Shveta Malik <shveta.malik@gmail.com>
Date: Tue, 26 Mar 2024 14:42:25 +0530
Subject: [PATCH v1] inactive_since for synced slots.
inactive_since is updated for synced slots:
1) at the time of creation of slot.
2) during server restart.
3) during promotion.
---
src/backend/replication/logical/slotsync.c | 1 +
src/backend/replication/slot.c | 15 ++++++++++++---
2 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index bbf9a2c485..6114895dca 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -628,6 +628,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
SpinLockAcquire(&slot->mutex);
slot->effective_catalog_xmin = xmin_horizon;
slot->data.catalog_xmin = xmin_horizon;
+ slot->inactive_since = GetCurrentTimestamp();
SpinLockRelease(&slot->mutex);
ReplicationSlotsComputeRequiredXmin(true);
LWLockRelease(ProcArrayLock);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d0a2f440ef..f2a57a14ec 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -628,7 +628,10 @@ retry:
* now.
*/
SpinLockAcquire(&s->mutex);
- s->inactive_since = 0;
+
+ if (!(RecoveryInProgress() && s->data.synced))
+ s->inactive_since = 0;
+
SpinLockRelease(&s->mutex);
if (am_walsender)
@@ -704,14 +707,20 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+
+ if (!(RecoveryInProgress() && slot->data.synced))
+ slot->inactive_since = now;
+
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
{
SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
+
+ if (!(RecoveryInProgress() && slot->data.synced))
+ slot->inactive_since = now;
+
SpinLockRelease(&slot->mutex);
}
--
2.34.1
Hi,
On Tue, Mar 26, 2024 at 03:17:36PM +0530, shveta malik wrote:
On Tue, Mar 26, 2024 at 1:54 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:Hi,
On Tue, Mar 26, 2024 at 01:37:21PM +0530, Amit Kapila wrote:
On Tue, Mar 26, 2024 at 1:15 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
2 ===
It looks like inactive_since is set to the current timestamp on the standby
each time the sync worker does a cycle:
primary:
postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't';
  slot_name  |        inactive_since
-------------+-------------------------------
 lsub27_slot | 2024-03-26 07:39:19.745517+00
 lsub28_slot | 2024-03-26 07:40:24.953826+00
standby:
postgres=# select slot_name,inactive_since from pg_replication_slots where failover = 't';
  slot_name  |        inactive_since
-------------+-------------------------------
 lsub27_slot | 2024-03-26 07:43:56.387324+00
 lsub28_slot | 2024-03-26 07:43:56.387338+00
I don't think that should be the case.
But why? This is exactly what we discussed in another thread where we
agreed to update inactive_since even for sync slots.
Hum, I thought we agreed to "sync" it and to "update it to current time"
only at promotion time.
I think there may have been some misunderstanding here.
Indeed ;-)
But now if I
rethink this, I am fine with 'inactive_since' getting synced from
primary to standby. But if we do that, we need to add docs stating
"inactive_since" represents primary's inactivity and not standby's
slots inactivity for synced slots.
Yeah sure.
The reason for this clarification
is that the synced slot might be generated much later, yet
'inactive_since' is synced from the primary, potentially indicating a
time considerably earlier than when the synced slot was actually
created.
Right.
Another approach could be that "inactive_since" for synced slot
actually gives its own inactivity data rather than giving primary's
slot data. We update inactive_since on standby only at 3 occasions:
1) at the time of creation of the synced slot.
2) during standby restart.
3) during promotion of standby.
I have attached a sample patch for this idea as a .txt file.
Thanks!
I am fine with any of these approaches. One gives data synced from
primary for synced slots, while another gives actual inactivity data
of synced slots.
What about another approach?: inactive_since gives data synced from primary for
synced slots and another dedicated field (could be added later...) could
represent what you suggest as the other option.
Another cons of updating inactive_since at the current time during each slot
sync cycle is that calling GetCurrentTimestamp() very frequently
(during each sync cycle of very active slots) could be too costly.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Mar 26, 2024 at 3:12 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Tue, Mar 26, 2024 at 02:27:17PM +0530, Bharath Rupireddy wrote:
Please use the v22 patch set.
Thanks!
1 ===
+reset_synced_slots_info(void)
I'm not sure "reset" is the right word, what about slot_sync_shutdown_update()?
*shutdown_update() sounds generic. How about
update_synced_slots_inactive_time()? I think it is a bit longer but
conveys the meaning.
--
With Regards,
Amit Kapila.
On Tue, Mar 26, 2024 at 7:57 PM Bharath Rupireddy <
bharath.rupireddyforpostgres@gmail.com> wrote:
Please see the attached v23 patches. I've addressed all the review
comments received so far from Amit and Shveta.
In patch 0003:
+ SpinLockAcquire(&slot->mutex);
+ }
+
+ Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
+
+ if (slot->inactive_since > 0 &&
+ slot->data.inactive_timeout > 0)
+ {
+ TimestampTz now;
+
+ /* inactive_since is only tracked for inactive slots */
+ Assert(slot->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(slot->inactive_since, now,
+ slot->data.inactive_timeout * 1000))
+ inavidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
+ }
+
+ if (need_locks)
+ {
+ SpinLockRelease(&slot->mutex);
Here, GetCurrentTimestamp() is still called with SpinLock held. Maybe do
this prior to acquiring the spinlock.
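For illustration, the reordering could look roughly like this (a sketch only;
the need_locks handling from the quoted hunk is omitted for brevity):

    TimestampTz now;

    /* Fetch the timestamp before entering the spinlock-protected section */
    now = GetCurrentTimestamp();

    SpinLockAcquire(&slot->mutex);

    if (slot->inactive_since > 0 &&
        slot->data.inactive_timeout > 0)
    {
        /* inactive_since is only tracked for inactive slots */
        Assert(slot->active_pid == 0);

        if (TimestampDifferenceExceeds(slot->inactive_since, now,
                                       slot->data.inactive_timeout * 1000))
            invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
    }

    SpinLockRelease(&slot->mutex);

One downside is that the timestamp is then taken even when the timeout check
does not end up running.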
regards,
Ajin Cherian
Fujitsu Australia
On Tue, Mar 26, 2024 at 3:50 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hi,
I think there may have been some misunderstanding here.
Indeed ;-)
But now if I
rethink this, I am fine with 'inactive_since' getting synced from
primary to standby. But if we do that, we need to add docs stating
"inactive_since" represents primary's inactivity and not standby's
slots inactivity for synced slots.
Yeah sure.
The reason for this clarification
is that the synced slot might be generated much later, yet
'inactive_since' is synced from the primary, potentially indicating a
time considerably earlier than when the synced slot was actually
created.
Right.
Another approach could be that "inactive_since" for synced slot
actually gives its own inactivity data rather than giving primary's
slot data. We update inactive_since on standby only at 3 occasions:
1) at the time of creation of the synced slot.
2) during standby restart.
3) during promotion of standby.
I have attached a sample patch for this idea as a .txt file.
Thanks!
I am fine with any of these approaches. One gives data synced from
primary for synced slots, while another gives actual inactivity data
of synced slots.
What about another approach?: inactive_since gives data synced from primary for
synced slots and another dedicated field (could be added later...) could
represent what you suggest as the other option.
Yes, okay with me. I think there is some confusion here as well. In my
second approach above, I have not suggested anything related to
sync-worker. We can think about that later if we really need another
field which gives us the sync time. In my second approach, I have tried to
avoid updating inactive_since for synced slots during sync process. We
update that field during creation of synced slot so that
inactive_since reflects correct info even for synced slots (rather
than copying from primary). Please have a look at my patch and let me
know your thoughts. I am fine with copying it from primary as well and
documenting this behaviour.
Another cons of updating inactive_since at the current time during each slot
sync cycle is that calling GetCurrentTimestamp() very frequently
(during each sync cycle of very active slots) could be too costly.
Right.
thanks
Shveta
On Tue, Mar 26, 2024 at 4:18 PM shveta malik <shveta.malik@gmail.com> wrote:
What about another approach?: inactive_since gives data synced from primary for
synced slots and another dedicated field (could be added later...) could
represent what you suggest as the other option.
Yes, okay with me. I think there is some confusion here as well. In my
second approach above, I have not suggested anything related to
sync-worker. We can think on that later if we really need another
field which give us sync time. In my second approach, I have tried to
avoid updating inactive_since for synced slots during sync process. We
update that field during creation of synced slot so that
inactive_since reflects correct info even for synced slots (rather
than copying from primary). Please have a look at my patch and let me
know your thoughts. I am fine with copying it from primary as well and
documenting this behaviour.
I took a look at your patch.
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -628,6 +628,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid
remote_dbid)
SpinLockAcquire(&slot->mutex);
slot->effective_catalog_xmin = xmin_horizon;
slot->data.catalog_xmin = xmin_horizon;
+ slot->inactive_since = GetCurrentTimestamp();
SpinLockRelease(&slot->mutex);
If we just sync inactive_since value for synced slots while in
recovery from the primary, so be it. Why do we need to update it to
the current time when the slot is being created? We don't expose slot
creation time, no? Aren't we fine if we just sync the value from
primary and document that fact? After the promotion, we can reset it
to the current time so that it gets its own time. Do you see any
issues with it?
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Tue, Mar 26, 2024 at 4:18 PM shveta malik <shveta.malik@gmail.com> wrote:
What about another approach?: inactive_since gives data synced from primary for
synced slots and another dedicated field (could be added later...) could
represent what you suggest as the other option.
Yes, okay with me. I think there is some confusion here as well. In my
second approach above, I have not suggested anything related to
sync-worker. We can think on that later if we really need another
field which give us sync time. In my second approach, I have tried to
avoid updating inactive_since for synced slots during sync process. We
update that field during creation of synced slot so that
inactive_since reflects correct info even for synced slots (rather
than copying from primary). Please have a look at my patch and let me
know your thoughts. I am fine with copying it from primary as well and
documenting this behaviour.
I took a look at your patch.
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -628,6 +628,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
         SpinLockAcquire(&slot->mutex);
         slot->effective_catalog_xmin = xmin_horizon;
         slot->data.catalog_xmin = xmin_horizon;
+        slot->inactive_since = GetCurrentTimestamp();
         SpinLockRelease(&slot->mutex);
If we just sync inactive_since value for synced slots while in
recovery from the primary, so be it. Why do we need to update it to
the current time when the slot is being created?
If we update inactive_since at synced slot's creation or during
restart (skipping setting it during sync), then this time reflects
actual 'inactive_since' for that particular synced slot. Isn't that
clearer info for the user, and in alignment with what the name
'inactive_since' actually suggests?
We don't expose slot
creation time, no?
No, we don't. But for synced slot, that is the time since that slot is
inactive (unless promoted), so we are exposing inactive_since and not
creation time.
Aren't we fine if we just sync the value from
primary and document that fact? After the promotion, we can reset it
to the current time so that it gets its own time. Do you see any
issues with it?
Yes, we can do that. But curious to know, do we see any additional
benefit of reflecting primary's inactive_since at standby which I
might be missing?
thanks
Shveta
Hi,
On Tue, Mar 26, 2024 at 04:49:18PM +0530, shveta malik wrote:
On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Tue, Mar 26, 2024 at 4:18 PM shveta malik <shveta.malik@gmail.com> wrote:
What about another approach?: inactive_since gives data synced from primary for
synced slots and another dedicated field (could be added later...) could
represent what you suggest as the other option.
Yes, okay with me. I think there is some confusion here as well. In my
second approach above, I have not suggested anything related to
sync-worker. We can think on that later if we really need another
field which give us sync time. In my second approach, I have tried to
avoid updating inactive_since for synced slots during sync process. We
update that field during creation of synced slot so that
inactive_since reflects correct info even for synced slots (rather
than copying from primary). Please have a look at my patch and let me
know your thoughts. I am fine with copying it from primary as well and
documenting this behaviour.
I took a look at your patch.
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -628,6 +628,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
         SpinLockAcquire(&slot->mutex);
         slot->effective_catalog_xmin = xmin_horizon;
         slot->data.catalog_xmin = xmin_horizon;
+        slot->inactive_since = GetCurrentTimestamp();
         SpinLockRelease(&slot->mutex);
If we just sync inactive_since value for synced slots while in
recovery from the primary, so be it. Why do we need to update it to
the current time when the slot is being created?
If we update inactive_since at synced slot's creation or during
restart (skipping setting it during sync), then this time reflects
actual 'inactive_since' for that particular synced slot. Isn't that a
clear info for the user and in alignment of what the name
'inactive_since' actually suggests?
We don't expose slot
creation time, no?
No, we don't. But for synced slot, that is the time since that slot is
inactive (unless promoted), so we are exposing inactive_since and not
creation time.
Aren't we fine if we just sync the value from
primary and document that fact? After the promotion, we can reset it
to the current time so that it gets its own time. Do you see any
issues with it?
Yes, we can do that. But curious to know, do we see any additional
benefit of reflecting primary's inactive_since at standby which I
might be missing?
In case the primary goes down, then one could use the value on the standby
to get the value coming from the primary. I think that could be useful info to
have.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Tue, Mar 26, 2024 at 04:17:53PM +0530, shveta malik wrote:
On Tue, Mar 26, 2024 at 3:50 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hi,
I think there may have been some misunderstanding here.
Indeed ;-)
But now if I
rethink this, I am fine with 'inactive_since' getting synced from
primary to standby. But if we do that, we need to add docs stating
"inactive_since" represents primary's inactivity and not standby's
slots inactivity for synced slots.
Yeah sure.
The reason for this clarification
is that the synced slot might be generated much later, yet
'inactive_since' is synced from the primary, potentially indicating a
time considerably earlier than when the synced slot was actually
created.
Right.
Another approach could be that "inactive_since" for synced slot
actually gives its own inactivity data rather than giving primary's
slot data. We update inactive_since on standby only at 3 occasions:
1) at the time of creation of the synced slot.
2) during standby restart.
3) during promotion of standby.
I have attached a sample patch for this idea as a .txt file.
Thanks!
I am fine with any of these approaches. One gives data synced from
primary for synced slots, while another gives actual inactivity data
of synced slots.
What about another approach?: inactive_since gives data synced from primary for
synced slots and another dedicated field (could be added later...) could
represent what you suggest as the other option.
Yes, okay with me. I think there is some confusion here as well. In my
second approach above, I have not suggested anything related to
sync-worker.
Yeah, no confusion, understood that way.
We can think on that later if we really need another
field which give us sync time.
I think that calling GetCurrentTimestamp() so frequently could be too costly, so
I'm not sure we should.
In my second approach, I have tried to
avoid updating inactive_since for synced slots during sync process. We
update that field during creation of synced slot so that
inactive_since reflects correct info even for synced slots (rather
than copying from primary).
Yeah, and I think we could create a dedicated field with this information
if we feel the need.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
If we just sync inactive_since value for synced slots while in
recovery from the primary, so be it. Why do we need to update it to
the current time when the slot is being created? We don't expose slot
creation time, no? Aren't we fine if we just sync the value from
primary and document that fact? After the promotion, we can reset it
to the current time so that it gets its own time.
I'm attaching v24 patches. It implements the above idea proposed
upthread for synced slots. I've now separated the
s/last_inactive_time/inactive_since renaming and the synced slots behaviour
into separate patches. Please have a look.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v24-0001-Use-less-confusing-name-for-slot-s-last_inactive.patchapplication/octet-stream; name=v24-0001-Use-less-confusing-name-for-slot-s-last_inactive.patchDownload
From 942b8742f108fadf776e3ae1f6fe800878d27b55 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 26 Mar 2024 16:17:55 +0000
Subject: [PATCH v24 1/4] Use less confusing name for slot's last_inactive_time
property.
The slot's last_inactive_time property added by commit a11f330b55
seems confusing. With last_inactive_time one expects it to tell
the last time that the slot was inactive. But, it tells the last
time that a currently-inactive slot previously *WAS* active.
This commit uses a less confusing and better name for the property
called inactive_since. Other names considered were released_time and
deactivated_at, but inactive_since won the race since the word
inactive is predominant as far as the replication slots are
concerned.
Reported-by: Robert Haas
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Discussion: https://www.postgresql.org/message-id/ZgGrCBQoktdLi1Ir%40ip-10-97-1-34.eu-west-3.compute.internal
---
doc/src/sgml/system-views.sgml | 4 +-
src/backend/catalog/system_views.sql | 2 +-
src/backend/replication/slot.c | 17 ++++---
src/backend/replication/slotfuncs.c | 4 +-
src/include/catalog/pg_proc.dat | 2 +-
src/include/replication/slot.h | 4 +-
src/test/recovery/t/019_replslot_limit.pl | 62 +++++++++++------------
src/test/regress/expected/rules.out | 4 +-
8 files changed, 51 insertions(+), 48 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 5f4165a945..3c8dca8ca3 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,10 +2525,10 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>last_inactive_time</structfield> <type>timestamptz</type>
+ <structfield>inactive_since</structfield> <type>timestamptz</type>
</para>
<para>
- The time at which the slot became inactive.
+ The time since the slot has become inactive.
<literal>NULL</literal> if the slot is currently being used.
</para></entry>
</row>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index bc70ff193e..401fb35947 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
- L.last_inactive_time,
+ L.inactive_since,
L.conflicting,
L.invalidation_reason,
L.failover,
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 45f7a28f7d..d778c0b921 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -409,7 +409,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->candidate_restart_valid = InvalidXLogRecPtr;
slot->candidate_restart_lsn = InvalidXLogRecPtr;
slot->last_saved_confirmed_flush = InvalidXLogRecPtr;
- slot->last_inactive_time = 0;
+ slot->inactive_since = 0;
/*
* Create the slot on disk. We haven't actually marked the slot allocated
@@ -623,9 +623,12 @@ retry:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
- /* Reset the last inactive time as the slot is active now. */
+ /*
+ * Reset the time since the slot has become inactive as the slot is active
+ * now.
+ */
SpinLockAcquire(&s->mutex);
- s->last_inactive_time = 0;
+ s->inactive_since = 0;
SpinLockRelease(&s->mutex);
if (am_walsender)
@@ -703,14 +706,14 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->last_inactive_time = now;
+ slot->inactive_since = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
{
SpinLockAcquire(&slot->mutex);
- slot->last_inactive_time = now;
+ slot->inactive_since = now;
SpinLockRelease(&slot->mutex);
}
@@ -2373,9 +2376,9 @@ RestoreSlotFromDisk(const char *name)
* inactive as decoding is not allowed on those.
*/
if (!(RecoveryInProgress() && slot->data.synced))
- slot->last_inactive_time = GetCurrentTimestamp();
+ slot->inactive_since = GetCurrentTimestamp();
else
- slot->last_inactive_time = 0;
+ slot->inactive_since = 0;
restored = true;
break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 24f5e6d90a..da57177c25 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -410,8 +410,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
- if (slot_contents.last_inactive_time > 0)
- values[i++] = TimestampTzGetDatum(slot_contents.last_inactive_time);
+ if (slot_contents.inactive_since > 0)
+ values[i++] = TimestampTzGetDatum(slot_contents.inactive_since);
else
nulls[i++] = true;
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 0d26e5b422..2f7cfc02c6 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11135,7 +11135,7 @@
proargtypes => '',
proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,timestamptz,bool,text,bool,bool}',
proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,last_inactive_time,conflicting,invalidation_reason,failover,synced}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,inactive_since,conflicting,invalidation_reason,failover,synced}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index eefd7abd39..7b937d1a0c 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -202,8 +202,8 @@ typedef struct ReplicationSlot
*/
XLogRecPtr last_saved_confirmed_flush;
- /* The time at which this slot becomes inactive */
- TimestampTz last_inactive_time;
+ /* The time since the slot has become inactive */
+ TimestampTz inactive_since;
} ReplicationSlot;
#define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid)
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 3409cf88cd..3b9a306a8b 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -411,7 +411,7 @@ $node_primary3->stop;
$node_standby3->stop;
# =============================================================================
-# Testcase start: Check last_inactive_time property of the streaming standby's slot
+# Testcase start: Check inactive_since property of the streaming standby's slot
#
# Initialize primary node
@@ -440,45 +440,45 @@ $primary4->safe_psql(
SELECT pg_create_physical_replication_slot(slot_name := '$sb4_slot');
]);
-# Get last_inactive_time value after the slot's creation. Note that the slot
-# is still inactive till it's used by the standby below.
-my $last_inactive_time =
- capture_and_validate_slot_last_inactive_time($primary4, $sb4_slot, $slot_creation_time);
+# Get inactive_since value after the slot's creation. Note that the slot is
+# still inactive till it's used by the standby below.
+my $inactive_since =
+ capture_and_validate_slot_inactive_since($primary4, $sb4_slot, $slot_creation_time);
$standby4->start;
# Wait until standby has replayed enough data
$primary4->wait_for_catchup($standby4);
-# Now the slot is active so last_inactive_time value must be NULL
+# Now the slot is active so inactive_since value must be NULL
is( $primary4->safe_psql(
'postgres',
- qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$sb4_slot';]
+ qq[SELECT inactive_since IS NULL FROM pg_replication_slots WHERE slot_name = '$sb4_slot';]
),
't',
'last inactive time for an active physical slot is NULL');
-# Stop the standby to check its last_inactive_time value is updated
+# Stop the standby to check its inactive_since value is updated
$standby4->stop;
-# Let's restart the primary so that the last_inactive_time is set upon
-# loading the slot from the disk.
+# Let's restart the primary so that the inactive_since is set upon loading the
+# slot from the disk.
$primary4->restart;
is( $primary4->safe_psql(
'postgres',
- qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND last_inactive_time IS NOT NULL;]
+ qq[SELECT inactive_since > '$inactive_since'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND inactive_since IS NOT NULL;]
),
't',
'last inactive time for an inactive physical slot is updated correctly');
$standby4->stop;
-# Testcase end: Check last_inactive_time property of the streaming standby's slot
+# Testcase end: Check inactive_since property of the streaming standby's slot
# =============================================================================
# =============================================================================
-# Testcase start: Check last_inactive_time property of the logical subscriber's slot
+# Testcase start: Check inactive_since property of the logical subscriber's slot
my $publisher4 = $primary4;
# Create subscriber node
@@ -499,10 +499,10 @@ $publisher4->safe_psql('postgres',
"SELECT pg_create_logical_replication_slot(slot_name := '$lsub4_slot', plugin := 'pgoutput');"
);
-# Get last_inactive_time value after the slot's creation. Note that the slot
-# is still inactive till it's used by the subscriber below.
-$last_inactive_time =
- capture_and_validate_slot_last_inactive_time($publisher4, $lsub4_slot, $slot_creation_time);
+# Get inactive_since value after the slot's creation. Note that the slot is
+# still inactive till it's used by the subscriber below.
+$inactive_since =
+ capture_and_validate_slot_inactive_since($publisher4, $lsub4_slot, $slot_creation_time);
$subscriber4->start;
$subscriber4->safe_psql('postgres',
@@ -512,54 +512,54 @@ $subscriber4->safe_psql('postgres',
# Wait until subscriber has caught up
$subscriber4->wait_for_subscription_sync($publisher4, 'sub');
-# Now the slot is active so last_inactive_time value must be NULL
+# Now the slot is active so inactive_since value must be NULL
is( $publisher4->safe_psql(
'postgres',
- qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub4_slot';]
+ qq[SELECT inactive_since IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub4_slot';]
),
't',
'last inactive time for an active logical slot is NULL');
-# Stop the subscriber to check its last_inactive_time value is updated
+# Stop the subscriber to check its inactive_since value is updated
$subscriber4->stop;
-# Let's restart the publisher so that the last_inactive_time is set upon
+# Let's restart the publisher so that the inactive_since is set upon
# loading the slot from the disk.
$publisher4->restart;
is( $publisher4->safe_psql(
'postgres',
- qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND last_inactive_time IS NOT NULL;]
+ qq[SELECT inactive_since > '$inactive_since'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND inactive_since IS NOT NULL;]
),
't',
'last inactive time for an inactive logical slot is updated correctly');
-# Testcase end: Check last_inactive_time property of the logical subscriber's slot
+# Testcase end: Check inactive_since property of the logical subscriber's slot
# =============================================================================
$publisher4->stop;
$subscriber4->stop;
-# Capture and validate last_inactive_time of a given slot.
-sub capture_and_validate_slot_last_inactive_time
+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
{
my ($node, $slot_name, $slot_creation_time) = @_;
- my $last_inactive_time = $node->safe_psql('postgres',
- qq(SELECT last_inactive_time FROM pg_replication_slots
- WHERE slot_name = '$slot_name' AND last_inactive_time IS NOT NULL;)
+ my $inactive_since = $node->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
);
# Check that the captured time is sane
is( $node->safe_psql(
'postgres',
- qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0) AND
- '$last_inactive_time'::timestamptz >= '$slot_creation_time'::timestamptz;]
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
),
't',
"last inactive time for an active slot $slot_name is sane");
- return $last_inactive_time;
+ return $inactive_since;
}
done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index dfcbaec387..f53c3036a6 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,12 +1473,12 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
- l.last_inactive_time,
+ l.inactive_since,
l.conflicting,
l.invalidation_reason,
l.failover,
l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, last_inactive_time, conflicting, invalidation_reason, failover, synced)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, inactive_since, conflicting, invalidation_reason, failover, synced)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v24-0002-Maintain-inactive_since-for-synced-slots-correct.patchapplication/octet-stream; name=v24-0002-Maintain-inactive_since-for-synced-slots-correct.patchDownload
From d9df4b9f3fa146f076064cdf10cfefc6a97ccca2 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 26 Mar 2024 16:18:58 +0000
Subject: [PATCH v24 2/4] Maintain inactive_since for synced slots correctly.
The slot's inactive_since isn't currently maintained for
synced slots on the standby. The commit a11f330b55 prevents
updating inactive_since with RecoveryInProgress() check in
RestoreSlotFromDisk(). But, the issue is that
RecoveryInProgress() always returns true in
RestoreSlotFromDisk() as 'xlogctl->SharedRecoveryState' is
always 'RECOVERY_STATE_CRASH' at that time. The impact of this
is that on a promoted standby, inactive_since is always NULL for
all synced slots even after server restart.
Above issue led us to a question as to why we can't just update
inactive_since for synced slots on the standby with the value
received from remote slot on the primary. This is consistent with
any other slot parameter i.e. all of them are synced from the
primary.
This commit does two things:
1) Updates inactive_since for sync slots with the value
received from the primary's slot.
2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACWLctoiH-pSjWnEpR54q4DED6rw_BRJm5pCx86_Y01MoQ%40mail.gmail.com
---
src/backend/replication/logical/slotsync.c | 60 +++++++++++++++++--
src/backend/replication/slot.c | 37 ++++++++----
.../t/040_standby_failover_slots_sync.pl | 49 +++++++++++++++
3 files changed, 130 insertions(+), 16 deletions(-)
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..c1905ce24b 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -137,9 +137,12 @@ typedef struct RemoteSlot
/* RS_INVAL_NONE if valid, or the reason of invalidation */
ReplicationSlotInvalidationCause invalidated;
+
+ TimestampTz inactive_since; /* time since the slot became inactive */
} RemoteSlot;
static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_time(void);
/*
* If necessary, update the local synced slot's metadata based on the data
@@ -167,7 +170,8 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
remote_slot->two_phase == slot->data.two_phase &&
remote_slot->failover == slot->data.failover &&
remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
- strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
+ strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 &&
+ remote_slot->inactive_since == slot->inactive_since)
return false;
/* Avoid expensive operations while holding a spinlock. */
@@ -182,6 +186,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
slot->data.confirmed_flush = remote_slot->confirmed_lsn;
slot->data.catalog_xmin = remote_slot->catalog_xmin;
slot->effective_catalog_xmin = remote_slot->catalog_xmin;
+ slot->inactive_since = remote_slot->inactive_since;
SpinLockRelease(&slot->mutex);
if (xmin_changed)
@@ -652,9 +657,9 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
static bool
synchronize_slots(WalReceiverConn *wrconn)
{
-#define SLOTSYNC_COLUMN_COUNT 9
+#define SLOTSYNC_COLUMN_COUNT 10
Oid slotRow[SLOTSYNC_COLUMN_COUNT] = {TEXTOID, TEXTOID, LSNOID,
- LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID};
+ LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID, TIMESTAMPTZOID};
WalRcvExecResult *res;
TupleTableSlot *tupslot;
@@ -663,7 +668,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, invalidation_reason"
+ " database, invalidation_reason, inactive_since"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
@@ -743,6 +748,9 @@ synchronize_slots(WalReceiverConn *wrconn)
remote_slot->invalidated = isnull ? RS_INVAL_NONE :
GetSlotInvalidationCause(TextDatumGetCString(d));
+ remote_slot->inactive_since = DatumGetTimestampTz(slot_getattr(tupslot, ++col,
+ &isnull));
+
/* Sanity check */
Assert(col == SLOTSYNC_COLUMN_COUNT);
@@ -1296,6 +1304,47 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
Assert(false);
}
+/*
+ * Reset the synced slots info such as inactive_since after shutting
+ * down the slot sync machinery.
+ */
+static void
+update_synced_slots_inactive_time(void)
+{
+ TimestampTz now = 0;
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is a synchronized slot */
+ if (s->in_use && s->data.synced)
+ {
+ Assert(SlotIsLogical(s));
+
+ /*
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
+ */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+
+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart.
+ */
+ SpinLockAcquire(&s->mutex);
+ s->inactive_since = now;
+ SpinLockRelease(&s->mutex);
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
+}
+
/*
* Shut down the slot sync worker.
*/
@@ -1309,6 +1358,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_time();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1341,6 +1391,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_time();
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d778c0b921..05bc453de9 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -42,6 +42,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlogrecovery.h"
+#include "access/xlogutils.h"
#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
@@ -655,6 +656,7 @@ ReplicationSlotRelease(void)
char *slotname = NULL; /* keep compiler quiet */
bool is_logical = false; /* keep compiler quiet */
TimestampTz now = 0;
+ bool update_inactive_since;
Assert(slot != NULL && slot->active_pid != 0);
@@ -690,13 +692,19 @@ ReplicationSlotRelease(void)
}
/*
- * Set the last inactive time after marking the slot inactive. We don't
- * set it for the slots currently being synced from the primary to the
- * standby because such slots are typically inactive as decoding is not
- * allowed on those.
+ * Set the time since the slot has become inactive.
+ *
+ * Note that we don't set it for the slots currently being synced from the
+ * primary to the standby, because such slots typically sync the data from
+ * the remote slot.
*/
if (!(RecoveryInProgress() && slot->data.synced))
+ {
now = GetCurrentTimestamp();
+ update_inactive_since = true;
+ }
+ else
+ update_inactive_since = false;
if (slot->data.persistency == RS_PERSISTENT)
{
@@ -706,11 +714,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ if (update_inactive_since)
+ slot->inactive_since = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
- else
+ else if (update_inactive_since)
{
SpinLockAcquire(&slot->mutex);
slot->inactive_since = now;
@@ -2369,13 +2378,17 @@ RestoreSlotFromDisk(const char *name)
slot->active_pid = 0;
/*
- * We set the last inactive time after loading the slot from the disk
- * into memory. Whoever acquires the slot i.e. makes the slot active
- * will reset it. We don't set it for the slots currently being synced
- * from the primary to the standby because such slots are typically
- * inactive as decoding is not allowed on those.
+ * Set the time since the slot has become inactive after loading the
+ * slot from the disk into memory. Whoever acquires the slot i.e.
+ * makes the slot active will reset it.
+ *
+ * Note that we don't set it for the slots currently being synced from
+ * the primary to the standby, because such slots typically sync the
+ * data from the remote slot. We use InRecovery flag instead of
+ * RecoveryInProgress() as it always returns true even for normal
+ * server startup.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
+ if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index f47bfd78eb..a5aa2e3260 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -35,6 +35,13 @@ my $subscriber1 = PostgreSQL::Test::Cluster->new('subscriber1');
$subscriber1->init;
$subscriber1->start;
+# Capture the time before the logical failover slot is created on the
+# primary. We later call this publisher as primary anyway.
+my $creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Create a slot on the publisher with failover disabled
$publisher->safe_psql('postgres',
"SELECT 'init' FROM pg_create_logical_replication_slot('lsub1_slot', 'pgoutput', false, false, false);"
@@ -174,6 +181,10 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary
+my $inactive_since_on_primary =
+ capture_and_validate_slot_inactive_since($primary, 'lsub1_slot', $creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -190,6 +201,19 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Capture the inactive_since of the slot from the standby the logical failover
+# slots are synced/created on the standby.
+my $inactive_since_on_standby =
+ capture_and_validate_slot_inactive_since($standby1, 'lsub1_slot', $creation_time_on_primary);
+
+# Synced slots on the standby must get the inactive_since from the primary.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz = '$inactive_since_on_standby'::timestamptz;"
+ ),
+ "t",
+ 'synchronized slot has got the inactive_since from the primary');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
@@ -773,4 +797,29 @@ is( $subscriber1->safe_psql('postgres', q{SELECT count(*) FROM tab_int;}),
"20",
'data replicated from the new primary');
+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
+{
+ my ($node, $slot_name, $slot_creation_time) = @_;
+ my $name = $node->name;
+
+ my $inactive_since = $node->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
+ );
+
+ print "HI $slot_name $name $inactive_since $slot_creation_time\n";
+
+ # Check that the captured time is sane
+ is( $node->safe_psql(
+ 'postgres',
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
+ ),
+ 't',
+ "last inactive time for slot $slot_name is sane on node $name");
+
+ return $inactive_since;
+}
+
done_testing();
--
2.34.1
v24-0003-Allow-setting-inactive_timeout-for-replication-s.patchapplication/octet-stream; name=v24-0003-Allow-setting-inactive_timeout-for-replication-s.patchDownload
From 8c6d41253ed2fd301cd062fb408660d26ec7c751 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 26 Mar 2024 16:19:35 +0000
Subject: [PATCH v24 3/4] Allow setting inactive_timeout for replication slots
via SQL API.
This commit adds a new replication slot property called
inactive_timeout specifying the amount of time in seconds the slot
is allowed to be inactive. It is added to slot's persistent data
structure to survive during server restarts. It will be synced to
failover slots on the standby, and also will be carried over to
the new cluster as part of pg_upgrade.
This commit particularly lets one specify the inactive_timeout for
a slot via SQL functions pg_create_physical_replication_slot and
pg_create_logical_replication_slot.
The new property will be useful to implement inactive timeout based
replication slot invalidation in a future commit.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
contrib/test_decoding/expected/slot.out | 97 +++++++++++++++++++
contrib/test_decoding/sql/slot.sql | 30 ++++++
doc/src/sgml/func.sgml | 18 ++--
doc/src/sgml/system-views.sgml | 9 ++
src/backend/catalog/system_functions.sql | 2 +
src/backend/catalog/system_views.sql | 1 +
src/backend/replication/logical/slotsync.c | 17 +++-
src/backend/replication/slot.c | 8 +-
src/backend/replication/slotfuncs.c | 31 +++++-
src/backend/replication/walsender.c | 4 +-
src/bin/pg_upgrade/info.c | 6 +-
src/bin/pg_upgrade/pg_upgrade.c | 5 +-
src/bin/pg_upgrade/pg_upgrade.h | 2 +
src/bin/pg_upgrade/t/003_logical_slots.pl | 11 ++-
src/include/catalog/pg_proc.dat | 22 ++---
src/include/replication/slot.h | 5 +-
.../t/040_standby_failover_slots_sync.pl | 13 ++-
src/test/regress/expected/rules.out | 3 +-
18 files changed, 243 insertions(+), 41 deletions(-)
diff --git a/contrib/test_decoding/expected/slot.out b/contrib/test_decoding/expected/slot.out
index 349ab2d380..c318eceefd 100644
--- a/contrib/test_decoding/expected/slot.out
+++ b/contrib/test_decoding/expected/slot.out
@@ -466,3 +466,100 @@ SELECT pg_drop_replication_slot('physical_slot');
(1 row)
+-- Test negative value for inactive_timeout option for slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', inactive_timeout := -300); -- error
+ERROR: "inactive_timeout" must not be negative
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', inactive_timeout := -600); -- error
+ERROR: "inactive_timeout" must not be negative
+-- Test inactive_timeout option of physical slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot1', immediately_reserve := true, inactive_timeout := 300);
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot2');
+ ?column?
+----------
+ init
+(1 row)
+
+-- Copy physical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_slot1', dst_slot_name := 'it_phy_slot3');
+ ?column?
+----------
+ copy
+(1 row)
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+ slot_name | slot_type | inactive_timeout
+--------------+-----------+------------------
+ it_phy_slot1 | physical | 300
+ it_phy_slot2 | physical | 0
+ it_phy_slot3 | physical | 300
+(3 rows)
+
+SELECT pg_drop_replication_slot('it_phy_slot1');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_phy_slot2');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_phy_slot3');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+-- Test inactive_timeout option of logical slots.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
+ ?column?
+----------
+ init
+(1 row)
+
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot2', plugin := 'test_decoding');
+ ?column?
+----------
+ init
+(1 row)
+
+-- Copy logical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slot1', dst_slot_name := 'it_log_slot3');
+ ?column?
+----------
+ copy
+(1 row)
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+ slot_name | slot_type | inactive_timeout
+--------------+-----------+------------------
+ it_log_slot1 | logical | 600
+ it_log_slot2 | logical | 0
+ it_log_slot3 | logical | 600
+(3 rows)
+
+SELECT pg_drop_replication_slot('it_log_slot1');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_log_slot2');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
+SELECT pg_drop_replication_slot('it_log_slot3');
+ pg_drop_replication_slot
+--------------------------
+
+(1 row)
+
diff --git a/contrib/test_decoding/sql/slot.sql b/contrib/test_decoding/sql/slot.sql
index 580e3ae3be..e5c7b3d359 100644
--- a/contrib/test_decoding/sql/slot.sql
+++ b/contrib/test_decoding/sql/slot.sql
@@ -190,3 +190,33 @@ SELECT pg_drop_replication_slot('failover_true_slot');
SELECT pg_drop_replication_slot('failover_false_slot');
SELECT pg_drop_replication_slot('failover_default_slot');
SELECT pg_drop_replication_slot('physical_slot');
+
+-- Test negative value for inactive_timeout option for slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_fail_slot', inactive_timeout := -300); -- error
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_fail_slot', plugin := 'test_decoding', inactive_timeout := -600); -- error
+
+-- Test inactive_timeout option of physical slots.
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot1', immediately_reserve := true, inactive_timeout := 300);
+SELECT 'init' FROM pg_create_physical_replication_slot(slot_name := 'it_phy_slot2');
+
+-- Copy physical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_physical_replication_slot(src_slot_name := 'it_phy_slot1', dst_slot_name := 'it_phy_slot3');
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+
+SELECT pg_drop_replication_slot('it_phy_slot1');
+SELECT pg_drop_replication_slot('it_phy_slot2');
+SELECT pg_drop_replication_slot('it_phy_slot3');
+
+-- Test inactive_timeout option of logical slots.
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot1', plugin := 'test_decoding', inactive_timeout := 600);
+SELECT 'init' FROM pg_create_logical_replication_slot(slot_name := 'it_log_slot2', plugin := 'test_decoding');
+
+-- Copy logical slot with inactive_timeout option set.
+SELECT 'copy' FROM pg_copy_logical_replication_slot(src_slot_name := 'it_log_slot1', dst_slot_name := 'it_log_slot3');
+
+SELECT slot_name, slot_type, inactive_timeout FROM pg_replication_slots ORDER BY 1;
+
+SELECT pg_drop_replication_slot('it_log_slot1');
+SELECT pg_drop_replication_slot('it_log_slot2');
+SELECT pg_drop_replication_slot('it_log_slot3');
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 8ecc02f2b9..2cc26e927a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28373,7 +28373,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<indexterm>
<primary>pg_create_physical_replication_slot</primary>
</indexterm>
- <function>pg_create_physical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type> <optional>, <parameter>immediately_reserve</parameter> <type>boolean</type>, <parameter>temporary</parameter> <type>boolean</type> </optional> )
+ <function>pg_create_physical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type> <optional>, <parameter>immediately_reserve</parameter> <type>boolean</type>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>inactive_timeout</parameter> <type>integer</type> </optional>)
<returnvalue>record</returnvalue>
( <parameter>slot_name</parameter> <type>name</type>,
<parameter>lsn</parameter> <type>pg_lsn</type> )
@@ -28390,9 +28390,12 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
parameter, <parameter>temporary</parameter>, when set to true, specifies that
the slot should not be permanently stored to disk and is only meant
for use by the current session. Temporary slots are also
- released upon any error. This function corresponds
- to the replication protocol command <literal>CREATE_REPLICATION_SLOT
- ... PHYSICAL</literal>.
+ released upon any error. The optional fourth
+ parameter, <parameter>inactive_timeout</parameter>, when set to a
+ non-zero value, specifies the amount of time in seconds the slot is
+ allowed to be inactive. This function corresponds to the replication
+ protocol command
+ <literal>CREATE_REPLICATION_SLOT ... PHYSICAL</literal>.
</para></entry>
</row>
@@ -28417,7 +28420,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<indexterm>
<primary>pg_create_logical_replication_slot</primary>
</indexterm>
- <function>pg_create_logical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type>, <parameter>plugin</parameter> <type>name</type> <optional>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>twophase</parameter> <type>boolean</type>, <parameter>failover</parameter> <type>boolean</type> </optional> )
+ <function>pg_create_logical_replication_slot</function> ( <parameter>slot_name</parameter> <type>name</type>, <parameter>plugin</parameter> <type>name</type> <optional>, <parameter>temporary</parameter> <type>boolean</type>, <parameter>twophase</parameter> <type>boolean</type>, <parameter>failover</parameter> <type>boolean</type>, <parameter>inactive_timeout</parameter> <type>integer</type> </optional> )
<returnvalue>record</returnvalue>
( <parameter>slot_name</parameter> <type>name</type>,
<parameter>lsn</parameter> <type>pg_lsn</type> )
@@ -28436,7 +28439,10 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
<parameter>failover</parameter>, when set to true,
specifies that this slot is enabled to be synced to the
standbys so that logical replication can be resumed after
- failover. A call to this function has the same effect as
+ failover. The optional sixth parameter,
+ <parameter>inactive_timeout</parameter>, when set to a
+ non-zero value, specifies the amount of time in seconds the slot is
+ allowed to be inactive. A call to this function has the same effect as
the replication protocol command
<literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
</para></entry>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..a6cb13fd9d 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2523,6 +2523,15 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>inactive_timeout</structfield> <type>integer</type>
+ </para>
+ <para>
+ The amount of time in seconds the slot is allowed to be inactive.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>inactive_since</structfield> <type>timestamptz</type>
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index fe2bb50f46..af27616657 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -469,6 +469,7 @@ AS 'pg_logical_emit_message_bytea';
CREATE OR REPLACE FUNCTION pg_create_physical_replication_slot(
IN slot_name name, IN immediately_reserve boolean DEFAULT false,
IN temporary boolean DEFAULT false,
+ IN inactive_timeout int DEFAULT 0,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
@@ -480,6 +481,7 @@ CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
IN temporary boolean DEFAULT false,
IN twophase boolean DEFAULT false,
IN failover boolean DEFAULT false,
+ IN inactive_timeout int DEFAULT 0,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 401fb35947..7d9d743dd5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,6 +1023,7 @@ CREATE VIEW pg_replication_slots AS
L.wal_status,
L.safe_wal_size,
L.two_phase,
+ L.inactive_timeout,
L.inactive_since,
L.conflicting,
L.invalidation_reason,
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index c1905ce24b..91c1604d3c 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -131,6 +131,7 @@ typedef struct RemoteSlot
char *database;
bool two_phase;
bool failover;
+ int inactive_timeout; /* in seconds */
XLogRecPtr restart_lsn;
XLogRecPtr confirmed_lsn;
TransactionId catalog_xmin;
@@ -171,7 +172,8 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
remote_slot->failover == slot->data.failover &&
remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 &&
- remote_slot->inactive_since == slot->inactive_since)
+ remote_slot->inactive_since == slot->inactive_since &&
+ remote_slot->inactive_timeout == slot->data.inactive_timeout)
return false;
/* Avoid expensive operations while holding a spinlock. */
@@ -187,6 +189,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
slot->data.catalog_xmin = remote_slot->catalog_xmin;
slot->effective_catalog_xmin = remote_slot->catalog_xmin;
slot->inactive_since = remote_slot->inactive_since;
+ slot->data.inactive_timeout = remote_slot->inactive_timeout;
SpinLockRelease(&slot->mutex);
if (xmin_changed)
@@ -612,7 +615,8 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
ReplicationSlotCreate(remote_slot->name, true, RS_TEMPORARY,
remote_slot->two_phase,
remote_slot->failover,
- true);
+ true,
+ remote_slot->inactive_timeout);
/* For shorter lines. */
slot = MyReplicationSlot;
@@ -657,9 +661,9 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
static bool
synchronize_slots(WalReceiverConn *wrconn)
{
-#define SLOTSYNC_COLUMN_COUNT 10
+#define SLOTSYNC_COLUMN_COUNT 11
Oid slotRow[SLOTSYNC_COLUMN_COUNT] = {TEXTOID, TEXTOID, LSNOID,
- LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID, TIMESTAMPTZOID};
+ LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID, TIMESTAMPTZOID, INT4OID};
WalRcvExecResult *res;
TupleTableSlot *tupslot;
@@ -668,7 +672,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, invalidation_reason, inactive_since"
+ " database, invalidation_reason, inactive_since, inactive_timeout"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
@@ -751,6 +755,9 @@ synchronize_slots(WalReceiverConn *wrconn)
remote_slot->inactive_since = DatumGetTimestampTz(slot_getattr(tupslot, ++col,
&isnull));
+ remote_slot->inactive_timeout = DatumGetInt32(slot_getattr(tupslot, ++col,
+ &isnull));
+
/* Sanity check */
Assert(col == SLOTSYNC_COLUMN_COUNT);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 05bc453de9..5653e680a8 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -130,7 +130,7 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
sizeof(ReplicationSlotOnDisk) - ReplicationSlotOnDiskConstantSize
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
-#define SLOT_VERSION 5 /* version for new files */
+#define SLOT_VERSION 6 /* version for new files */
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -305,11 +305,14 @@ ReplicationSlotValidateName(const char *name, int elevel)
* failover: If enabled, allows the slot to be synced to standbys so
* that logical replication can be resumed after failover.
* synced: True if the slot is synchronized from the primary server.
+ * inactive_timeout: The amount of time in seconds the slot is allowed to be
+ * inactive.
*/
void
ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
- bool two_phase, bool failover, bool synced)
+ bool two_phase, bool failover, bool synced,
+ int inactive_timeout)
{
ReplicationSlot *slot = NULL;
int i;
@@ -399,6 +402,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
slot->data.two_phase_at = InvalidXLogRecPtr;
slot->data.failover = failover;
slot->data.synced = synced;
+ slot->data.inactive_timeout = inactive_timeout;
/* and then data only present in shared memory */
slot->just_dirtied = false;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index da57177c25..6e1d8d1f9a 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -38,14 +38,15 @@
*/
static void
create_physical_replication_slot(char *name, bool immediately_reserve,
- bool temporary, XLogRecPtr restart_lsn)
+ bool temporary, int inactive_timeout,
+ XLogRecPtr restart_lsn)
{
Assert(!MyReplicationSlot);
/* acquire replication slot, this will check for conflicting names */
ReplicationSlotCreate(name, false,
temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
- false, false);
+ false, false, inactive_timeout);
if (immediately_reserve)
{
@@ -71,6 +72,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
Name name = PG_GETARG_NAME(0);
bool immediately_reserve = PG_GETARG_BOOL(1);
bool temporary = PG_GETARG_BOOL(2);
+ int inactive_timeout = PG_GETARG_INT32(3); /* in seconds */
Datum values[2];
bool nulls[2];
TupleDesc tupdesc;
@@ -84,9 +86,15 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
CheckSlotRequirements();
+ if (inactive_timeout < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"inactive_timeout\" must not be negative")));
+
create_physical_replication_slot(NameStr(*name),
immediately_reserve,
temporary,
+ inactive_timeout,
InvalidXLogRecPtr);
values[0] = NameGetDatum(&MyReplicationSlot->data.name);
@@ -120,7 +128,7 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS)
static void
create_logical_replication_slot(char *name, char *plugin,
bool temporary, bool two_phase,
- bool failover,
+ bool failover, int inactive_timeout,
XLogRecPtr restart_lsn,
bool find_startpoint)
{
@@ -138,7 +146,7 @@ create_logical_replication_slot(char *name, char *plugin,
*/
ReplicationSlotCreate(name, true,
temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
- failover, false);
+ failover, false, inactive_timeout);
/*
* Create logical decoding context to find start point or, if we don't
@@ -177,6 +185,7 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
bool temporary = PG_GETARG_BOOL(2);
bool two_phase = PG_GETARG_BOOL(3);
bool failover = PG_GETARG_BOOL(4);
+ int inactive_timeout = PG_GETARG_INT32(5); /* in seconds */
Datum result;
TupleDesc tupdesc;
HeapTuple tuple;
@@ -190,11 +199,17 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS)
CheckLogicalDecodingRequirements();
+ if (inactive_timeout < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"inactive_timeout\" must not be negative")));
+
create_logical_replication_slot(NameStr(*name),
NameStr(*plugin),
temporary,
two_phase,
failover,
+ inactive_timeout,
InvalidXLogRecPtr,
true);
@@ -239,7 +254,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
Datum
pg_get_replication_slots(PG_FUNCTION_ARGS)
{
-#define PG_GET_REPLICATION_SLOTS_COLS 19
+#define PG_GET_REPLICATION_SLOTS_COLS 20
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
@@ -410,6 +425,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
values[i++] = BoolGetDatum(slot_contents.data.two_phase);
+ values[i++] = Int32GetDatum(slot_contents.data.inactive_timeout);
+
if (slot_contents.inactive_since > 0)
values[i++] = TimestampTzGetDatum(slot_contents.inactive_since);
else
@@ -720,6 +737,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
XLogRecPtr src_restart_lsn;
bool src_islogical;
bool temporary;
+ int inactive_timeout; /* in seconds */
char *plugin;
Datum values[2];
bool nulls[2];
@@ -776,6 +794,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
src_restart_lsn = first_slot_contents.data.restart_lsn;
temporary = (first_slot_contents.data.persistency == RS_TEMPORARY);
plugin = logical_slot ? NameStr(first_slot_contents.data.plugin) : NULL;
+ inactive_timeout = first_slot_contents.data.inactive_timeout;
/* Check type of replication slot */
if (src_islogical != logical_slot)
@@ -823,6 +842,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
temporary,
false,
false,
+ inactive_timeout,
src_restart_lsn,
false);
}
@@ -830,6 +850,7 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot)
create_physical_replication_slot(NameStr(*dst_name),
true,
temporary,
+ inactive_timeout,
src_restart_lsn);
/*
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bc40c454de..5315c08650 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1221,7 +1221,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
{
ReplicationSlotCreate(cmd->slotname, false,
cmd->temporary ? RS_TEMPORARY : RS_PERSISTENT,
- false, false, false);
+ false, false, false, 0);
if (reserve_wal)
{
@@ -1252,7 +1252,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
*/
ReplicationSlotCreate(cmd->slotname, true,
cmd->temporary ? RS_TEMPORARY : RS_EPHEMERAL,
- two_phase, failover, false);
+ two_phase, failover, false, 0);
/*
* Do options check early so that we can bail before calling the
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 95c22a7200..12626987f0 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -676,7 +676,8 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
* removed.
*/
res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, failover, "
- "%s as caught_up, invalidation_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid, "
+ "inactive_timeout "
"FROM pg_catalog.pg_replication_slots "
"WHERE slot_type = 'logical' AND "
"database = current_database() AND "
@@ -696,6 +697,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
int i_failover;
int i_caught_up;
int i_invalid;
+ int i_inactive_timeout;
slotinfos = (LogicalSlotInfo *) pg_malloc(sizeof(LogicalSlotInfo) * num_slots);
@@ -705,6 +707,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
i_failover = PQfnumber(res, "failover");
i_caught_up = PQfnumber(res, "caught_up");
i_invalid = PQfnumber(res, "invalid");
+ i_inactive_timeout = PQfnumber(res, "inactive_timeout");
for (int slotnum = 0; slotnum < num_slots; slotnum++)
{
@@ -716,6 +719,7 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
curr->failover = (strcmp(PQgetvalue(res, slotnum, i_failover), "t") == 0);
curr->caught_up = (strcmp(PQgetvalue(res, slotnum, i_caught_up), "t") == 0);
curr->invalid = (strcmp(PQgetvalue(res, slotnum, i_invalid), "t") == 0);
+ curr->inactive_timeout = atoi(PQgetvalue(res, slotnum, i_inactive_timeout));
}
}
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index f6143b6bc4..2656056103 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -931,9 +931,10 @@ create_logical_replication_slots(void)
appendPQExpBuffer(query, ", ");
appendStringLiteralConn(query, slot_info->plugin, conn);
- appendPQExpBuffer(query, ", false, %s, %s);",
+ appendPQExpBuffer(query, ", false, %s, %s, %d);",
slot_info->two_phase ? "true" : "false",
- slot_info->failover ? "true" : "false");
+ slot_info->failover ? "true" : "false",
+ slot_info->inactive_timeout);
PQclear(executeQueryOrDie(conn, "%s", query->data));
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 92bcb693fb..eb86d000b1 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -162,6 +162,8 @@ typedef struct
bool invalid; /* if true, the slot is unusable */
bool failover; /* is the slot designated to be synced to the
* physical standby? */
+ int inactive_timeout; /* The amount of time in seconds the slot
+ * is allowed to be inactive. */
} LogicalSlotInfo;
typedef struct
diff --git a/src/bin/pg_upgrade/t/003_logical_slots.pl b/src/bin/pg_upgrade/t/003_logical_slots.pl
index 83d71c3084..6e82d2cb7b 100644
--- a/src/bin/pg_upgrade/t/003_logical_slots.pl
+++ b/src/bin/pg_upgrade/t/003_logical_slots.pl
@@ -153,14 +153,17 @@ like(
# TEST: Successful upgrade
# Preparations for the subsequent test:
-# 1. Setup logical replication (first, cleanup slots from the previous tests)
+# 1. Setup logical replication (first, cleanup slots from the previous tests,
+# and then create slot for this test with inactive_timeout set).
my $old_connstr = $oldpub->connstr . ' dbname=postgres';
+my $inactive_timeout = 3600;
$oldpub->start;
$oldpub->safe_psql(
'postgres', qq[
SELECT * FROM pg_drop_replication_slot('test_slot1');
SELECT * FROM pg_drop_replication_slot('test_slot2');
+ SELECT pg_create_logical_replication_slot(slot_name := 'regress_sub', plugin := 'pgoutput', inactive_timeout := $inactive_timeout);
CREATE PUBLICATION regress_pub FOR ALL TABLES;
]);
@@ -172,7 +175,7 @@ $sub->start;
$sub->safe_psql(
'postgres', qq[
CREATE TABLE tbl (a int);
- CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (two_phase = 'true', failover = 'true')
+ CREATE SUBSCRIPTION regress_sub CONNECTION '$old_connstr' PUBLICATION regress_pub WITH (slot_name = 'regress_sub', create_slot = false, two_phase = 'true', failover = 'true')
]);
$sub->wait_for_subscription_sync($oldpub, 'regress_sub');
@@ -192,8 +195,8 @@ command_ok([@pg_upgrade_cmd], 'run of pg_upgrade of old cluster');
# Check that the slot 'regress_sub' has migrated to the new cluster
$newpub->start;
my $result = $newpub->safe_psql('postgres',
- "SELECT slot_name, two_phase, failover FROM pg_replication_slots");
-is($result, qq(regress_sub|t|t), 'check the slot exists on new cluster');
+ "SELECT slot_name, two_phase, failover, inactive_timeout = $inactive_timeout FROM pg_replication_slots");
+is($result, qq(regress_sub|t|t|t), 'check the slot exists on new cluster');
# Update the connection
my $new_connstr = $newpub->connstr . ' dbname=postgres';
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 2f7cfc02c6..ea4ffb509a 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11105,10 +11105,10 @@
# replication slots
{ oid => '3779', descr => 'create a physical replication slot',
proname => 'pg_create_physical_replication_slot', provolatile => 'v',
- proparallel => 'u', prorettype => 'record', proargtypes => 'name bool bool',
- proallargtypes => '{name,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,o,o}',
- proargnames => '{slot_name,immediately_reserve,temporary,slot_name,lsn}',
+ proparallel => 'u', prorettype => 'record', proargtypes => 'name bool bool int4',
+ proallargtypes => '{name,bool,bool,int4,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,o,o}',
+ proargnames => '{slot_name,immediately_reserve,temporary,inactive_timeout,slot_name,lsn}',
prosrc => 'pg_create_physical_replication_slot' },
{ oid => '4220',
descr => 'copy a physical replication slot, changing temporality',
@@ -11133,17 +11133,17 @@
proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => '',
- proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,timestamptz,bool,text,bool,bool}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,inactive_since,conflicting,invalidation_reason,failover,synced}',
+ proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,int4,timestamptz,bool,text,bool,bool}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,inactive_timeout,inactive_since,conflicting,invalidation_reason,failover,synced}',
prosrc => 'pg_get_replication_slots' },
{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u', prorettype => 'record',
- proargtypes => 'name name bool bool bool',
- proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
- proargmodes => '{i,i,i,i,i,o,o}',
- proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
+ proargtypes => 'name name bool bool bool int4',
+ proallargtypes => '{name,name,bool,bool,bool,int4,name,pg_lsn}',
+ proargmodes => '{i,i,i,i,i,i,o,o}',
+ proargnames => '{slot_name,plugin,temporary,twophase,failover,inactive_timeout,slot_name,lsn}',
prosrc => 'pg_create_logical_replication_slot' },
{ oid => '4222',
descr => 'copy a logical replication slot, changing temporality and plugin',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7b937d1a0c..5a812ef528 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -127,6 +127,9 @@ typedef struct ReplicationSlotPersistentData
* for logical slots on the primary server.
*/
bool failover;
+
+ /* The amount of time in seconds the slot is allowed to be inactive */
+ int inactive_timeout;
} ReplicationSlotPersistentData;
/*
@@ -239,7 +242,7 @@ extern void ReplicationSlotsShmemInit(void);
extern void ReplicationSlotCreate(const char *name, bool db_specific,
ReplicationSlotPersistency persistency,
bool two_phase, bool failover,
- bool synced);
+ bool synced, int inactive_timeout);
extern void ReplicationSlotPersist(void);
extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index a5aa2e3260..71fd2e708b 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -159,8 +159,9 @@ log_min_messages = 'debug2'
$primary->append_conf('postgresql.conf', "log_min_messages = 'debug2'");
$primary->reload;
+my $inactive_timeout = 3600;
$primary->psql('postgres',
- q{SELECT pg_create_logical_replication_slot('lsub2_slot', 'test_decoding', false, false, true);}
+ "SELECT pg_create_logical_replication_slot('lsub2_slot', 'test_decoding', false, false, true, $inactive_timeout);"
);
$primary->psql('postgres',
@@ -214,6 +215,16 @@ is( $standby1->safe_psql(
"t",
'synchronized slot has got the inactive_since from the primary');
+# Confirm that the synced slot on the standby has got inactive_timeout from the
+# primary.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT inactive_timeout = $inactive_timeout FROM pg_replication_slots
+ WHERE slot_name = 'lsub2_slot' AND synced AND NOT temporary;"
+ ),
+ "t",
+ 'synced logical slot has got inactive_timeout on standby');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f53c3036a6..7f3b70f598 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,12 +1473,13 @@ pg_replication_slots| SELECT l.slot_name,
l.wal_status,
l.safe_wal_size,
l.two_phase,
+ l.inactive_timeout,
l.inactive_since,
l.conflicting,
l.invalidation_reason,
l.failover,
l.synced
- FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, inactive_since, conflicting, invalidation_reason, failover, synced)
+ FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, inactive_timeout, inactive_since, conflicting, invalidation_reason, failover, synced)
LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
pg_roles| SELECT pg_authid.rolname,
pg_authid.rolsuper,
--
2.34.1
v24-0004-Add-inactive_timeout-based-replication-slot-inva.patch (application/octet-stream)
From 62bbc6cbf5f23afe9db2a44b4ae9e513861c9306 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 26 Mar 2024 16:20:26 +0000
Subject: [PATCH v24 4/4] Add inactive_timeout based replication slot
invalidation.
Until now, postgres has been able to invalidate inactive
replication slots based on the amount of WAL (set via the
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
deployment generates and the storage allocated to it vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easier for users to set a timeout of say 1, 2
or 3 days at the slot level, after which the inactive slots get
invalidated.
To achieve the above, postgres uses the replication slot property
inactive_since (the time at which the slot became inactive)
together with a new slot-level parameter inactive_timeout, and
invalidates the slot once that timeout has elapsed. The
invalidation check happens at several locations so that it takes
effect as early as possible:
- Whenever the slot is acquired; if the slot gets invalidated due
to this new mechanism, an error is emitted.
- During checkpoint.
Note that this new invalidation mechanism won't kick in for the
slots that are currently being synced from the primary to the
standby.
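As a rough illustration of the intended usage (the slot name below is
just an example):

    SELECT pg_create_physical_replication_slot(
        slot_name := 'demo_slot',
        immediately_reserve := true,
        inactive_timeout := 60);

    -- once the slot has stayed inactive past the timeout and a
    -- checkpoint has run (or something has tried to acquire the slot):
    SELECT slot_name, inactive_since, invalidation_reason
    FROM pg_replication_slots WHERE slot_name = 'demo_slot';
    -- invalidation_reason is expected to show 'inactive_timeout'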
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/func.sgml | 8 +-
doc/src/sgml/system-views.sgml | 10 +-
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 173 +++++++++++++++++-
src/backend/replication/slotfuncs.c | 12 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/bin/pg_upgrade/pg_upgrade.h | 3 +-
src/include/replication/slot.h | 8 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 169 +++++++++++++++++
12 files changed, 377 insertions(+), 19 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2cc26e927a..fb1640ae12 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28393,8 +28393,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
released upon any error. The optional fourth
parameter, <parameter>inactive_timeout</parameter>, when set to a
non-zero value, specifies the amount of time in seconds the slot is
- allowed to be inactive. This function corresponds to the replication
- protocol command
+ allowed to be inactive before getting invalidated.
+ This function corresponds to the replication protocol command
<literal>CREATE_REPLICATION_SLOT ... PHYSICAL</literal>.
</para></entry>
</row>
@@ -28442,8 +28442,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
failover. The optional sixth parameter,
<parameter>inactive_timeout</parameter>, when set to a
non-zero value, specifies the amount of time in seconds the slot is
- allowed to be inactive. A call to this function has the same effect as
- the replication protocol command
+ allowed to be inactive before getting invalidated. A call to this
+ function has the same effect as the replication protocol command
<literal>CREATE_REPLICATION_SLOT ... LOGICAL</literal>.
</para></entry>
</row>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index a6cb13fd9d..3b09838a0b 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2528,7 +2528,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<structfield>inactive_timeout</structfield> <type>integer</type>
</para>
<para>
- The amount of time in seconds the slot is allowed to be inactive.
+ The amount of time in seconds the slot is allowed to be inactive
+ before getting invalidated.
</para></entry>
</row>
@@ -2582,6 +2583,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for longer than the duration specified by the slot's
+ <literal>inactive_timeout</literal> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 91c1604d3c..b9ed12468c 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -324,7 +324,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -534,7 +534,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
* if the slot is not acquired by other processes.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 5653e680a8..394428c24f 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -108,10 +108,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -159,6 +160,8 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidateSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_locks);
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
@@ -306,7 +309,7 @@ ReplicationSlotValidateName(const char *name, int elevel)
* that logical replication can be resumed after failover.
* synced: True if the slot is synchronized from the primary server.
* inactive_timeout: The amount of time in seconds the slot is allowed to be
- * inactive.
+ * inactive before getting invalidated.
*/
void
ReplicationSlotCreate(const char *name, bool db_specific,
@@ -540,9 +543,14 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * If check_for_invalidation is true, the slot is checked for invalidation
+ * based on its inactive_timeout parameter and an error is raised after making
+ * the slot ours.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation)
{
ReplicationSlot *s;
int active_pid;
@@ -620,6 +628,42 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * Check if the given slot can be invalidated based on its
+ * inactive_timeout parameter. If yes, persist the invalidated state to
+ * disk and then error out. We do this only after making the slot ours to
+ * avoid anyone else acquiring it while we check for its invalidation.
+ */
+ if (check_for_invalidation)
+ {
+ /* The slot is ours by now */
+ Assert(s->active_pid == MyProcPid);
+
+ /*
+ * Well, the slot is not yet ours really unless we check for the
+ * invalidation below.
+ */
+ s->active_pid = 0;
+ if (InvalidateReplicationSlotForInactiveTimeout(s, true, true))
+ {
+ /*
+ * If the slot has been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+
+ /* Might need it for slot clean up on error, so restore it */
+ s->active_pid = MyProcPid;
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot acquire invalidated replication slot \"%s\"",
+ NameStr(MyReplicationSlot->data.name)),
+ errdetail("This slot has been invalidated because of its inactive_timeout parameter.")));
+ }
+ s->active_pid = MyProcPid;
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -797,7 +841,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -820,7 +864,7 @@ ReplicationSlotAlter(const char *name, bool failover)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1522,6 +1566,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by the slot's inactive_timeout parameter."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1635,6 +1682,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (InvalidateReplicationSlotForInactiveTimeout(s, false, false))
+ invalidation_cause = cause;
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1788,6 +1839,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1839,6 +1891,102 @@ restart:
return invalidated;
}
+/*
+ * Invalidate given slot based on its inactive_timeout parameter.
+ *
+ * Returns true if the slot has got invalidated.
+ *
+ * NB - this function also runs as part of checkpoint, so avoid raising errors
+ * if possible.
+ */
+bool
+InvalidateReplicationSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_locks,
+ bool persist_state)
+{
+ if (!InvalidateSlotForInactiveTimeout(slot, need_locks))
+ return false;
+
+ Assert(slot->active_pid == 0);
+
+ SpinLockAcquire(&slot->mutex);
+ slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT;
+
+ /* Make sure the invalidated state persists across server restart */
+ slot->just_dirtied = true;
+ slot->dirty = true;
+ SpinLockRelease(&slot->mutex);
+
+ if (persist_state)
+ {
+ char path[MAXPGPATH];
+
+ sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+ SaveSlotToPath(slot, path, ERROR);
+ }
+
+ ReportSlotInvalidation(RS_INVAL_INACTIVE_TIMEOUT, false, 0,
+ slot->data.name, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, InvalidTransactionId);
+
+ return true;
+}
+
+/*
+ * Helper for InvalidateReplicationSlotForInactiveTimeout
+ */
+static bool
+InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks)
+{
+ ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+
+ if (slot->inactive_since == 0 ||
+ slot->data.inactive_timeout == 0)
+ return false;
+
+ /*
+ * Do not invalidate the slots which are currently being synced from the
+ * primary to the standby.
+ */
+ if (RecoveryInProgress() && slot->data.synced)
+ return false;
+
+ if (need_locks)
+ {
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+ SpinLockAcquire(&slot->mutex);
+ }
+
+ Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
+
+ /*
+ * Check if the slot needs to be invalidated due to inactive_timeout. We
+ * do this with the spinlock held to avoid race conditions -- for example
+ * the restart_lsn could move forward, or the slot could be dropped.
+ */
+ if (slot->inactive_since > 0 &&
+ slot->data.inactive_timeout > 0)
+ {
+ TimestampTz now;
+
+ /* inactive_since is only tracked for inactive slots */
+ Assert(slot->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(slot->inactive_since, now,
+ slot->data.inactive_timeout * 1000))
+ invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
+ }
+
+ if (need_locks)
+ {
+ SpinLockRelease(&slot->mutex);
+ LWLockRelease(ReplicationSlotControlLock);
+ }
+
+ return (invalidation_cause == RS_INVAL_INACTIVE_TIMEOUT);
+}
+
/*
* Flush all replication slots to disk.
*
@@ -1851,6 +1999,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated = false;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -1874,6 +2023,13 @@ CheckPointReplicationSlots(bool is_shutdown)
/* save the slot to disk, locking is handled in SaveSlotToPath() */
sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
+ /*
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
+ */
+ if (InvalidateReplicationSlotForInactiveTimeout(s, true, false))
+ invalidated = true;
+
/*
* Slot's data is not flushed each time the confirmed_flush LSN is
* updated as that could lead to frequent writes. However, we decide
@@ -1900,6 +2056,13 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ /* If the slot has been invalidated, recalculate the resource limits */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 6e1d8d1f9a..4ea4db0f87 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -258,6 +258,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
XLogRecPtr currlsn;
int slotno;
+ bool invalidated = false;
/*
* We don't require any special permission to see this function's data
@@ -466,6 +467,15 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
LWLockRelease(ReplicationSlotControlLock);
+ /*
+ * If the slot has been invalidated, recalculate the resource limits
+ */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
+
return (Datum) 0;
}
@@ -668,7 +678,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 5315c08650..7dda2f5a66 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -846,7 +846,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1459,7 +1459,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index eb86d000b1..38d105c5d6 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -163,7 +163,8 @@ typedef struct
bool failover; /* is the slot designated to be synced to the
* physical standby? */
int inactive_timeout; /* The amount of time in seconds the slot
- * is allowed to be inactive. */
+ * is allowed to be inactive before
+ * getting invalidated. */
} LogicalSlotInfo;
typedef struct
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 5a812ef528..75b0bad083 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -248,7 +250,8 @@ extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, bool failover);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(void);
extern void ReplicationSlotSave(void);
@@ -267,6 +270,9 @@ extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
+extern bool InvalidateReplicationSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_locks,
+ bool persist_state);
extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock);
extern int ReplicationSlotIndex(ReplicationSlot *slot);
extern bool ReplicationSlotName(int index, Name name);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..da04dfc7fc
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,169 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot due to inactive_timeout
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoint during the test, otherwise, the test can get unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+$standby1->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb1_slot'
+});
+
+# Set timeout so that the slot when inactive will get invalidated after the
+# timeout.
+my $inactive_timeout = 5;
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot', inactive_timeout := $inactive_timeout);
+]);
+
+$standby1->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Check inactive_timeout is what we've set above
+my $result = $primary->safe_psql(
+ 'postgres', qq[
+ SELECT inactive_timeout = $inactive_timeout
+ FROM pg_replication_slots WHERE slot_name = 'sb1_slot';
+]);
+is($result, "t",
+ 'check the inactive replication slot info for an active slot');
+
+my $logstart = -s $primary->logfile;
+
+# Stop standby to make the replication slot on primary inactive
+$standby1->stop;
+
+# Wait for the inactive replication slot info to be updated
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = 'sb1_slot'
+ AND inactive_timeout = $inactive_timeout;
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+check_slots_invalidation_in_server_log($primary, 'sb1_slot', $logstart);
+
+# Wait for the inactive replication slots to be invalidated.
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sb1_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for inactive replication slot sb1_slot to be invalidated";
+
+# Testcase end: Invalidate streaming standby's slot due to inactive_timeout
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to inactive_timeout
+
+my $publisher = $primary;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$subscriber->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput', inactive_timeout := $inactive_timeout);
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+$result = $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the inactive replication slot info to be updated
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = 'lsub1_slot'
+ AND inactive_timeout = $inactive_timeout;
+])
+ or die
+ "Timed out while waiting for inactive replication slot info to be updated";
+
+check_slots_invalidation_in_server_log($publisher, 'lsub1_slot', $logstart);
+
+# Testcase end: Invalidate logical subscriber's slot due to inactive_timeout
+# =============================================================================
+
+# Check for invalidation of slot in server log.
+sub check_slots_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"", $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated, "check that slot $slot_name invalidation has been logged");
+}
+
+done_testing();
--
2.34.1
Hi,
On Tue, Mar 26, 2024 at 09:59:23PM +0530, Bharath Rupireddy wrote:
On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
If we just sync inactive_since value for synced slots while in
recovery from the primary, so be it. Why do we need to update it to
the current time when the slot is being created? We don't expose slot
creation time, no? Aren't we fine if we just sync the value from
primary and document that fact? After the promotion, we can reset it
to the current time so that it gets its own time.
I'm attaching v24 patches. It implements the above idea proposed
upthread for synced slots. I've now separated
s/last_inactive_time/inactive_since and synced slots behaviour. Please
have a look.
Thanks!
==== v24-0001
It's now pure mechanical changes and it looks good to me.
==== v24-0002
1 ===
This commit does two things:
1) Updates inactive_since for sync slots with the value
received from the primary's slot.
Tested it and it does that.
2 ===
2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.
Tested it and it does that.
3 ===
+/*
+ * Reset the synced slots info such as inactive_since after shutting
+ * down the slot sync machinery.
+ */
+static void
+update_synced_slots_inactive_time(void)
Looks like the comment "reset" is not matching the name of the function and
what it does.
4 ===
+ /*
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
+ */
+ if (now == 0)
+ now = GetCurrentTimestamp();
Also +1 of having GetCurrentTimestamp() just called one time within the loop.
5 ===
- if (!(RecoveryInProgress() && slot->data.synced))
+ if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;
Not related to this change but more the way RestoreSlotFromDisk() behaves here:
For a sync slot on standby it will be set to zero and then later will be
synchronized with the one coming from the primary. I think that's fine to have
it to zero for this window of time.
Now, if the standby is down and one sets sync_replication_slots to off,
then inactive_since will be set to zero on the standby at startup and not
synchronized (unless one triggers a manual sync). I also think that's fine but
it might be worth to document this behavior (that after a standby startup
inactive_since is zero until the next sync...).
6 ===
+ print "HI $slot_name $name $inactive_since $slot_creation_time\n";
garbage?
7 ===
+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
+{
+ my ($node, $slot_name, $slot_creation_time) = @_;
+ my $name = $node->name;
We now have capture_and_validate_slot_inactive_since in 2 places:
040_standby_failover_slots_sync.pl and 019_replslot_limit.pl.
Worth creating a sub in Cluster.pm?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Mar 26, 2024 at 9:59 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
If we just sync inactive_since value for synced slots while in
recovery from the primary, so be it. Why do we need to update it to
the current time when the slot is being created? We don't expose slot
creation time, no? Aren't we fine if we just sync the value from
primary and document that fact? After the promotion, we can reset it
to the current time so that it gets its own time.
I'm attaching v24 patches. It implements the above idea proposed
upthread for synced slots. I've now separated
s/last_inactive_time/inactive_since and synced slots behaviour. Please
have a look.
Thanks for the patches. A few trivial comments on v24-0002:
1)
slot.c:
+ * data from the remote slot. We use InRecovery flag instead of
+ * RecoveryInProgress() as it always returns true even for normal
+ * server startup.
a) Not clear what 'it' refers to. Better to use 'the latter'
b) Is it better to mention the primary here:
'as the latter always returns true even on the primary server during startup'.
2)
update_local_synced_slot():
- strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
+ strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 &&
+ remote_slot->inactive_since == slot->inactive_since)
When this code was written initially, the intent was to do strcmp at
the end (only if absolutely needed). It will be good if we maintain
the same and add new checks before strcmp.
3)
update_synced_slots_inactive_time():
This assert is removed, is it intentional?
Assert(s->active_pid == 0);
4)
040_standby_failover_slots_sync.pl:
+# Capture the inactive_since of the slot from the standby the logical failover
+# slots are synced/created on the standby.
The comment is unclear, something seems missing.
thanks
Shveta
On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
I'm attaching v24 patches. It implements the above idea proposed
upthread for synced slots.
==== v24-0002
1 ===
This commit does two things:
1) Updates inactive_since for sync slots with the value
received from the primary's slot.
Tested it and it does that.
Thanks. I've added a test case for this.
2 ===
2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.
Tested it and it does that.
Thanks. I've added a test case for this.
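(Roughly, that test boils down to a check like the one below on the promoted
standby, before the slot is re-used; this is only a sketch, and 'lsub1_slot'
is simply the slot name the TAP test happens to use:)
-- After promotion, the synced slot should carry a fresh inactive_since,
-- i.e. one newer than the value that was copied from the old primary.
SELECT inactive_since
FROM pg_replication_slots
WHERE slot_name = 'lsub1_slot' AND inactive_since IS NOT NULL;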
3 ===
+/*
+ * Reset the synced slots info such as inactive_since after shutting
+ * down the slot sync machinery.
+ */
+static void
+update_synced_slots_inactive_time(void)
Looks like the comment "reset" is not matching the name of the function and
what it does.
Changed. I've also changed the function name to
update_synced_slots_inactive_since to be precise on what it exactly
does.
4 ===
+ /*
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
+ */
+ if (now == 0)
+ now = GetCurrentTimestamp();
Also +1 of having GetCurrentTimestamp() just called one time within the loop.
Right.
5 ===
- if (!(RecoveryInProgress() && slot->data.synced))
+ if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;
Not related to this change but more the way RestoreSlotFromDisk() behaves here:
For a sync slot on standby it will be set to zero and then later will be
synchronized with the one coming from the primary. I think that's fine to have
it to zero for this window of time.
Right.
Now, if the standby is down and one sets sync_replication_slots to off,
then inactive_since will be set to zero on the standby at startup and not
synchronized (unless one triggers a manual sync). I also think that's fine but
it might be worth to document this behavior (that after a standby startup
inactive_since is zero until the next sync...).
Isn't this behaviour applicable for other slot parameters that the
slot syncs from the remote slot on the primary?
I've added the following note in the comments when we update
inactive_since in RestoreSlotFromDisk.
* Note that for synced slots after the standby starts up (i.e. after
* the slots are loaded from the disk), the inactive_since will remain
* zero until the next slot sync cycle.
*/
if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;
6 ===
+ print "HI $slot_name $name $inactive_since $slot_creation_time\n";
garbage?
Removed.
7 ===
+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
+{
+ my ($node, $slot_name, $slot_creation_time) = @_;
+ my $name = $node->name;
We now have capture_and_validate_slot_inactive_since in 2 places:
040_standby_failover_slots_sync.pl and 019_replslot_limit.pl.
Worth creating a sub in Cluster.pm?
I'd hold that thought for now. We might have to debate first if it's
useful for all the nodes even without replication, and if yes, the
naming stuff and all that. Historically, we've had such duplicated
functions until recently, for instance advance_wal and log_contains.
We
moved them over to a common perl library Cluster.pm very recently. I'm
sure we can come back later to move it to Cluster.pm.
On Wed, Mar 27, 2024 at 9:02 AM shveta malik <shveta.malik@gmail.com> wrote:
1)
slot.c:
+ * data from the remote slot. We use InRecovery flag instead of
+ * RecoveryInProgress() as it always returns true even for normal
+ * server startup.
a) Not clear what 'it' refers to. Better to use 'the latter'
b) Is it better to mention the primary here:
'as the latter always returns true even on the primary server during startup'.
Modified.
2)
update_local_synced_slot():
- strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
+ strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0 &&
+ remote_slot->inactive_since == slot->inactive_since)
When this code was written initially, the intent was to do strcmp at
the end (only if absolutely needed). It will be good if we maintain
the same and add new checks before strcmp.
Done.
3)
update_synced_slots_inactive_time():
This assert is removed, is it intentional?
Assert(s->active_pid == 0);
Yes, the slot can get acquired in the corner case when someone runs
pg_sync_replication_slots concurrently at this time. I'm referring to
the issue reported upthread. We don't prevent one running
pg_sync_replication_slots in promotion/ShutDownSlotSync phase right?
Maybe we should prevent that otherwise some of the slots are synced
and the standby gets promoted while others are yet-to-be-synced.
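(For context, the manual sync in question is the SQL-callable path; as a
minimal illustration, assuming a standby configured for slot synchronization,
it is simply:)
-- Manual slot sync on the standby; this can race with ShutDownSlotSync
-- during promotion, which is why the Assert is dropped for now.
SELECT pg_sync_replication_slots();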
4)
040_standby_failover_slots_sync.pl:
+# Capture the inactive_since of the slot from the standby the logical failover
+# slots are synced/created on the standby.
The comment is unclear, something seems missing.
Nice catch. Yes, that was wrong. I've modified it now.
Please find the attached v25-0001 (made this 0001 patch now as
inactive_since patch is committed) patch with the above changes.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v25-0001-Maintain-inactive_since-for-synced-slots-correct.patchapplication/x-patch; name=v25-0001-Maintain-inactive_since-for-synced-slots-correct.patchDownload
From 790f791b4200ff06cfdf55fdf1572436e9d982fd Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 27 Mar 2024 04:23:52 +0000
Subject: [PATCH v25] Maintain inactive_since for synced slots correctly.
The slot's inactive_since isn't currently maintained for
synced slots on the standby. The commit a11f330b55 prevents
updating inactive_since with RecoveryInProgress() check in
RestoreSlotFromDisk(). But, the issue is that
RecoveryInProgress() always returns true in
RestoreSlotFromDisk() as 'xlogctl->SharedRecoveryState' is
always 'RECOVERY_STATE_CRASH' at that time. The impact of this
on a promoted standby inactive_since is always NULL for all
synced slots even after server restart.
Above issue led us to a question as to why we can't just update
inactive_since for synced slots on the standby with the value
received from remote slot on the primary. This is consistent with
any other slot parameter i.e. all of them are synced from the
primary.
This commit does two things:
1) Updates inactive_since for sync slots with the value
received from the primary's slot.
2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACWLctoiH-pSjWnEpR54q4DED6rw_BRJm5pCx86_Y01MoQ%40mail.gmail.com
---
doc/src/sgml/system-views.sgml | 4 ++
src/backend/replication/logical/slotsync.c | 62 ++++++++++++++++-
src/backend/replication/slot.c | 41 ++++++++----
.../t/040_standby_failover_slots_sync.pl | 66 +++++++++++++++++++
4 files changed, 158 insertions(+), 15 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..7713f168e7 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2530,6 +2530,10 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
The time since the slot has become inactive.
<literal>NULL</literal> if the slot is currently being used.
+ Note that the slots that are being synced from a primary server
+ (whose <structfield>synced</structfield> field is true), will get the
+ <structfield>inactive_since</structfield> value from the corresponding
+ remote slot on the primary.
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..d367c9ed3c 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -137,9 +137,12 @@ typedef struct RemoteSlot
/* RS_INVAL_NONE if valid, or the reason of invalidation */
ReplicationSlotInvalidationCause invalidated;
+
+ TimestampTz inactive_since; /* copied from the remote slot */
} RemoteSlot;
static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_since(void);
/*
* If necessary, update the local synced slot's metadata based on the data
@@ -167,6 +170,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
remote_slot->two_phase == slot->data.two_phase &&
remote_slot->failover == slot->data.failover &&
remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
+ remote_slot->inactive_since == slot->inactive_since &&
strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
return false;
@@ -182,6 +186,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
slot->data.confirmed_flush = remote_slot->confirmed_lsn;
slot->data.catalog_xmin = remote_slot->catalog_xmin;
slot->effective_catalog_xmin = remote_slot->catalog_xmin;
+ slot->inactive_since = remote_slot->inactive_since;
SpinLockRelease(&slot->mutex);
if (xmin_changed)
@@ -652,9 +657,9 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
static bool
synchronize_slots(WalReceiverConn *wrconn)
{
-#define SLOTSYNC_COLUMN_COUNT 9
+#define SLOTSYNC_COLUMN_COUNT 10
Oid slotRow[SLOTSYNC_COLUMN_COUNT] = {TEXTOID, TEXTOID, LSNOID,
- LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID};
+ LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID, TIMESTAMPTZOID};
WalRcvExecResult *res;
TupleTableSlot *tupslot;
@@ -663,7 +668,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, invalidation_reason"
+ " database, invalidation_reason, inactive_since"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
@@ -743,6 +748,14 @@ synchronize_slots(WalReceiverConn *wrconn)
remote_slot->invalidated = isnull ? RS_INVAL_NONE :
GetSlotInvalidationCause(TextDatumGetCString(d));
+ /*
+ * It is possible to get null value for inactive_since if the slot is
+ * active on the primary server, so handle accordingly.
+ */
+ d = DatumGetTimestampTz(slot_getattr(tupslot, ++col,
+ &isnull));
+ remote_slot->inactive_since = isnull ? 0 : DatumGetLSN(d);
+
/* Sanity check */
Assert(col == SLOTSYNC_COLUMN_COUNT);
@@ -1296,6 +1309,46 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
Assert(false);
}
+/*
+ * Update the inactive_since property for synced slots.
+ */
+static void
+update_synced_slots_inactive_since(void)
+{
+ TimestampTz now = 0;
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is a synchronized slot */
+ if (s->in_use && s->data.synced)
+ {
+ Assert(SlotIsLogical(s));
+
+ /*
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
+ */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+
+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart.
+ */
+ SpinLockAcquire(&s->mutex);
+ s->inactive_since = now;
+ SpinLockRelease(&s->mutex);
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
+}
+
/*
* Shut down the slot sync worker.
*/
@@ -1309,6 +1362,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1341,6 +1395,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d778c0b921..5d6882e4db 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -42,6 +42,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlogrecovery.h"
+#include "access/xlogutils.h"
#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
@@ -655,6 +656,7 @@ ReplicationSlotRelease(void)
char *slotname = NULL; /* keep compiler quiet */
bool is_logical = false; /* keep compiler quiet */
TimestampTz now = 0;
+ bool update_inactive_since;
Assert(slot != NULL && slot->active_pid != 0);
@@ -690,13 +692,19 @@ ReplicationSlotRelease(void)
}
/*
- * Set the last inactive time after marking the slot inactive. We don't
- * set it for the slots currently being synced from the primary to the
- * standby because such slots are typically inactive as decoding is not
- * allowed on those.
+ * Set the time since the slot has become inactive.
+ *
+ * Note that we don't set it for the slots currently being synced from the
+ * primary to the standby, because such slots typically sync the data from
+ * the remote slot.
*/
if (!(RecoveryInProgress() && slot->data.synced))
+ {
now = GetCurrentTimestamp();
+ update_inactive_since = true;
+ }
+ else
+ update_inactive_since = false;
if (slot->data.persistency == RS_PERSISTENT)
{
@@ -706,11 +714,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ if (update_inactive_since)
+ slot->inactive_since = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
- else
+ else if (update_inactive_since)
{
SpinLockAcquire(&slot->mutex);
slot->inactive_since = now;
@@ -2369,13 +2378,21 @@ RestoreSlotFromDisk(const char *name)
slot->active_pid = 0;
/*
- * We set the last inactive time after loading the slot from the disk
- * into memory. Whoever acquires the slot i.e. makes the slot active
- * will reset it. We don't set it for the slots currently being synced
- * from the primary to the standby because such slots are typically
- * inactive as decoding is not allowed on those.
+ * Set the time since the slot has become inactive after loading the
+ * slot from the disk into memory. Whoever acquires the slot i.e.
+ * makes the slot active will reset it.
+ *
+ * Note that we don't set it for the slots currently being synced from
+ * the primary to the standby, because such slots typically sync the
+ * data from the remote slot. We use InRecovery flag instead of
+ * RecoveryInProgress() as the latter always returns true at this time
+ * even on primary.
+ *
+ * Note that for synced slots after the standby starts up (i.e. after
+ * the slots are loaded from the disk), the inactive_since will remain
+ * zero until the next slot sync cycle.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
+ if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index f47bfd78eb..58d5177bad 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -35,6 +35,13 @@ my $subscriber1 = PostgreSQL::Test::Cluster->new('subscriber1');
$subscriber1->init;
$subscriber1->start;
+# Capture the time before the logical failover slot is created on the
+# primary. We later call this publisher as primary anyway.
+my $slot_creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Create a slot on the publisher with failover disabled
$publisher->safe_psql('postgres',
"SELECT 'init' FROM pg_create_logical_replication_slot('lsub1_slot', 'pgoutput', false, false, false);"
@@ -174,6 +181,10 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary
+my $inactive_since_on_primary =
+ capture_and_validate_slot_inactive_since($primary, 'lsub1_slot', $slot_creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -190,6 +201,18 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Capture the inactive_since of the synced slot on the standby
+my $inactive_since_on_standby =
+ capture_and_validate_slot_inactive_since($standby1, 'lsub1_slot', $slot_creation_time_on_primary);
+
+# Synced slots on the standby must get the inactive_since from the primary.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz = '$inactive_since_on_standby'::timestamptz;"
+ ),
+ "t",
+ 'synchronized slot has got the inactive_since from the primary');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
@@ -750,8 +773,28 @@ $primary->reload;
$standby1->start;
$primary->wait_for_replay_catchup($standby1);
+# Capture the time before the standby is promoted
+my $promotion_time_on_primary = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
$standby1->promote;
+# Capture the inactive_since of the synced slot after the promotion.
+# Expectation here is that the slot gets its own inactive_since as part of the
+# promotion. We do this check before the slot is enabled on the new primary
+# below, otherwise the slot gets active setting inactive_since to NULL.
+my $inactive_since_on_new_primary =
+ capture_and_validate_slot_inactive_since($standby1, 'lsub1_slot', $promotion_time_on_primary);
+
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_new_primary'::timestamptz > '$inactive_since_on_primary'::timestamptz"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since on the new primary');
+
# Update subscription with the new primary's connection info
my $standby1_conninfo = $standby1->connstr . ' dbname=postgres';
$subscriber1->safe_psql('postgres',
@@ -773,4 +816,27 @@ is( $subscriber1->safe_psql('postgres', q{SELECT count(*) FROM tab_int;}),
"20",
'data replicated from the new primary');
+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
+{
+ my ($node, $slot_name, $reference_time) = @_;
+ my $name = $node->name;
+
+ my $inactive_since = $node->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
+ );
+
+ # Check that the captured time is sane
+ is( $node->safe_psql(
+ 'postgres',
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz >= '$reference_time'::timestamptz;]
+ ),
+ 't',
+ "last inactive time for slot $slot_name is sane on node $name");
+
+ return $inactive_since;
+}
+
done_testing();
--
2.34.1
On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
3)
update_synced_slots_inactive_time():
This assert is removed, is it intentional?
Assert(s->active_pid == 0);
Yes, the slot can get acquired in the corner case when someone runs
pg_sync_replication_slots concurrently at this time. I'm referring to
the issue reported upthread. We don't prevent one running
pg_sync_replication_slots in promotion/ShutDownSlotSync phase right?
Maybe we should prevent that otherwise some of the slots are synced
and the standby gets promoted while others are yet-to-be-synced.
We should do something about it but that shouldn't be done in this
patch. We can handle it separately and then add such an assert.
--
With Regards,
Amit Kapila.
On Wed, Mar 27, 2024 at 10:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
3)
update_synced_slots_inactive_time():
This assert is removed, is it intentional?
Assert(s->active_pid == 0);
Yes, the slot can get acquired in the corner case when someone runs
pg_sync_replication_slots concurrently at this time. I'm referring to
the issue reported upthread. We don't prevent one running
pg_sync_replication_slots in promotion/ShutDownSlotSync phase right?
Maybe we should prevent that otherwise some of the slots are synced
and the standby gets promoted while others are yet-to-be-synced.
We should do something about it but that shouldn't be done in this
patch. We can handle it separately and then add such an assert.
Agreed. Once this patch is concluded, I can fix the slot sync shutdown
issue and will also add this 'assert' back.
thanks
Shveta
On Wed, Mar 27, 2024 at 10:24 AM shveta malik <shveta.malik@gmail.com> wrote:
On Wed, Mar 27, 2024 at 10:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
3)
update_synced_slots_inactive_time():
This assert is removed, is it intentional?
Assert(s->active_pid == 0);
Yes, the slot can get acquired in the corner case when someone runs
pg_sync_replication_slots concurrently at this time. I'm referring to
the issue reported upthread. We don't prevent one running
pg_sync_replication_slots in promotion/ShutDownSlotSync phase right?
Maybe we should prevent that otherwise some of the slots are synced
and the standby gets promoted while others are yet-to-be-synced.
We should do something about it but that shouldn't be done in this
patch. We can handle it separately and then add such an assert.
Agreed. Once this patch is concluded, I can fix the slot sync shutdown
issue and will also add this 'assert' back.
Agreed. Thanks.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Mar 26, 2024 at 6:05 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
We can think on that later if we really need another
field which gives us sync time.
I think that calling GetCurrentTimestamp() so frequently could be too costly, so
I'm not sure we should.
Agreed.
In my second approach, I have tried to
avoid updating inactive_since for synced slots during sync process. We
update that field during creation of synced slot so that
inactive_since reflects correct info even for synced slots (rather
than copying from primary).
Yeah, and I think we could create a dedicated field with this information
if we feel the need.
Okay.
thanks
Shveta
On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Please find the attached v25-0001 (made this 0001 patch now as
inactive_since patch is committed) patch with the above changes.
Fixed an issue in synchronize_slots where DatumGetLSN is being used in
place of DatumGetTimestampTz. Found this via CF bot member [1], not on
my dev system.
Please find the attached v26 patch.
[1]:
[05:14:39.281] #7 DatumGetLSN (X=<optimized out>) at
../src/include/utils/pg_lsn.h:24
[05:14:39.281] No locals.
[05:14:39.281] #8 synchronize_slots (wrconn=wrconn@entry=0x583cd170)
at ../src/backend/replication/logical/slotsync.c:757
[05:14:39.281] isnull = false
[05:14:39.281] remote_slot = 0x583ce1a8
[05:14:39.281] d = <optimized out>
[05:14:39.281] col = 10
[05:14:39.281] slotRow = {25, 25, 3220, 3220, 28, 16, 16, 25, 25, 1184}
[05:14:39.281] res = 0x583cd1b8
[05:14:39.281] tupslot = 0x583ce11c
[05:14:39.281] remote_slot_list = 0x0
[05:14:39.281] some_slot_updated = false
[05:14:39.281] started_tx = false
[05:14:39.281] query = 0x57692bc4 "SELECT slot_name, plugin,
confirmed_flush_lsn, restart_lsn, catalog_xmin, two_phase, failover,
database, invalidation_reason, inactive_since FROM
pg_catalog.pg_replication_slots WHERE failover and NOT"...
[05:14:39.281] __func__ = "synchronize_slots"
[05:14:39.281] #9 0x56ff9d1e in SyncReplicationSlots
(wrconn=0x583cd170) at
../src/backend/replication/logical/slotsync.c:1504
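(For reference, the full query from the truncated string above, as it appears
in the patch; the trailing inactive_since column is a timestamptz, which is
why DatumGetTimestampTz() is the correct conversion:)
SELECT slot_name, plugin, confirmed_flush_lsn,
       restart_lsn, catalog_xmin, two_phase, failover,
       database, invalidation_reason, inactive_since
FROM pg_catalog.pg_replication_slots
WHERE failover AND NOT temporary;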
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v26-0001-Maintain-inactive_since-for-synced-slots-correct.patchapplication/x-patch; name=v26-0001-Maintain-inactive_since-for-synced-slots-correct.patchDownload
From 8c5f7d0064e0e01a2d458872ecc1d8e682ddc033 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 27 Mar 2024 05:29:18 +0000
Subject: [PATCH v26] Maintain inactive_since for synced slots correctly.
The slot's inactive_since isn't currently maintained for
synced slots on the standby. The commit a11f330b55 prevents
updating inactive_since with RecoveryInProgress() check in
RestoreSlotFromDisk(). But, the issue is that
RecoveryInProgress() always returns true in
RestoreSlotFromDisk() as 'xlogctl->SharedRecoveryState' is
always 'RECOVERY_STATE_CRASH' at that time. The impact of this
on a promoted standby inactive_since is always NULL for all
synced slots even after server restart.
Above issue led us to a question as to why we can't just update
inactive_since for synced slots on the standby with the value
received from remote slot on the primary. This is consistent with
any other slot parameter i.e. all of them are synced from the
primary.
This commit does two things:
1) Updates inactive_since for sync slots with the value
received from the primary's slot.
2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACWLctoiH-pSjWnEpR54q4DED6rw_BRJm5pCx86_Y01MoQ%40mail.gmail.com
---
doc/src/sgml/system-views.sgml | 4 ++
src/backend/replication/logical/slotsync.c | 61 ++++++++++++++++-
src/backend/replication/slot.c | 41 ++++++++----
.../t/040_standby_failover_slots_sync.pl | 66 +++++++++++++++++++
4 files changed, 157 insertions(+), 15 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..7713f168e7 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2530,6 +2530,10 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
The time since the slot has become inactive.
<literal>NULL</literal> if the slot is currently being used.
+ Note that the slots that are being synced from a primary server
+ (whose <structfield>synced</structfield> field is true), will get the
+ <structfield>inactive_since</structfield> value from the corresponding
+ remote slot on the primary.
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..9c95a4b062 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -137,9 +137,12 @@ typedef struct RemoteSlot
/* RS_INVAL_NONE if valid, or the reason of invalidation */
ReplicationSlotInvalidationCause invalidated;
+
+ TimestampTz inactive_since; /* copied from the remote slot */
} RemoteSlot;
static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_since(void);
/*
* If necessary, update the local synced slot's metadata based on the data
@@ -167,6 +170,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
remote_slot->two_phase == slot->data.two_phase &&
remote_slot->failover == slot->data.failover &&
remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
+ remote_slot->inactive_since == slot->inactive_since &&
strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
return false;
@@ -182,6 +186,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
slot->data.confirmed_flush = remote_slot->confirmed_lsn;
slot->data.catalog_xmin = remote_slot->catalog_xmin;
slot->effective_catalog_xmin = remote_slot->catalog_xmin;
+ slot->inactive_since = remote_slot->inactive_since;
SpinLockRelease(&slot->mutex);
if (xmin_changed)
@@ -652,9 +657,9 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
static bool
synchronize_slots(WalReceiverConn *wrconn)
{
-#define SLOTSYNC_COLUMN_COUNT 9
+#define SLOTSYNC_COLUMN_COUNT 10
Oid slotRow[SLOTSYNC_COLUMN_COUNT] = {TEXTOID, TEXTOID, LSNOID,
- LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID};
+ LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID, TIMESTAMPTZOID};
WalRcvExecResult *res;
TupleTableSlot *tupslot;
@@ -663,7 +668,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, invalidation_reason"
+ " database, invalidation_reason, inactive_since"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
@@ -743,6 +748,13 @@ synchronize_slots(WalReceiverConn *wrconn)
remote_slot->invalidated = isnull ? RS_INVAL_NONE :
GetSlotInvalidationCause(TextDatumGetCString(d));
+ /*
+ * It is possible to get null value for inactive_since if the slot is
+ * active on the primary server, so handle accordingly.
+ */
+ d = slot_getattr(tupslot, ++col, &isnull);
+ remote_slot->inactive_since = isnull ? 0 : DatumGetTimestampTz(d);
+
/* Sanity check */
Assert(col == SLOTSYNC_COLUMN_COUNT);
@@ -1296,6 +1308,46 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
Assert(false);
}
+/*
+ * Update the inactive_since property for synced slots.
+ */
+static void
+update_synced_slots_inactive_since(void)
+{
+ TimestampTz now = 0;
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is a synchronized slot */
+ if (s->in_use && s->data.synced)
+ {
+ Assert(SlotIsLogical(s));
+
+ /*
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
+ */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+
+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart.
+ */
+ SpinLockAcquire(&s->mutex);
+ s->inactive_since = now;
+ SpinLockRelease(&s->mutex);
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
+}
+
/*
* Shut down the slot sync worker.
*/
@@ -1309,6 +1361,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1341,6 +1394,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d778c0b921..5d6882e4db 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -42,6 +42,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlogrecovery.h"
+#include "access/xlogutils.h"
#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
@@ -655,6 +656,7 @@ ReplicationSlotRelease(void)
char *slotname = NULL; /* keep compiler quiet */
bool is_logical = false; /* keep compiler quiet */
TimestampTz now = 0;
+ bool update_inactive_since;
Assert(slot != NULL && slot->active_pid != 0);
@@ -690,13 +692,19 @@ ReplicationSlotRelease(void)
}
/*
- * Set the last inactive time after marking the slot inactive. We don't
- * set it for the slots currently being synced from the primary to the
- * standby because such slots are typically inactive as decoding is not
- * allowed on those.
+ * Set the time since the slot has become inactive.
+ *
+ * Note that we don't set it for the slots currently being synced from the
+ * primary to the standby, because such slots typically sync the data from
+ * the remote slot.
*/
if (!(RecoveryInProgress() && slot->data.synced))
+ {
now = GetCurrentTimestamp();
+ update_inactive_since = true;
+ }
+ else
+ update_inactive_since = false;
if (slot->data.persistency == RS_PERSISTENT)
{
@@ -706,11 +714,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ if (update_inactive_since)
+ slot->inactive_since = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
- else
+ else if (update_inactive_since)
{
SpinLockAcquire(&slot->mutex);
slot->inactive_since = now;
@@ -2369,13 +2378,21 @@ RestoreSlotFromDisk(const char *name)
slot->active_pid = 0;
/*
- * We set the last inactive time after loading the slot from the disk
- * into memory. Whoever acquires the slot i.e. makes the slot active
- * will reset it. We don't set it for the slots currently being synced
- * from the primary to the standby because such slots are typically
- * inactive as decoding is not allowed on those.
+ * Set the time since the slot has become inactive after loading the
+ * slot from the disk into memory. Whoever acquires the slot i.e.
+ * makes the slot active will reset it.
+ *
+ * Note that we don't set it for the slots currently being synced from
+ * the primary to the standby, because such slots typically sync the
+ * data from the remote slot. We use InRecovery flag instead of
+ * RecoveryInProgress() as the latter always returns true at this time
+ * even on primary.
+ *
+ * Note that for synced slots after the standby starts up (i.e. after
+ * the slots are loaded from the disk), the inactive_since will remain
+ * zero until the next slot sync cycle.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
+ if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index f47bfd78eb..58d5177bad 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -35,6 +35,13 @@ my $subscriber1 = PostgreSQL::Test::Cluster->new('subscriber1');
$subscriber1->init;
$subscriber1->start;
+# Capture the time before the logical failover slot is created on the
+# primary. We later call this publisher as primary anyway.
+my $slot_creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Create a slot on the publisher with failover disabled
$publisher->safe_psql('postgres',
"SELECT 'init' FROM pg_create_logical_replication_slot('lsub1_slot', 'pgoutput', false, false, false);"
@@ -174,6 +181,10 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary
+my $inactive_since_on_primary =
+ capture_and_validate_slot_inactive_since($primary, 'lsub1_slot', $slot_creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -190,6 +201,18 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Capture the inactive_since of the synced slot on the standby
+my $inactive_since_on_standby =
+ capture_and_validate_slot_inactive_since($standby1, 'lsub1_slot', $slot_creation_time_on_primary);
+
+# Synced slots on the standby must get the inactive_since from the primary.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz = '$inactive_since_on_standby'::timestamptz;"
+ ),
+ "t",
+ 'synchronized slot has got the inactive_since from the primary');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
@@ -750,8 +773,28 @@ $primary->reload;
$standby1->start;
$primary->wait_for_replay_catchup($standby1);
+# Capture the time before the standby is promoted
+my $promotion_time_on_primary = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
$standby1->promote;
+# Capture the inactive_since of the synced slot after the promotion.
+# Expectation here is that the slot gets its own inactive_since as part of the
+# promotion. We do this check before the slot is enabled on the new primary
+# below, otherwise the slot gets active setting inactive_since to NULL.
+my $inactive_since_on_new_primary =
+ capture_and_validate_slot_inactive_since($standby1, 'lsub1_slot', $promotion_time_on_primary);
+
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_new_primary'::timestamptz > '$inactive_since_on_primary'::timestamptz"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since on the new primary');
+
# Update subscription with the new primary's connection info
my $standby1_conninfo = $standby1->connstr . ' dbname=postgres';
$subscriber1->safe_psql('postgres',
@@ -773,4 +816,27 @@ is( $subscriber1->safe_psql('postgres', q{SELECT count(*) FROM tab_int;}),
"20",
'data replicated from the new primary');
+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
+{
+ my ($node, $slot_name, $reference_time) = @_;
+ my $name = $node->name;
+
+ my $inactive_since = $node->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
+ );
+
+ # Check that the captured time is sane
+ is( $node->safe_psql(
+ 'postgres',
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz >= '$reference_time'::timestamptz;]
+ ),
+ 't',
+ "last inactive time for slot $slot_name is sane on node $name");
+
+ return $inactive_since;
+}
+
done_testing();
--
2.34.1
Hi,
On Wed, Mar 27, 2024 at 10:08:33AM +0530, Bharath Rupireddy wrote:
On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
- if (!(RecoveryInProgress() && slot->data.synced))
+ if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;
Not related to this change but more the way RestoreSlotFromDisk() behaves here:
For a sync slot on standby it will be set to zero and then later will be
synchronized with the one coming from the primary. I think that's fine to have
it to zero for this window of time.
Right.
Now, if the standby is down and one sets sync_replication_slots to off,
then inactive_since will be set to zero on the standby at startup and not
synchronized (unless one triggers a manual sync). I also think that's fine but
it might be worth to document this behavior (that after a standby startup
inactive_since is zero until the next sync...).
Isn't this behaviour applicable for other slot parameters that the
slot syncs from the remote slot on the primary?
No, they are persisted on disk. If not, we'd not know where to resume the decoding
from on the standby in case primary is down and/or sync is off.
I've added the following note in the comments when we update
inactive_since in RestoreSlotFromDisk.
* Note that for synced slots after the standby starts up (i.e. after
* the slots are loaded from the disk), the inactive_since will remain
* zero until the next slot sync cycle.
*/
if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;
I think we should add some words in the doc too and also about what the meaning
of inactive_since on the standby is (as suggested by Shveta in [1]).
[1]: /messages/by-id/CAJpy0uDkTW+t1k3oPkaipFBzZePfFNB5DmiA==pxRGcAdpF=Pg@mail.gmail.com
7 ===
+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
+{
+ my ($node, $slot_name, $slot_creation_time) = @_;
+ my $name = $node->name;
We now have capture_and_validate_slot_inactive_since in 2 places:
040_standby_failover_slots_sync.pl and 019_replslot_limit.pl.
Worth creating a sub in Cluster.pm?
I'd hold that thought for now. We might have to debate first if it's
useful for all the nodes even without replication, and if yes, the
naming stuff and all that. Historically, we've had such duplicated
functions until recently, for instance advance_wal and log_contains.
We
moved them over to a common perl library Cluster.pm very recently. I'm
sure we can come back later to move it to Cluster.pm.
I thought that would be the right time not to introduce duplicated code.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 27, 2024 at 11:05 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Fixed an issue in synchronize_slots where DatumGetLSN is being used in
place of DatumGetTimestampTz. Found this via CF bot member [1], not on
my dev system.
Please find the attached v26 patch.
Thanks for the patch. Few trivial things:
----------
1)
system-views.sgml:
a) "Note that the slots" --> "Note that the slots on the standbys,"
--it is good to mention "standbys" as synced could be true on primary
as well (promoted standby)
b) If you plan to add more info which Bertrand suggested, then it will
be better to make a <note> section instead of using "Note"
2)
commit msg:
"The impact of this
on a promoted standby inactive_since is always NULL for all
synced slots even after server restart.
"
Sentence looks broken.
---------
Apart from the above trivial things, v26-001 looks good to me.
thanks
Shveta
On Wed, Mar 27, 2024 at 11:39 AM shveta malik <shveta.malik@gmail.com> wrote:
Thanks for the patch. Few trivial things:
Thanks for reviewing.
----------
1)
system-views.sgml:
a) "Note that the slots" --> "Note that the slots on the standbys,"
--it is good to mention "standbys" as synced could be true on primary
as well (promoted standby)
Done.
b) If you plan to add more info which Bertrand suggested, then it will
be better to make a <note> section instead of using "Note"
I added the note that Bertrand specified upthread. But, I couldn't
find an instance of adding <note> ... </note> within a table. Hence I went
with "Note that ...." statements just like any other notes in
system-views.sgml. pg_replication_slots in system-views.sgml renders as a
table, so having <note> ... </note> may not be a great idea.
2)
commit msg:"The impact of this
on a promoted standby inactive_since is always NULL for all
synced slots even after server restart.
"
Sentence looks broken.
---------
Reworded.
Apart from the above trivial things, v26-001 looks good to me.
Please check the attached v27 patch, which also addresses Bertrand's comment
on deduplicating the TAP function. I've now moved it to Cluster.pm.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v27-0001-Maintain-inactive_since-for-synced-slots-correct.patchapplication/x-patch; name=v27-0001-Maintain-inactive_since-for-synced-slots-correct.patchDownload
From b4f113ef1d3467383913d0f04cc672372133420d Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 27 Mar 2024 09:15:41 +0000
Subject: [PATCH v27] Maintain inactive_since for synced slots correctly.
The slot's inactive_since isn't currently maintained for
synced slots on the standby. The commit a11f330b55 prevents
updating inactive_since with RecoveryInProgress() check in
RestoreSlotFromDisk(). But, the issue is that
RecoveryInProgress() always returns true in
RestoreSlotFromDisk() as 'xlogctl->SharedRecoveryState' is
always 'RECOVERY_STATE_CRASH' at that time. Because of this,
inactive_since is always NULL on a promoted standby for all
synced slots even after server restart.
Above issue led us to a question as to why we can't just update
inactive_since for synced slots on the standby with the value
received from remote slot on the primary. This is consistent with
any other slot parameter i.e. all of them are synced from the
primary.
This commit does two things:
1) Updates inactive_since for sync slots with the value
received from the primary's slot.
2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACWLctoiH-pSjWnEpR54q4DED6rw_BRJm5pCx86_Y01MoQ%40mail.gmail.com
---
doc/src/sgml/system-views.sgml | 8 +++
src/backend/replication/logical/slotsync.c | 61 ++++++++++++++++++-
src/backend/replication/slot.c | 41 +++++++++----
src/test/perl/PostgreSQL/Test/Cluster.pm | 34 +++++++++++
src/test/recovery/t/019_replslot_limit.pl | 26 +-------
.../t/040_standby_failover_slots_sync.pl | 43 +++++++++++++
6 files changed, 174 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..07f6d177e8 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2530,6 +2530,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
The time since the slot has become inactive.
<literal>NULL</literal> if the slot is currently being used.
+ Note that the slots on the standbys that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>), will get the
+ <structfield>inactive_since</structfield> value from the
+ corresponding remote slot on the primary. Also, note that for the
+ synced slots on the standby, after the standby starts up (i.e. after
+ the slots are loaded from the disk), the inactive_since will remain
+ zero until the next slot sync cycle.
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..9c95a4b062 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -137,9 +137,12 @@ typedef struct RemoteSlot
/* RS_INVAL_NONE if valid, or the reason of invalidation */
ReplicationSlotInvalidationCause invalidated;
+
+ TimestampTz inactive_since; /* copied from the remote slot */
} RemoteSlot;
static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_since(void);
/*
* If necessary, update the local synced slot's metadata based on the data
@@ -167,6 +170,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
remote_slot->two_phase == slot->data.two_phase &&
remote_slot->failover == slot->data.failover &&
remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
+ remote_slot->inactive_since == slot->inactive_since &&
strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
return false;
@@ -182,6 +186,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
slot->data.confirmed_flush = remote_slot->confirmed_lsn;
slot->data.catalog_xmin = remote_slot->catalog_xmin;
slot->effective_catalog_xmin = remote_slot->catalog_xmin;
+ slot->inactive_since = remote_slot->inactive_since;
SpinLockRelease(&slot->mutex);
if (xmin_changed)
@@ -652,9 +657,9 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
static bool
synchronize_slots(WalReceiverConn *wrconn)
{
-#define SLOTSYNC_COLUMN_COUNT 9
+#define SLOTSYNC_COLUMN_COUNT 10
Oid slotRow[SLOTSYNC_COLUMN_COUNT] = {TEXTOID, TEXTOID, LSNOID,
- LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID};
+ LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID, TIMESTAMPTZOID};
WalRcvExecResult *res;
TupleTableSlot *tupslot;
@@ -663,7 +668,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, invalidation_reason"
+ " database, invalidation_reason, inactive_since"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
@@ -743,6 +748,13 @@ synchronize_slots(WalReceiverConn *wrconn)
remote_slot->invalidated = isnull ? RS_INVAL_NONE :
GetSlotInvalidationCause(TextDatumGetCString(d));
+ /*
+ * It is possible to get null value for inactive_since if the slot is
+ * active on the primary server, so handle accordingly.
+ */
+ d = slot_getattr(tupslot, ++col, &isnull);
+ remote_slot->inactive_since = isnull ? 0 : DatumGetTimestampTz(d);
+
/* Sanity check */
Assert(col == SLOTSYNC_COLUMN_COUNT);
@@ -1296,6 +1308,46 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
Assert(false);
}
+/*
+ * Update the inactive_since property for synced slots.
+ */
+static void
+update_synced_slots_inactive_since(void)
+{
+ TimestampTz now = 0;
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is a synchronized slot */
+ if (s->in_use && s->data.synced)
+ {
+ Assert(SlotIsLogical(s));
+
+ /*
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
+ */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+
+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart.
+ */
+ SpinLockAcquire(&s->mutex);
+ s->inactive_since = now;
+ SpinLockRelease(&s->mutex);
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
+}
+
/*
* Shut down the slot sync worker.
*/
@@ -1309,6 +1361,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1341,6 +1394,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d778c0b921..5d6882e4db 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -42,6 +42,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlogrecovery.h"
+#include "access/xlogutils.h"
#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
@@ -655,6 +656,7 @@ ReplicationSlotRelease(void)
char *slotname = NULL; /* keep compiler quiet */
bool is_logical = false; /* keep compiler quiet */
TimestampTz now = 0;
+ bool update_inactive_since;
Assert(slot != NULL && slot->active_pid != 0);
@@ -690,13 +692,19 @@ ReplicationSlotRelease(void)
}
/*
- * Set the last inactive time after marking the slot inactive. We don't
- * set it for the slots currently being synced from the primary to the
- * standby because such slots are typically inactive as decoding is not
- * allowed on those.
+ * Set the time since the slot has become inactive.
+ *
+ * Note that we don't set it for the slots currently being synced from the
+ * primary to the standby, because such slots typically sync the data from
+ * the remote slot.
*/
if (!(RecoveryInProgress() && slot->data.synced))
+ {
now = GetCurrentTimestamp();
+ update_inactive_since = true;
+ }
+ else
+ update_inactive_since = false;
if (slot->data.persistency == RS_PERSISTENT)
{
@@ -706,11 +714,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ if (update_inactive_since)
+ slot->inactive_since = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
- else
+ else if (update_inactive_since)
{
SpinLockAcquire(&slot->mutex);
slot->inactive_since = now;
@@ -2369,13 +2378,21 @@ RestoreSlotFromDisk(const char *name)
slot->active_pid = 0;
/*
- * We set the last inactive time after loading the slot from the disk
- * into memory. Whoever acquires the slot i.e. makes the slot active
- * will reset it. We don't set it for the slots currently being synced
- * from the primary to the standby because such slots are typically
- * inactive as decoding is not allowed on those.
+ * Set the time since the slot has become inactive after loading the
+ * slot from the disk into memory. Whoever acquires the slot i.e.
+ * makes the slot active will reset it.
+ *
+ * Note that we don't set it for the slots currently being synced from
+ * the primary to the standby, because such slots typically sync the
+ * data from the remote slot. We use InRecovery flag instead of
+ * RecoveryInProgress() as the latter always returns true at this time
+ * even on primary.
+ *
+ * Note that for synced slots after the standby starts up (i.e. after
+ * the slots are loaded from the disk), the inactive_since will remain
+ * zero until the next slot sync cycle.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
+ if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b08296605c..221fe93a8b 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3276,6 +3276,40 @@ sub create_logical_slot_on_standby
=pod
+=item $node->get_slot_inactive_since_value(self, slot_name, reference_time)
+
+Get inactive_since column value for a given replication slot validating it
+against given reference time.
+
+=cut
+
+sub get_slot_inactive_since_value
+{
+ my ($self, $slot_name, $reference_time) = @_;
+ my $name = $self->name;
+
+ my $inactive_since = $self->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
+ );
+
+ # Check that the captured time is sane
+ if (defined $reference_time)
+ {
+ is($self->safe_psql('postgres',
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz >= '$reference_time'::timestamptz;]
+ ),
+ 't',
+ "last inactive time for slot $slot_name is valid on node $name")
+ or die "could not validate captured inactive_since for slot $slot_name";
+ }
+
+ return $inactive_since;
+}
+
+=pod
+
=item $node->advance_wal(num)
Advance WAL of node by given number of segments.
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 3b9a306a8b..c8e5e5054d 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -443,7 +443,7 @@ $primary4->safe_psql(
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the standby below.
my $inactive_since =
- capture_and_validate_slot_inactive_since($primary4, $sb4_slot, $slot_creation_time);
+ $primary4->get_slot_inactive_since_value($sb4_slot, $slot_creation_time);
$standby4->start;
@@ -502,7 +502,7 @@ $publisher4->safe_psql('postgres',
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the subscriber below.
$inactive_since =
- capture_and_validate_slot_inactive_since($publisher4, $lsub4_slot, $slot_creation_time);
+ $publisher4->get_slot_inactive_since_value($lsub4_slot, $slot_creation_time);
$subscriber4->start;
$subscriber4->safe_psql('postgres',
@@ -540,26 +540,4 @@ is( $publisher4->safe_psql(
$publisher4->stop;
$subscriber4->stop;
-# Capture and validate inactive_since of a given slot.
-sub capture_and_validate_slot_inactive_since
-{
- my ($node, $slot_name, $slot_creation_time) = @_;
-
- my $inactive_since = $node->safe_psql('postgres',
- qq(SELECT inactive_since FROM pg_replication_slots
- WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
- );
-
- # Check that the captured time is sane
- is( $node->safe_psql(
- 'postgres',
- qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
- '$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
- ),
- 't',
- "last inactive time for an active slot $slot_name is sane");
-
- return $inactive_since;
-}
-
done_testing();
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index f47bfd78eb..33e3a8dcf0 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -35,6 +35,13 @@ my $subscriber1 = PostgreSQL::Test::Cluster->new('subscriber1');
$subscriber1->init;
$subscriber1->start;
+# Capture the time before the logical failover slot is created on the
+# primary. We later call this publisher as primary anyway.
+my $slot_creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Create a slot on the publisher with failover disabled
$publisher->safe_psql('postgres',
"SELECT 'init' FROM pg_create_logical_replication_slot('lsub1_slot', 'pgoutput', false, false, false);"
@@ -174,6 +181,10 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary
+my $inactive_since_on_primary =
+ $primary->get_slot_inactive_since_value('lsub1_slot', $slot_creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -190,6 +201,18 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Capture the inactive_since of the synced slot on the standby
+my $inactive_since_on_standby =
+ $standby1->get_slot_inactive_since_value('lsub1_slot', $slot_creation_time_on_primary);
+
+# Synced slots on the standby must get the inactive_since from the primary.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz = '$inactive_since_on_standby'::timestamptz;"
+ ),
+ "t",
+ 'synchronized slot has got the inactive_since from the primary');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
@@ -750,8 +773,28 @@ $primary->reload;
$standby1->start;
$primary->wait_for_replay_catchup($standby1);
+# Capture the time before the standby is promoted
+my $promotion_time_on_primary = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
$standby1->promote;
+# Capture the inactive_since of the synced slot after the promotion.
+# Expectation here is that the slot gets its own inactive_since as part of the
+# promotion. We do this check before the slot is enabled on the new primary
+# below, otherwise the slot gets active setting inactive_since to NULL.
+my $inactive_since_on_new_primary =
+ $standby1->get_slot_inactive_since_value('lsub1_slot', $promotion_time_on_primary);
+
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_new_primary'::timestamptz > '$inactive_since_on_primary'::timestamptz"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since on the new primary');
+
# Update subscription with the new primary's connection info
my $standby1_conninfo = $standby1->connstr . ' dbname=postgres';
$subscriber1->safe_psql('postgres',
--
2.34.1
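As an aside for anyone following along, the behaviour the patch's new TAP
helper and the 040 test assertions check can also be eyeballed by hand. A
minimal sketch, assuming a primary/standby pair with failover slot
synchronization configured and using 'lsub1_slot' (the slot name from the 040
test); this is only illustrative, the TAP tests remain the authoritative check:
-- On the primary: note inactive_since of the failover slot
SELECT slot_name, inactive_since
FROM pg_replication_slots
WHERE slot_name = 'lsub1_slot';
-- On the standby: the synced copy (synced = true) is expected to carry
-- the inactive_since value received from the primary with this patch
SELECT slot_name, synced, inactive_since
FROM pg_replication_slots
WHERE slot_name = 'lsub1_slot' AND synced;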
Hi,
On Wed, Mar 27, 2024 at 02:55:17PM +0530, Bharath Rupireddy wrote:
Please check the attached v27 patch which also has Bertrand's comment
on deduplicating the TAP function. I've now moved it to Cluster.pm.
Thanks!
1 ===
+ Note that the slots on the standbys that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>), will get the
+ <structfield>inactive_since</structfield> value from the
+ corresponding remote slot on the primary. Also, note that for the
+ synced slots on the standby, after the standby starts up (i.e. after
+ the slots are loaded from the disk), the inactive_since will remain
+ zero until the next slot sync cycle.
Not sure we should mention the "(i.e. after the slots are loaded from the disk)"
and also "cycle" (as that does not sound right in case of manual sync).
My proposal (in text) but feel free to reword it:
Note that the slots on the standbys that are being synced from a
primary server (whose synced field is true), will get the inactive_since value
from the corresponding remote slot on the primary. Also, after the standby starts
up, the inactive_since (for such synced slots) will remain zero until the next
synchronization.
2 ===
+=item $node->create_logical_slot_on_standby(self, primary, slot_name, dbname)
get_slot_inactive_since_value instead?
3 ===
+against given reference time.
s/given reference/optional given reference/?
Apart from the above, LGTM.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 27, 2024 at 2:55 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Mar 27, 2024 at 11:39 AM shveta malik <shveta.malik@gmail.com> wrote:
Thanks for the patch. Few trivial things:
Thanks for reviewing.
----------
1)
system-views.sgml:
a) "Note that the slots" --> "Note that the slots on the standbys,"
--it is good to mention "standbys" as synced could be true on primary
as well (promoted standby)
Done.
b) If you plan to add more info which Bertrand suggested, then it will
be better to make a <note> section instead of using "Note"
I added the note that Bertrand specified upthread. But I couldn't
find an instance of adding <note> ... </note> within a table. Hence I
went with "Note that ...." statements just like any other notes in
system-views.sgml. pg_replication_slots in system-views.sgml renders
as a table, so having <note> ... </note> there may not be a great idea.
2)
commit msg:
"The impact of this
on a promoted standby inactive_since is always NULL for all
synced slots even after server restart.
"
Sentence looks broken.
---------
Reworded.
Apart from the above trivial things, v26-001 looks good to me.
Please check the attached v27 patch which also has Bertrand's comment
on deduplicating the TAP function. I've now moved it to Cluster.pm.
Thanks for the patch. Regarding the doc, I have a few comments.
+ Note that the slots on the standbys that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>), will get the
+ <structfield>inactive_since</structfield> value from the
+ corresponding remote slot on the primary. Also, note that for the
+ synced slots on the standby, after the standby starts up (i.e. after
+ the slots are loaded from the disk), the inactive_since will remain
+ zero until the next slot sync cycle.
a) "inactive_since will remain zero"
Since it is user exposed info and the user finds it NULL in
pg_replication_slots, shall we mention NULL instead of 0?
b) Since we are referring to the sync cycle here, I feel it will be
good to give a link to that page.
+ zero until the next slot sync cycle (see
+ <xref linkend="logicaldecoding-replication-slots-synchronization"/> for
+ slot synchronization details).
thanks
Shveta
On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
1 ===
My proposal (in text) but feel free to reword it:
Note that the slots on the standbys that are being synced from a
primary server (whose synced field is true), will get the inactive_since value
from the corresponding remote slot on the primary. Also, after the standby starts
up, the inactive_since (for such synced slots) will remain zero until the next
synchronization.
WFM.
2 ===
+=item $node->create_logical_slot_on_standby(self, primary, slot_name, dbname)
get_slot_inactive_since_value instead?
Ugh. Changed.
3 ===
+against given reference time.
s/given reference/optional given reference/?
Done.
Apart from the above, LGTM.
Thanks for reviewing.
On Wed, Mar 27, 2024 at 3:43 PM shveta malik <shveta.malik@gmail.com> wrote:
Thanks for the patch. Regarding doc, I have few comments.
Thanks for reviewing.
a) "inactive_since will remain zero"
Since it is user exposed info and the user finds it NULL in
pg_replication_slots, shall we mention NULL instead of 0?
Right. Changed.
b) Since we are referring to the sync cycle here, I feel it will be
good to give a link to that page.
+ zero until the next slot sync cycle (see
+ <xref linkend="logicaldecoding-replication-slots-synchronization"/> for
+ slot synchronization details).
WFM.
Please see the attached v28 patch.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v28-0001-Maintain-inactive_since-for-synced-slots-correct.patchapplication/octet-stream; name=v28-0001-Maintain-inactive_since-for-synced-slots-correct.patchDownload
From 769ed57bad0cb154e9889c48141ec97ee18eb790 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 27 Mar 2024 11:57:13 +0000
Subject: [PATCH v28] Maintain inactive_since for synced slots correctly.
The slot's inactive_since isn't currently maintained for
synced slots on the standby. The commit a11f330b55 prevents
updating inactive_since with RecoveryInProgress() check in
RestoreSlotFromDisk(). But, the issue is that
RecoveryInProgress() always returns true in
RestoreSlotFromDisk() as 'xlogctl->SharedRecoveryState' is
always 'RECOVERY_STATE_CRASH' at that time. Because of this,
inactive_since is always NULL on a promoted standby for all
synced slots even after server restart.
Above issue led us to a question as to why we can't just update
inactive_since for synced slots on the standby with the value
received from remote slot on the primary. This is consistent with
any other slot parameter i.e. all of them are synced from the
primary.
This commit does two things:
1) Updates inactive_since for sync slots with the value
received from the primary's slot.
2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACWLctoiH-pSjWnEpR54q4DED6rw_BRJm5pCx86_Y01MoQ%40mail.gmail.com
---
doc/src/sgml/system-views.sgml | 9 +++
src/backend/replication/logical/slotsync.c | 61 ++++++++++++++++++-
src/backend/replication/slot.c | 41 +++++++++----
src/test/perl/PostgreSQL/Test/Cluster.pm | 34 +++++++++++
src/test/recovery/t/019_replslot_limit.pl | 26 +-------
.../t/040_standby_failover_slots_sync.pl | 43 +++++++++++++
6 files changed, 175 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..c8d97ab375 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2530,6 +2530,15 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
The time since the slot has become inactive.
<literal>NULL</literal> if the slot is currently being used.
+ Note that the slots on the standbys that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>), will get the
+ <structfield>inactive_since</structfield> value from the
+ corresponding remote slot on the primary. Also, after the standby
+ starts up, the <structfield>inactive_since</structfield> value
+ (for such synced slots) will remain <literal>NULL</literal> until
+ the next synchronization (see
+ <xref linkend="logicaldecoding-replication-slots-synchronization"/>).
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..9c95a4b062 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -137,9 +137,12 @@ typedef struct RemoteSlot
/* RS_INVAL_NONE if valid, or the reason of invalidation */
ReplicationSlotInvalidationCause invalidated;
+
+ TimestampTz inactive_since; /* latest inactive time of the remote slot */
} RemoteSlot;
static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_since(void);
/*
* If necessary, update the local synced slot's metadata based on the data
@@ -167,6 +170,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
remote_slot->two_phase == slot->data.two_phase &&
remote_slot->failover == slot->data.failover &&
remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
+ remote_slot->inactive_since == slot->inactive_since &&
strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
return false;
@@ -182,6 +186,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
slot->data.confirmed_flush = remote_slot->confirmed_lsn;
slot->data.catalog_xmin = remote_slot->catalog_xmin;
slot->effective_catalog_xmin = remote_slot->catalog_xmin;
+ slot->inactive_since = remote_slot->inactive_since;
SpinLockRelease(&slot->mutex);
if (xmin_changed)
@@ -652,9 +657,9 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
static bool
synchronize_slots(WalReceiverConn *wrconn)
{
-#define SLOTSYNC_COLUMN_COUNT 9
+#define SLOTSYNC_COLUMN_COUNT 10
Oid slotRow[SLOTSYNC_COLUMN_COUNT] = {TEXTOID, TEXTOID, LSNOID,
- LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID};
+ LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID, TIMESTAMPTZOID};
WalRcvExecResult *res;
TupleTableSlot *tupslot;
@@ -663,7 +668,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, invalidation_reason"
+ " database, invalidation_reason, inactive_since"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
@@ -743,6 +748,13 @@ synchronize_slots(WalReceiverConn *wrconn)
remote_slot->invalidated = isnull ? RS_INVAL_NONE :
GetSlotInvalidationCause(TextDatumGetCString(d));
+ /*
+ * It is possible to get null value for inactive_since if the slot is
+ * active on the primary server, so handle accordingly.
+ */
+ d = slot_getattr(tupslot, ++col, &isnull);
+ remote_slot->inactive_since = isnull ? 0 : DatumGetTimestampTz(d);
+
/* Sanity check */
Assert(col == SLOTSYNC_COLUMN_COUNT);
@@ -1296,6 +1308,46 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
Assert(false);
}
+/*
+ * Update the inactive_since property for synced slots.
+ */
+static void
+update_synced_slots_inactive_since(void)
+{
+ TimestampTz now = 0;
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is a synchronized slot */
+ if (s->in_use && s->data.synced)
+ {
+ Assert(SlotIsLogical(s));
+
+ /*
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
+ */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+
+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart.
+ */
+ SpinLockAcquire(&s->mutex);
+ s->inactive_since = now;
+ SpinLockRelease(&s->mutex);
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
+}
+
/*
* Shut down the slot sync worker.
*/
@@ -1309,6 +1361,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1341,6 +1394,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d778c0b921..5d6882e4db 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -42,6 +42,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlogrecovery.h"
+#include "access/xlogutils.h"
#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
@@ -655,6 +656,7 @@ ReplicationSlotRelease(void)
char *slotname = NULL; /* keep compiler quiet */
bool is_logical = false; /* keep compiler quiet */
TimestampTz now = 0;
+ bool update_inactive_since;
Assert(slot != NULL && slot->active_pid != 0);
@@ -690,13 +692,19 @@ ReplicationSlotRelease(void)
}
/*
- * Set the last inactive time after marking the slot inactive. We don't
- * set it for the slots currently being synced from the primary to the
- * standby because such slots are typically inactive as decoding is not
- * allowed on those.
+ * Set the time since the slot has become inactive.
+ *
+ * Note that we don't set it for the slots currently being synced from the
+ * primary to the standby, because such slots typically sync the data from
+ * the remote slot.
*/
if (!(RecoveryInProgress() && slot->data.synced))
+ {
now = GetCurrentTimestamp();
+ update_inactive_since = true;
+ }
+ else
+ update_inactive_since = false;
if (slot->data.persistency == RS_PERSISTENT)
{
@@ -706,11 +714,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ if (update_inactive_since)
+ slot->inactive_since = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
- else
+ else if (update_inactive_since)
{
SpinLockAcquire(&slot->mutex);
slot->inactive_since = now;
@@ -2369,13 +2378,21 @@ RestoreSlotFromDisk(const char *name)
slot->active_pid = 0;
/*
- * We set the last inactive time after loading the slot from the disk
- * into memory. Whoever acquires the slot i.e. makes the slot active
- * will reset it. We don't set it for the slots currently being synced
- * from the primary to the standby because such slots are typically
- * inactive as decoding is not allowed on those.
+ * Set the time since the slot has become inactive after loading the
+ * slot from the disk into memory. Whoever acquires the slot i.e.
+ * makes the slot active will reset it.
+ *
+ * Note that we don't set it for the slots currently being synced from
+ * the primary to the standby, because such slots typically sync the
+ * data from the remote slot. We use InRecovery flag instead of
+ * RecoveryInProgress() as the latter always returns true at this time
+ * even on primary.
+ *
+ * Note that for synced slots after the standby starts up (i.e. after
+ * the slots are loaded from the disk), the inactive_since will remain
+ * zero until the next slot sync cycle.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
+ if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b08296605c..236b2125e4 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3276,6 +3276,40 @@ sub create_logical_slot_on_standby
=pod
+=item $node->get_slot_inactive_since_value(self, primary, slot_name, dbname)
+
+Get inactive_since column value for a given replication slot validating it
+against optional reference time.
+
+=cut
+
+sub get_slot_inactive_since_value
+{
+ my ($self, $slot_name, $reference_time) = @_;
+ my $name = $self->name;
+
+ my $inactive_since = $self->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
+ );
+
+ # Check that the captured time is sane
+ if (defined $reference_time)
+ {
+ is($self->safe_psql('postgres',
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz >= '$reference_time'::timestamptz;]
+ ),
+ 't',
+ "last inactive time for slot $slot_name is valid on node $name")
+ or die "could not validate captured inactive_since for slot $slot_name";
+ }
+
+ return $inactive_since;
+}
+
+=pod
+
=item $node->advance_wal(num)
Advance WAL of node by given number of segments.
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 3b9a306a8b..c8e5e5054d 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -443,7 +443,7 @@ $primary4->safe_psql(
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the standby below.
my $inactive_since =
- capture_and_validate_slot_inactive_since($primary4, $sb4_slot, $slot_creation_time);
+ $primary4->get_slot_inactive_since_value($sb4_slot, $slot_creation_time);
$standby4->start;
@@ -502,7 +502,7 @@ $publisher4->safe_psql('postgres',
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the subscriber below.
$inactive_since =
- capture_and_validate_slot_inactive_since($publisher4, $lsub4_slot, $slot_creation_time);
+ $publisher4->get_slot_inactive_since_value($lsub4_slot, $slot_creation_time);
$subscriber4->start;
$subscriber4->safe_psql('postgres',
@@ -540,26 +540,4 @@ is( $publisher4->safe_psql(
$publisher4->stop;
$subscriber4->stop;
-# Capture and validate inactive_since of a given slot.
-sub capture_and_validate_slot_inactive_since
-{
- my ($node, $slot_name, $slot_creation_time) = @_;
-
- my $inactive_since = $node->safe_psql('postgres',
- qq(SELECT inactive_since FROM pg_replication_slots
- WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
- );
-
- # Check that the captured time is sane
- is( $node->safe_psql(
- 'postgres',
- qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
- '$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
- ),
- 't',
- "last inactive time for an active slot $slot_name is sane");
-
- return $inactive_since;
-}
-
done_testing();
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index f47bfd78eb..33e3a8dcf0 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -35,6 +35,13 @@ my $subscriber1 = PostgreSQL::Test::Cluster->new('subscriber1');
$subscriber1->init;
$subscriber1->start;
+# Capture the time before the logical failover slot is created on the
+# primary. We later call this publisher as primary anyway.
+my $slot_creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Create a slot on the publisher with failover disabled
$publisher->safe_psql('postgres',
"SELECT 'init' FROM pg_create_logical_replication_slot('lsub1_slot', 'pgoutput', false, false, false);"
@@ -174,6 +181,10 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary
+my $inactive_since_on_primary =
+ $primary->get_slot_inactive_since_value('lsub1_slot', $slot_creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -190,6 +201,18 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Capture the inactive_since of the synced slot on the standby
+my $inactive_since_on_standby =
+ $standby1->get_slot_inactive_since_value('lsub1_slot', $slot_creation_time_on_primary);
+
+# Synced slots on the standby must get the inactive_since from the primary.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz = '$inactive_since_on_standby'::timestamptz;"
+ ),
+ "t",
+ 'synchronized slot has got the inactive_since from the primary');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
@@ -750,8 +773,28 @@ $primary->reload;
$standby1->start;
$primary->wait_for_replay_catchup($standby1);
+# Capture the time before the standby is promoted
+my $promotion_time_on_primary = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
$standby1->promote;
+# Capture the inactive_since of the synced slot after the promotion.
+# Expectation here is that the slot gets its own inactive_since as part of the
+# promotion. We do this check before the slot is enabled on the new primary
+# below, otherwise the slot gets active setting inactive_since to NULL.
+my $inactive_since_on_new_primary =
+ $standby1->get_slot_inactive_since_value('lsub1_slot', $promotion_time_on_primary);
+
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_new_primary'::timestamptz > '$inactive_since_on_primary'::timestamptz"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since on the new primary');
+
# Update subscription with the new primary's connection info
my $standby1_conninfo = $standby1->connstr . ' dbname=postgres';
$subscriber1->safe_psql('postgres',
--
2.34.1
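For the promotion scenario exercised at the end of the 040 test above, a
hand-run equivalent might look like the sketch below (again using the test's
'lsub1_slot'; assumes the standby is promoted without a restart, so the slot
sync shutdown path stamps a fresh timestamp):
-- On the standby, before promotion: note the synced value
SELECT inactive_since FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';
-- On the new primary, after promotion but before the slot is reused, run the
-- same query again: the value should be newer than the one noted above
SELECT inactive_since FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';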
Hi,
On Wed, Mar 27, 2024 at 05:55:05PM +0530, Bharath Rupireddy wrote:
On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot
Please see the attached v28 patch.
Thanks!
1 === sorry I missed it in the previous review
if (!(RecoveryInProgress() && slot->data.synced))
+ {
now = GetCurrentTimestamp();
+ update_inactive_since = true;
+ }
+ else
+ update_inactive_since = false;
I think update_inactive_since is not needed, we could rely on (now > 0) instead.
2 ===
+=item $node->get_slot_inactive_since_value(self, primary, slot_name, dbname)
+
+Get inactive_since column value for a given replication slot validating it
+against optional reference time.
+
+=cut
+
+sub get_slot_inactive_since_value
+{
shouldn't be "=item $node->get_slot_inactive_since_value(self, slot_name, reference_time)"
instead?
Apart from the above, LGTM.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 27, 2024 at 6:54 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hi,
On Wed, Mar 27, 2024 at 05:55:05PM +0530, Bharath Rupireddy wrote:
On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot
Please see the attached v28 patch.
Thanks!
1 === sorry I missed it in the previous review
if (!(RecoveryInProgress() && slot->data.synced))
+ {
now = GetCurrentTimestamp();
+ update_inactive_since = true;
+ }
+ else
+ update_inactive_since = false;
I think update_inactive_since is not needed, we could rely on (now > 0) instead.
I thought of using that, but only at the expense of readability, so I
prefer to use a variable instead. However, I changed the variable name
to the more meaningful is_slot_being_synced.
2 ===
+=item $node->get_slot_inactive_since_value(self, primary, slot_name, dbname)
+
+Get inactive_since column value for a given replication slot validating it
+against optional reference time.
+
+=cut
+
+sub get_slot_inactive_since_value
+{
shouldn't be "=item $node->get_slot_inactive_since_value(self, slot_name, reference_time)"
instead?
Ugh. Changed.
Apart from the above, LGTM.
Thanks. I'm attaching the v29 patches: 0001 maintains inactive_since
on the standby for synced slots, and 0002 implements the inactive
timeout GUC based invalidation mechanism.
Please have a look.
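For reference, a rough sketch of how 0002 is expected to be exercised once
applied (the exact steps are illustrative; the new 050_invalidate_slots.pl
TAP test in the patch is the authoritative reference):
-- The GUC can only be set in postgresql.conf or on the command line,
-- so ALTER SYSTEM plus a reload works; without units the value is seconds
ALTER SYSTEM SET replication_slot_inactive_timeout = '86400';
SELECT pg_reload_conf();
-- Once a slot has been inactive past the timeout, a checkpoint (or an
-- attempt to acquire the slot) marks it invalidated
SELECT slot_name, inactive_since, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason = 'inactive_timeout';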
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v29-0001-Maintain-inactive_since-for-synced-slots-correct.patchapplication/octet-stream; name=v29-0001-Maintain-inactive_since-for-synced-slots-correct.patchDownload
From 5012d35b4e4d0631a429df7411185711cdf948fa Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 27 Mar 2024 14:21:01 +0000
Subject: [PATCH v29 1/2] Maintain inactive_since for synced slots correctly.
The slot's inactive_since isn't currently maintained for
synced slots on the standby. The commit a11f330b55 prevents
updating inactive_since with RecoveryInProgress() check in
RestoreSlotFromDisk(). But, the issue is that
RecoveryInProgress() always returns true in
RestoreSlotFromDisk() as 'xlogctl->SharedRecoveryState' is
always 'RECOVERY_STATE_CRASH' at that time. Because of this,
inactive_since is always NULL on a promoted standby for all
synced slots even after server restart.
Above issue led us to a question as to why we can't just update
inactive_since for synced slots on the standby with the value
received from remote slot on the primary. This is consistent with
any other slot parameter i.e. all of them are synced from the
primary.
This commit does two things:
1) Updates inactive_since for sync slots with the value
received from the primary's slot.
2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACWLctoiH-pSjWnEpR54q4DED6rw_BRJm5pCx86_Y01MoQ%40mail.gmail.com
---
doc/src/sgml/system-views.sgml | 9 +++
src/backend/replication/logical/slotsync.c | 61 ++++++++++++++++++-
src/backend/replication/slot.c | 40 ++++++++----
src/test/perl/PostgreSQL/Test/Cluster.pm | 34 +++++++++++
src/test/recovery/t/019_replslot_limit.pl | 26 +-------
.../t/040_standby_failover_slots_sync.pl | 43 +++++++++++++
6 files changed, 173 insertions(+), 40 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..c8d97ab375 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2530,6 +2530,15 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
The time since the slot has become inactive.
<literal>NULL</literal> if the slot is currently being used.
+ Note that the slots on the standbys that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>), will get the
+ <structfield>inactive_since</structfield> value from the
+ corresponding remote slot on the primary. Also, after the standby
+ starts up, the <structfield>inactive_since</structfield> value
+ (for such synced slots) will remain <literal>NULL</literal> until
+ the next synchronization (see
+ <xref linkend="logicaldecoding-replication-slots-synchronization"/>).
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..9c95a4b062 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -137,9 +137,12 @@ typedef struct RemoteSlot
/* RS_INVAL_NONE if valid, or the reason of invalidation */
ReplicationSlotInvalidationCause invalidated;
+
+ TimestampTz inactive_since; /* latest inactive time of the remote slot */
} RemoteSlot;
static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_since(void);
/*
* If necessary, update the local synced slot's metadata based on the data
@@ -167,6 +170,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
remote_slot->two_phase == slot->data.two_phase &&
remote_slot->failover == slot->data.failover &&
remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
+ remote_slot->inactive_since == slot->inactive_since &&
strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
return false;
@@ -182,6 +186,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
slot->data.confirmed_flush = remote_slot->confirmed_lsn;
slot->data.catalog_xmin = remote_slot->catalog_xmin;
slot->effective_catalog_xmin = remote_slot->catalog_xmin;
+ slot->inactive_since = remote_slot->inactive_since;
SpinLockRelease(&slot->mutex);
if (xmin_changed)
@@ -652,9 +657,9 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
static bool
synchronize_slots(WalReceiverConn *wrconn)
{
-#define SLOTSYNC_COLUMN_COUNT 9
+#define SLOTSYNC_COLUMN_COUNT 10
Oid slotRow[SLOTSYNC_COLUMN_COUNT] = {TEXTOID, TEXTOID, LSNOID,
- LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID};
+ LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID, TIMESTAMPTZOID};
WalRcvExecResult *res;
TupleTableSlot *tupslot;
@@ -663,7 +668,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, invalidation_reason"
+ " database, invalidation_reason, inactive_since"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
@@ -743,6 +748,13 @@ synchronize_slots(WalReceiverConn *wrconn)
remote_slot->invalidated = isnull ? RS_INVAL_NONE :
GetSlotInvalidationCause(TextDatumGetCString(d));
+ /*
+ * It is possible to get null value for inactive_since if the slot is
+ * active on the primary server, so handle accordingly.
+ */
+ d = slot_getattr(tupslot, ++col, &isnull);
+ remote_slot->inactive_since = isnull ? 0 : DatumGetTimestampTz(d);
+
/* Sanity check */
Assert(col == SLOTSYNC_COLUMN_COUNT);
@@ -1296,6 +1308,46 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
Assert(false);
}
+/*
+ * Update the inactive_since property for synced slots.
+ */
+static void
+update_synced_slots_inactive_since(void)
+{
+ TimestampTz now = 0;
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is a synchronized slot */
+ if (s->in_use && s->data.synced)
+ {
+ Assert(SlotIsLogical(s));
+
+ /*
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
+ */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+
+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart.
+ */
+ SpinLockAcquire(&s->mutex);
+ s->inactive_since = now;
+ SpinLockRelease(&s->mutex);
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
+}
+
/*
* Shut down the slot sync worker.
*/
@@ -1309,6 +1361,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1341,6 +1394,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d778c0b921..7dbb44b7b0 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -42,6 +42,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlogrecovery.h"
+#include "access/xlogutils.h"
#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
@@ -655,6 +656,7 @@ ReplicationSlotRelease(void)
char *slotname = NULL; /* keep compiler quiet */
bool is_logical = false; /* keep compiler quiet */
TimestampTz now = 0;
+ bool is_slot_being_synced = false;
Assert(slot != NULL && slot->active_pid != 0);
@@ -690,12 +692,15 @@ ReplicationSlotRelease(void)
}
/*
- * Set the last inactive time after marking the slot inactive. We don't
- * set it for the slots currently being synced from the primary to the
- * standby because such slots are typically inactive as decoding is not
- * allowed on those.
+ * Set the time since the slot has become inactive.
+ *
+ * Note that we don't set it for the slots currently being synced from the
+ * primary to the standby, because such slots typically sync the data from
+ * the remote slot.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
+ if (RecoveryInProgress() && slot->data.synced)
+ is_slot_being_synced = true;
+ else
now = GetCurrentTimestamp();
if (slot->data.persistency == RS_PERSISTENT)
@@ -706,11 +711,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ if (!is_slot_being_synced)
+ slot->inactive_since = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
- else
+ else if (!is_slot_being_synced)
{
SpinLockAcquire(&slot->mutex);
slot->inactive_since = now;
@@ -2369,13 +2375,21 @@ RestoreSlotFromDisk(const char *name)
slot->active_pid = 0;
/*
- * We set the last inactive time after loading the slot from the disk
- * into memory. Whoever acquires the slot i.e. makes the slot active
- * will reset it. We don't set it for the slots currently being synced
- * from the primary to the standby because such slots are typically
- * inactive as decoding is not allowed on those.
+ * Set the time since the slot has become inactive after loading the
+ * slot from the disk into memory. Whoever acquires the slot i.e.
+ * makes the slot active will reset it.
+ *
+ * Note that we don't set it for the slots currently being synced from
+ * the primary to the standby, because such slots typically sync the
+ * data from the remote slot. We use InRecovery flag instead of
+ * RecoveryInProgress() as the latter always returns true at this time
+ * even on primary.
+ *
+ * Note that for synced slots after the standby starts up (i.e. after
+ * the slots are loaded from the disk), the inactive_since will remain
+ * zero until the next slot sync cycle.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
+ if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b08296605c..ddfc3236f3 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3276,6 +3276,40 @@ sub create_logical_slot_on_standby
=pod
+=item $node->get_slot_inactive_since_value(self, slot_name, reference_time)
+
+Get inactive_since column value for a given replication slot validating it
+against optional reference time.
+
+=cut
+
+sub get_slot_inactive_since_value
+{
+ my ($self, $slot_name, $reference_time) = @_;
+ my $name = $self->name;
+
+ my $inactive_since = $self->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
+ );
+
+ # Check that the captured time is sane
+ if (defined $reference_time)
+ {
+ is($self->safe_psql('postgres',
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz >= '$reference_time'::timestamptz;]
+ ),
+ 't',
+ "last inactive time for slot $slot_name is valid on node $name")
+ or die "could not validate captured inactive_since for slot $slot_name";
+ }
+
+ return $inactive_since;
+}
+
+=pod
+
=item $node->advance_wal(num)
Advance WAL of node by given number of segments.
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 3b9a306a8b..c8e5e5054d 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -443,7 +443,7 @@ $primary4->safe_psql(
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the standby below.
my $inactive_since =
- capture_and_validate_slot_inactive_since($primary4, $sb4_slot, $slot_creation_time);
+ $primary4->get_slot_inactive_since_value($sb4_slot, $slot_creation_time);
$standby4->start;
@@ -502,7 +502,7 @@ $publisher4->safe_psql('postgres',
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the subscriber below.
$inactive_since =
- capture_and_validate_slot_inactive_since($publisher4, $lsub4_slot, $slot_creation_time);
+ $publisher4->get_slot_inactive_since_value($lsub4_slot, $slot_creation_time);
$subscriber4->start;
$subscriber4->safe_psql('postgres',
@@ -540,26 +540,4 @@ is( $publisher4->safe_psql(
$publisher4->stop;
$subscriber4->stop;
-# Capture and validate inactive_since of a given slot.
-sub capture_and_validate_slot_inactive_since
-{
- my ($node, $slot_name, $slot_creation_time) = @_;
-
- my $inactive_since = $node->safe_psql('postgres',
- qq(SELECT inactive_since FROM pg_replication_slots
- WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
- );
-
- # Check that the captured time is sane
- is( $node->safe_psql(
- 'postgres',
- qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
- '$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
- ),
- 't',
- "last inactive time for an active slot $slot_name is sane");
-
- return $inactive_since;
-}
-
done_testing();
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index f47bfd78eb..33e3a8dcf0 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -35,6 +35,13 @@ my $subscriber1 = PostgreSQL::Test::Cluster->new('subscriber1');
$subscriber1->init;
$subscriber1->start;
+# Capture the time before the logical failover slot is created on the
+# primary. We later call this publisher as primary anyway.
+my $slot_creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Create a slot on the publisher with failover disabled
$publisher->safe_psql('postgres',
"SELECT 'init' FROM pg_create_logical_replication_slot('lsub1_slot', 'pgoutput', false, false, false);"
@@ -174,6 +181,10 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary
+my $inactive_since_on_primary =
+ $primary->get_slot_inactive_since_value('lsub1_slot', $slot_creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -190,6 +201,18 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Capture the inactive_since of the synced slot on the standby
+my $inactive_since_on_standby =
+ $standby1->get_slot_inactive_since_value('lsub1_slot', $slot_creation_time_on_primary);
+
+# Synced slots on the standby must get the inactive_since from the primary.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz = '$inactive_since_on_standby'::timestamptz;"
+ ),
+ "t",
+ 'synchronized slot has got the inactive_since from the primary');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
@@ -750,8 +773,28 @@ $primary->reload;
$standby1->start;
$primary->wait_for_replay_catchup($standby1);
+# Capture the time before the standby is promoted
+my $promotion_time_on_primary = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
$standby1->promote;
+# Capture the inactive_since of the synced slot after the promotion.
+# Expectation here is that the slot gets its own inactive_since as part of the
+# promotion. We do this check before the slot is enabled on the new primary
+# below, otherwise the slot gets active setting inactive_since to NULL.
+my $inactive_since_on_new_primary =
+ $standby1->get_slot_inactive_since_value('lsub1_slot', $promotion_time_on_primary);
+
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_new_primary'::timestamptz > '$inactive_since_on_primary'::timestamptz"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since on the new primary');
+
# Update subscription with the new primary's connection info
my $standby1_conninfo = $standby1->connstr . ' dbname=postgres';
$subscriber1->safe_psql('postgres',
--
2.34.1
v29-0002-Add-inactive_timeout-based-replication-slot-inva.patchapplication/octet-stream; name=v29-0002-Add-inactive_timeout-based-replication-slot-inva.patchDownload
From 88ab4ed8346103b61c37d471db40ea04f2837596 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 27 Mar 2024 15:28:40 +0000
Subject: [PATCH v29 2/2] Add inactive_timeout based replication slot
invalidation.
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set a timeout of, say,
1, 2, or 3 days at the slot level, after which the inactive slots
get invalidated.
To achieve the above, postgres introduces a GUC that lets users set
an inactive timeout; if a slot stays inactive for that amount of
time, it gets invalidated. The invalidation check happens at various
locations so that the invalidation is detected as promptly as
possible; these locations include the following:
- Whenever the slot is acquired; the slot acquisition errors
out if the slot is invalidated.
- During checkpoint
Note that this new invalidation mechanism won't kick in for the
slots that are currently being synced from the primary to the
standby.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/config.sgml | 18 ++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 171 ++++++++++-
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 9 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 278 ++++++++++++++++++
13 files changed, 499 insertions(+), 12 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 65a6e6c408..db240c5b44 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4544,6 +4544,24 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time. If this value is specified without units, it is taken
+ as seconds. A value of zero (which is the default) disables the timeout
+ mechanism. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index c8d97ab375..5c05fd1c07 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2582,6 +2582,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for the duration specified by
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 9c95a4b062..71b6a254cf 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -321,7 +321,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -531,7 +531,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
* if the slot is not acquired by other processes.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 7dbb44b7b0..0c5a601d87 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -108,10 +108,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -159,6 +161,8 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidateSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_locks);
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
@@ -536,9 +540,13 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * If check_for_invalidation is true, the slot is checked for invalidation
+ * based on replication_slot_inactive_timeout GUC and an error is raised after making the slot ours.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation)
{
ReplicationSlot *s;
int active_pid;
@@ -616,6 +624,41 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * Check if the given slot can be invalidated based on its inactive
+ * timeout. If yes, persist the invalidated state to disk and then error
+ * out. We do this only after making the slot ours to avoid anyone else
+ * acquiring it while we check for its invalidation.
+ */
+ if (check_for_invalidation)
+ {
+ /* The slot is ours by now */
+ Assert(s->active_pid == MyProcPid);
+
+ /*
+ * Well, the slot is not yet ours really unless we check for the
+ * invalidation below.
+ */
+ s->active_pid = 0;
+ if (InvalidateReplicationSlotForInactiveTimeout(s, true, true))
+ {
+ /*
+ * If the slot has been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+
+ /* Might need it for slot clean up on error, so restore it */
+ s->active_pid = MyProcPid;
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot acquire invalidated replication slot \"%s\"",
+ NameStr(MyReplicationSlot->data.name))));
+ }
+ s->active_pid = MyProcPid;
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -790,7 +833,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -813,7 +856,7 @@ ReplicationSlotAlter(const char *name, bool failover)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1515,6 +1558,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by replication_slot_inactive_timeout parameter."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1628,6 +1674,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (InvalidateReplicationSlotForInactiveTimeout(s, false, false))
+ invalidation_cause = cause;
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1781,6 +1831,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1832,6 +1883,103 @@ restart:
return invalidated;
}
+/*
+ * Invalidate given slot based on replication_slot_inactive_timeout GUC.
+ *
+ * Returns true if the slot has got invalidated.
+ *
+ * NB - this function also runs as part of checkpoint, so avoid raising errors
+ * if possible.
+ */
+bool
+InvalidateReplicationSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_locks,
+ bool persist_state)
+{
+ if (!InvalidateSlotForInactiveTimeout(slot, need_locks))
+ return false;
+
+ Assert(slot->active_pid == 0);
+
+ SpinLockAcquire(&slot->mutex);
+ slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT;
+
+ /* Make sure the invalidated state persists across server restart */
+ slot->just_dirtied = true;
+ slot->dirty = true;
+ SpinLockRelease(&slot->mutex);
+
+ if (persist_state)
+ {
+ char path[MAXPGPATH];
+
+ sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+ SaveSlotToPath(slot, path, ERROR);
+ }
+
+ ReportSlotInvalidation(RS_INVAL_INACTIVE_TIMEOUT, false, 0,
+ slot->data.name, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, InvalidTransactionId);
+
+ return true;
+}
+
+/*
+ * Helper for InvalidateReplicationSlotForInactiveTimeout
+ */
+static bool
+InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks)
+{
+ ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+
+ if (slot->inactive_since == 0 ||
+ replication_slot_inactive_timeout == 0)
+ return false;
+
+ /*
+ * Do not invalidate the slots which are currently being synced from the
+ * primary to the standby.
+ */
+ if (RecoveryInProgress() && slot->data.synced)
+ return false;
+
+ if (need_locks)
+ {
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+ SpinLockAcquire(&slot->mutex);
+ }
+
+ Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC. We do this with the spinlock
+ * held to avoid race conditions -- for example the restart_lsn could move
+ * forward, or the slot could be dropped.
+ */
+ if (slot->inactive_since > 0 &&
+ replication_slot_inactive_timeout > 0)
+ {
+ TimestampTz now;
+
+ /* inactive_since is only tracked for inactive slots */
+ Assert(slot->active_pid == 0);
+
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(slot->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
+ }
+
+ if (need_locks)
+ {
+ SpinLockRelease(&slot->mutex);
+ LWLockRelease(ReplicationSlotControlLock);
+ }
+
+ return (invalidation_cause == RS_INVAL_INACTIVE_TIMEOUT);
+}
+
/*
* Flush all replication slots to disk.
*
@@ -1844,6 +1992,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated = false;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -1867,6 +2016,13 @@ CheckPointReplicationSlots(bool is_shutdown)
/* save the slot to disk, locking is handled in SaveSlotToPath() */
sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
+ /*
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
+ */
+ if (InvalidateReplicationSlotForInactiveTimeout(s, true, false))
+ invalidated = true;
+
/*
* Slot's data is not flushed each time the confirmed_flush LSN is
* updated as that could lead to frequent writes. However, we decide
@@ -1893,6 +2049,13 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ /* If the slot has been invalidated, recalculate the resource limits */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index da57177c25..677c0bf0a2 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -651,7 +651,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bc40c454de..96eeb8b7d2 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -846,7 +846,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1459,7 +1459,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 1e71e7db4a..c97a2d83a7 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2954,6 +2954,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 2244ee52f7..622cdf503d 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -334,6 +334,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7b937d1a0c..cd98ab5112 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -230,6 +232,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -245,7 +248,8 @@ extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, bool failover);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(void);
extern void ReplicationSlotSave(void);
@@ -264,6 +268,9 @@ extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
+extern bool InvalidateReplicationSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool need_locks,
+ bool persist_state);
extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock);
extern int ReplicationSlotIndex(ReplicationSlot *slot);
extern bool ReplicationSlotName(int index, Name name);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..cf233c5513
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,278 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot as well as logical
+# failover slot on primary due to inactive timeout GUC. Also, check the logical
+# failover slot synced on to the standby doesn't invalidate the slot on its own,
+# but gets the invalidated state from the remote slot on the primary.
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoint during the test, otherwise, the test can get unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr_1 = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb1_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('lsub1_sync_slot', 'test_decoding', false, false, true);}
+);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot');
+]);
+
+$standby1->start;
+
+my $standby1_logstart = -s $standby1->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Synchronize the primary server slots to the standby.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced'.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot has synced as true on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$primary->reload;
+
+# Wait for the logical failover slot to become inactive on the primary. Note
+# that nobody has acquired that slot yet, so due to inactive timeout setting
+# above it must get invalidated.
+wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart);
+
+# Set timeout on the standby also to check the synced slots don't get
+# invalidated due to timeout on the standby.
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$standby1->reload;
+
+# Now, sync the logical failover slot from the remote slot on the primary.
+# Note that the remote slot has already been invalidated due to inactive
+# timeout. Now, the standby must also see it as invalidated.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Wait for the inactive replication slot to be invalidated.
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for replication slot lsub1_sync_slot invalidation to be synced on standby";
+
+# Synced slot mustn't get invalidated on the standby, it must sync invalidation
+# from the primary. So, we must not see the slot's invalidation message in server
+# log.
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"lsub1_sync_slot\"",
+ $standby1_logstart),
 'check that synced slot has not been invalidated on the standby');
+
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive
+wait_for_slot_invalidation($primary, 'sb1_slot', $logstart);
+
+# Testcase end: Invalidate streaming standby's slot as well as logical failover
+# slot on primary due to inactive timeout GUC. Also, check the logical failover
+# slot synced on to the standby doesn't invalidate the slot on its own, but
+# gets the invalidated state from the remote slot on the primary.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to inactive timeout
+# GUC.
+
+my $publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$subscriber->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart);
+
+# Testcase end: Invalidate logical subscriber's slot due to inactive timeout
+# GUC.
+# =============================================================================
+
+# =============================================================================
+# Start: Helper functions used for this test file
+
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $name = $node->name;
+
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for replication slot to become inactive";
+
+ # Wait for the replication slot info to be updated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for info of replication slot $slot_name to be updated on node $name";
+
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+
+ # Wait for the inactive replication slot to be invalidated.
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for inactive replication slot $slot_name to be invalidated on node $name";
+
+ # Check that the invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot_name', '0/1');
+ ]);
+
+ ok( $stderr =~ /cannot acquire invalidated replication slot "$slot_name"/,
+ "detected error upon trying to acquire invalidated slot $slot_name on node $name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot_name";
+}
+
+# Check for invalidation of slot in server log.
+sub check_for_slot_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"",
+ $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot_name invalidation has been logged");
+}
+
+# =============================================================================
+# End: Helper functions used for this test file
+
+done_testing();
--
2.34.1
Hi,
On Wed, Mar 27, 2024 at 09:00:37PM +0530, Bharath Rupireddy wrote:
On Wed, Mar 27, 2024 at 6:54 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hi,
On Wed, Mar 27, 2024 at 05:55:05PM +0530, Bharath Rupireddy wrote:
On Wed, Mar 27, 2024 at 3:42 PM Bertrand Drouvot
Please see the attached v28 patch.
Thanks!
1 === sorry I missed it in the previous review
if (!(RecoveryInProgress() && slot->data.synced))
+ {
+ now = GetCurrentTimestamp();
+ update_inactive_since = true;
+ }
+ else
+ update_inactive_since = false;

I think update_inactive_since is not needed, we could rely on (now > 0) instead.
Thought of using it, but, at the expense of readability. I prefer to
use a variable instead.
That's fine too.
However, I changed the variable to a more meaningful name, is_slot_being_synced.
Yeah makes sense and even easier to read.
v29-0001 LGTM.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 27, 2024 at 9:00 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the
standby for sync slots. 0002 implementing inactive timeout GUC based
invalidation mechanism. Please have a look.
Thanks for the patches. v29-001 looks good to me.
thanks
Shveta
Hi,
On Wed, Mar 27, 2024 at 09:00:37PM +0530, Bharath Rupireddy wrote:
standby for sync slots. 0002 implementing inactive timeout GUC based
invalidation mechanism. Please have a look.
Thanks!
Regarding 0002:
Some testing:
T1 ===
When the slot is invalidated on the primary, then the reason is propagated to
the sync slot (if any). That's fine but we are losing the inactive_since on the
standby:
Primary:
postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot';
slot_name | inactive_since | conflicting | invalidation_reason
-------------+-------------------------------+-------------+---------------------
lsub29_slot | 2024-03-28 08:24:51.672528+00 | f | inactive_timeout
(1 row)
Standby:
postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot';
slot_name | inactive_since | conflicting | invalidation_reason
-------------+----------------+-------------+---------------------
lsub29_slot | | f | inactive_timeout
(1 row)
I think in this case it should always reflect the value from the primary (so
that one can understand why it is invalidated).
T2 ===
And it is set to a value during promotion:
postgres=# select pg_promote();
pg_promote
------------
t
(1 row)
postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot';
slot_name | inactive_since | conflicting | invalidation_reason
-------------+------------------------------+-------------+---------------------
lsub29_slot | 2024-03-28 08:30:11.74505+00 | f | inactive_timeout
(1 row)
I think when it is invalidated it should always reflect the value from the
primary (so that one can understand why it is invalidated).
T3 ===
As far as the slot invalidation on the primary is concerned:
postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub29_slot', NULL, NULL, 'include-xids', '0');
ERROR: cannot acquire invalidated replication slot "lsub29_slot"
Can we make the message more consistent with what can be found in CreateDecodingContext()
for example?
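For instance (just a sketch, not wording from any existing patch or from CreateDecodingContext itself), the acquire-time error could carry a detail line similar in spirit to the decoding-time invalidation errors:

    ereport(ERROR,
            (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
             errmsg("can no longer access replication slot \"%s\"",
                    NameStr(s->data.name)),
             errdetail("This slot has been invalidated because it was inactive for longer than replication_slot_inactive_timeout.")));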
T4 ===
Also, it looks like querying pg_replication_slots() does not trigger an
invalidation: I think it should if the slot is not invalidated yet (and matches
the invalidation criteria).
Code review:
CR1 ===
+ Invalidate replication slots that are inactive for longer than this
+ amount of time. If this value is specified without units, it is taken
s/Invalidate/Invalidates/?
Should we mention the relationship with inactive_since?
CR2 ===
+ *
+ * If check_for_invalidation is true, the slot is checked for invalidation
+ * based on replication_slot_inactive_timeout GUC and an error is raised after making the slot ours.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation)
s/check_for_invalidation/check_for_timeout_invalidation/?
CR3 ===
+ if (slot->inactive_since == 0 ||
+ replication_slot_inactive_timeout == 0)
+ return false;
Better to test replication_slot_inactive_timeout first? (I mean there is no
point of testing inactive_since if replication_slot_inactive_timeout == 0)
CR4 ===
+ if (slot->inactive_since > 0 &&
+ replication_slot_inactive_timeout > 0)
+ {
Same.
So, instead of CR3 === and CR4 ===, I wonder if it wouldn't be better to do
something like:
if (replication_slot_inactive_timeout == 0)
return false;
else if (slot->inactive_since > 0)
.
.
.
.
else
return false;
That would avoid checking replication_slot_inactive_timeout and inactive_since
multiple times.
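To make the suggestion concrete, a minimal sketch of the helper following that shape (locking and the synced-slot check omitted for brevity, names as in the patch):

    static bool
    InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks)
    {
        /* Cheapest check first: the feature is disabled. */
        if (replication_slot_inactive_timeout == 0)
            return false;

        /* inactive_since is only set while the slot is not acquired. */
        if (slot->inactive_since > 0)
        {
            TimestampTz now = GetCurrentTimestamp();

            /* GUC is in seconds, TimestampDifferenceExceeds() wants milliseconds */
            if (TimestampDifferenceExceeds(slot->inactive_since, now,
                                           replication_slot_inactive_timeout * 1000))
                return true;
        }

        return false;
    }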
CR5 ===
+ * held to avoid race conditions -- for example the restart_lsn could move
+ * forward, or the slot could be dropped.
Does the restart_lsn example make sense here?
CR6 ===
+static bool
+InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks)
+{
InvalidatePossiblyInactiveSlot() maybe?
CR7 ===
+ /* Make sure the invalidated state persists across server restart */
+ slot->just_dirtied = true;
+ slot->dirty = true;
+ SpinLockRelease(&slot->mutex);
Maybe we could create a new function say MarkGivenReplicationSlotDirty()
with a slot as parameter, that ReplicationSlotMarkDirty could call too?
Then maybe we could set slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT in
InvalidateSlotForInactiveTimeout()? (to avoid multiple SpinLockAcquire/SpinLockRelease).
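Something like this, as a sketch of the proposed helper (the name and shape come from the suggestion above, not from existing code):

    /*
     * Mark the given slot dirty; ReplicationSlotMarkDirty() would simply
     * call this with MyReplicationSlot.
     */
    static void
    MarkGivenReplicationSlotDirty(ReplicationSlot *slot)
    {
        SpinLockAcquire(&slot->mutex);
        slot->just_dirtied = true;
        slot->dirty = true;
        SpinLockRelease(&slot->mutex);
    }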
CR8 ===
+ if (persist_state)
+ {
+ char path[MAXPGPATH];
+
+ sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+ SaveSlotToPath(slot, path, ERROR);
+ }
Maybe we could create a new function say GivenReplicationSlotSave()
with a slot as parameter, that ReplicationSlotSave() could call too?
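And a matching sketch for the save counterpart (again only the proposal, not existing code):

    /*
     * Save the given slot to disk; ReplicationSlotSave() would simply call
     * this with MyReplicationSlot and ERROR.
     */
    static void
    GivenReplicationSlotSave(ReplicationSlot *slot, int elevel)
    {
        char    path[MAXPGPATH];

        sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
        SaveSlotToPath(slot, path, elevel);
    }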
CR9 ===
+ if (check_for_invalidation)
+ {
+ /* The slot is ours by now */
+ Assert(s->active_pid == MyProcPid);
+
+ /*
+ * Well, the slot is not yet ours really unless we check for the
+ * invalidation below.
+ */
+ s->active_pid = 0;
+ if (InvalidateReplicationSlotForInactiveTimeout(s, true, true))
+ {
+ /*
+ * If the slot has been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+
+ /* Might need it for slot clean up on error, so restore it */
+ s->active_pid = MyProcPid;
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot acquire invalidated replication slot \"%s\"",
+ NameStr(MyReplicationSlot->data.name))));
+ }
+ s->active_pid = MyProcPid;
Are we not missing some SpinLockAcquire/Release on the slot's mutex here? (the
places where we set the active_pid).
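For example (sketch only), each of those assignments could be wrapped so that concurrent readers of active_pid always see it set under the slot's spinlock:

    SpinLockAcquire(&s->mutex);
    s->active_pid = 0;          /* give the slot up while running the invalidation check */
    SpinLockRelease(&s->mutex);

    /* ... invalidation check ... */

    SpinLockAcquire(&s->mutex);
    s->active_pid = MyProcPid;  /* take it back (also before erroring out, for cleanup) */
    SpinLockRelease(&s->mutex);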
CR10 ===
@@ -1628,6 +1674,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (InvalidateReplicationSlotForInactiveTimeout(s, false, false))
+ invalidation_cause = cause;
+ break;
InvalidatePossiblyObsoleteSlot() is not called with such a reason, better to use
an Assert here and in the caller too?
CR11 ===
+++ b/src/test/recovery/t/050_invalidate_slots.pl
why not using 019_replslot_limit.pl?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 27, 2024 at 9:00 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the
standby for sync slots.
Commit message states: "why we can't just update inactive_since for
synced slots on the standby with the value received from remote slot
on the primary. This is consistent with any other slot parameter i.e.
all of them are synced from the primary."
The inactive_since is not consistent with other slot parameters which
we copy. We don't perform anything related to those other parameters
like say two_phase phase which can change that property. However, we
do acquire the slot, advance the slot (as per recent discussion [1]),
and release it. Since these operations can impact inactive_since, it
seems to me that inactive_since is not the same as other parameters.
It can have a different value than the primary. Why would anyone want
to know the value of inactive_since from primary after the standby is
promoted? Now, the other concern is that calling GetCurrentTimestamp()
could be costly when the values for the slot are not going to be
updated but if that happens we can optimize such that before acquiring
the slot we can have some minimal pre-checks to ensure whether we need
to update the slot or not.
[1]: /messages/by-id/OS0PR01MB571615D35F486080616CA841943A2@OS0PR01MB5716.jpnprd01.prod.outlook.com
--
With Regards,
Amit Kapila.
Hi,
On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote:
On Wed, Mar 27, 2024 at 9:00 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the
standby for sync slots.
Commit message states: "why we can't just update inactive_since for
synced slots on the standby with the value received from remote slot
on the primary. This is consistent with any other slot parameter i.e.
all of them are synced from the primary."The inactive_since is not consistent with other slot parameters which
we copy. We don't perform anything related to those other parameters
like say two_phase phase which can change that property. However, we
do acquire the slot, advance the slot (as per recent discussion [1]),
and release it. Since these operations can impact inactive_since, it
seems to me that inactive_since is not the same as other parameters.
It can have a different value than the primary. Why would anyone want
to know the value of inactive_since from primary after the standby is
promoted?
I think it can be useful "before" it is promoted and in case the primary is down.
I agree that tracking the activity time of a synced slot can be useful, why
not creating a dedicated field for that purpose (and keep inactive_since a
perfect "copy" of the primary)?
Now, the other concern is that calling GetCurrentTimestamp()
could be costly when the values for the slot are not going to be
updated but if that happens we can optimize such that before acquiring
the slot we can have some minimal pre-checks to ensure whether we need
to update the slot or not.
Right, but for a very active slot it is likely that we call GetCurrentTimestamp()
during almost each sync cycle.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote:
Commit message states: "why we can't just update inactive_since for
synced slots on the standby with the value received from remote slot
on the primary. This is consistent with any other slot parameter i.e.
all of them are synced from the primary."The inactive_since is not consistent with other slot parameters which
we copy. We don't perform anything related to those other parameters
like say two_phase phase which can change that property. However, we
do acquire the slot, advance the slot (as per recent discussion [1]),
and release it. Since these operations can impact inactive_since, it
seems to me that inactive_since is not the same as other parameters.
It can have a different value than the primary. Why would anyone want
to know the value of inactive_since from primary after the standby is
promoted?
I think it can be useful "before" it is promoted and in case the primary is down.
It is not clear to me what the user is going to do by checking the
inactivity time for slots when the corresponding server is down. I
thought the idea was to check such slots and see if they need to be
dropped or enabled again to avoid excessive disk usage, etc.
I agree that tracking the activity time of a synced slot can be useful, why
not creating a dedicated field for that purpose (and keep inactive_since a
perfect "copy" of the primary)?
We can have a separate field for this but not sure if it is worth it.
Now, the other concern is that calling GetCurrentTimestamp()
could be costly when the values for the slot are not going to be
updated but if that happens we can optimize such that before acquiring
the slot we can have some minimal pre-checks to ensure whether we need
to update the slot or not.

Right, but for a very active slot it is likely that we call GetCurrentTimestamp()
during almost each sync cycle.
True, but if we have to save a slot to disk each time to persist the
changes (for an active slot) then probably GetCurrentTimestamp()
shouldn't be costly enough to matter.
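A rough sketch of the kind of pre-check being discussed (purely illustrative, reusing field names visible in the patches): bail out before acquiring the slot, and before calling GetCurrentTimestamp(), when nothing in the remote slot differs from the local copy:

    /* Sketch: skip the acquire/update cycle if the remote slot hasn't moved. */
    if (remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
        remote_slot->restart_lsn == slot->data.restart_lsn &&
        remote_slot->catalog_xmin == slot->data.catalog_xmin)
        return false;           /* no change, no GetCurrentTimestamp() needed */

    now = GetCurrentTimestamp();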
--
With Regards,
Amit Kapila.
Hi,
On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote:
On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote:
Commit message states: "why we can't just update inactive_since for
synced slots on the standby with the value received from remote slot
on the primary. This is consistent with any other slot parameter i.e.
all of them are synced from the primary."The inactive_since is not consistent with other slot parameters which
we copy. We don't perform anything related to those other parameters
like say two_phase phase which can change that property. However, we
do acquire the slot, advance the slot (as per recent discussion [1]),
and release it. Since these operations can impact inactive_since, it
seems to me that inactive_since is not the same as other parameters.
It can have a different value than the primary. Why would anyone want
to know the value of inactive_since from primary after the standby is
promoted?
I think it can be useful "before" it is promoted and in case the primary is down.
It is not clear to me what the user is going to do by checking the
inactivity time for slots when the corresponding server is down.
Say a failover needs to be done, then it could be useful to know for which
slots the activity needs to be resumed (thinking about external logical decoding
plugin, not about pub/sub here). If one sees an inactive slot (inactive for long "enough")
then he can start to reason about what to do with it.
I thought the idea was to check such slots and see if they need to be
dropped or enabled again to avoid excessive disk usage, etc.
Yeah that's the case but it does not mean inactive_since can't be useful in other
ways.
Also, say the slot has been invalidated on the primary (due to inactivity timeout),
primary is down and there is a failover. By keeping the inactive_since from
the primary, one could know when the inactivity that led to the timeout started.
Again, more concerned about external logical decoding plugin than pub/sub here.
I agree that tracking the activity time of a synced slot can be useful, why
not creating a dedicated field for that purpose (and keep inactive_since a
perfect "copy" of the primary)?We can have a separate field for this but not sure if it is worth it.
OTOH I'm not sure that erasing this information from the primary is useful. I
think that 2 fields would be the best option and would be less subject to
misinterpretation.
Now, the other concern is that calling GetCurrentTimestamp()
could be costly when the values for the slot are not going to be
updated but if that happens we can optimize such that before acquiring
the slot we can have some minimal pre-checks to ensure whether we need
to update the slot or not.Right, but for a very active slot it is likely that we call GetCurrentTimestamp()
during almost each sync cycle.

True, but if we have to save a slot to disk each time to persist the
changes (for an active slot) then probably GetCurrentTimestamp()
shouldn't be costly enough to matter.
Right, persisting the changes to disk would be even more costly.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Mar 28, 2024 at 3:13 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Regarding 0002:
Thanks for reviewing it.
Some testing:
T1 ===
When the slot is invalidated on the primary, then the reason is propagated to
the sync slot (if any). That's fine but we are losing the inactive_since on the
standby:

Primary:
postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot';
slot_name | inactive_since | conflicting | invalidation_reason
-------------+-------------------------------+-------------+---------------------
lsub29_slot | 2024-03-28 08:24:51.672528+00 | f | inactive_timeout
(1 row)

Standby:
postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot';
slot_name | inactive_since | conflicting | invalidation_reason
-------------+----------------+-------------+---------------------
lsub29_slot | | f | inactive_timeout
(1 row)

I think in this case it should always reflect the value from the primary (so
that one can understand why it is invalidated).
I'll come back to this as soon as we all agree on inactive_since
behavior for synced slots.
T2 ===
And it is set to a value during promotion:
postgres=# select pg_promote();
pg_promote
------------
t
(1 row)

postgres=# select slot_name,inactive_since,conflicting,invalidation_reason from pg_replication_slots where slot_name='lsub29_slot';
slot_name | inactive_since | conflicting | invalidation_reason
-------------+------------------------------+-------------+---------------------
lsub29_slot | 2024-03-28 08:30:11.74505+00 | f | inactive_timeout
(1 row)

I think when it is invalidated it should always reflect the value from the
primary (so that one can understand why it is invalidated).
I'll come back to this as soon as we all agree on inactive_since
behavior for synced slots.
T3 ===
As far as the slot invalidation on the primary is concerned:
postgres=# SELECT * FROM pg_logical_slot_get_changes('lsub29_slot', NULL, NULL, 'include-xids', '0');
ERROR: cannot acquire invalidated replication slot "lsub29_slot"Can we make the message more consistent with what can be found in CreateDecodingContext()
for example?
Hm, that makes sense because slot acquisition and release is something
internal to the server.
T4 ===
Also, it looks like querying pg_replication_slots() does not trigger an
invalidation: I think it should if the slot is not invalidated yet (and matches
the invalidation criteria).
There's a different opinion on this, check comment #3 from
/messages/by-id/CAA4eK1LLj+eaMN-K8oeOjfG+UuzTY=L5PXbcMJURZbFm+_aJSA@mail.gmail.com.
Code review:
CR1 ===
+ Invalidate replication slots that are inactive for longer than this
+ amount of time. If this value is specified without units, it is taken

s/Invalidate/Invalidates/?
Done.
Should we mention the relationship with inactive_since?
Done.
CR2 ===
+ *
+ * If check_for_invalidation is true, the slot is checked for invalidation
+ * based on replication_slot_inactive_timeout GUC and an error is raised after making the slot ours.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation)

s/check_for_invalidation/check_for_timeout_invalidation/?
Done.
CR3 ===
+ if (slot->inactive_since == 0 ||
+ replication_slot_inactive_timeout == 0)
+ return false;

Better to test replication_slot_inactive_timeout first? (I mean there is no
point of testing inactive_since if replication_slot_inactive_timeout == 0)

CR4 ===

+ if (slot->inactive_since > 0 &&
+ replication_slot_inactive_timeout > 0)
+ {

Same.
So, instead of CR3 === and CR4 ===, I wonder if it wouldn't be better to do
something like:

if (replication_slot_inactive_timeout == 0)
return false;
else if (slot->inactive_since > 0)
.
else
return false;

That would avoid checking replication_slot_inactive_timeout and inactive_since
multiple times.
Done.
CR5 ===
+ * held to avoid race conditions -- for example the restart_lsn could move
+ * forward, or the slot could be dropped.

Does the restart_lsn example make sense here?
No, it doesn't. Modified that.
CR6 ===
+static bool
+InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks)
+{

InvalidatePossiblyInactiveSlot() maybe?
I think we would lose the essence, i.e. the timeout, from the suggested
function name; "inactive" alone doesn't convey the meaning clearly.
I've kept the name as is unless anyone suggests otherwise.
CR7 ===
+ /* Make sure the invalidated state persists across server restart */
+ slot->just_dirtied = true;
+ slot->dirty = true;
+ SpinLockRelease(&slot->mutex);

Maybe we could create a new function say MarkGivenReplicationSlotDirty()
with a slot as parameter, that ReplicationSlotMarkDirty could call too?
Done that.
Then maybe we could set slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT in
InvalidateSlotForInactiveTimeout()? (to avoid multiple SpinLockAcquire/SpinLockRelease).
Done that.
CR8 ===
+ if (persist_state)
+ {
+ char path[MAXPGPATH];
+
+ sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+ SaveSlotToPath(slot, path, ERROR);
+ }

Maybe we could create a new function say GivenReplicationSlotSave()
with a slot as parameter, that ReplicationSlotSave() could call too?
Done that.
CR9 ===
+ if (check_for_invalidation)
+ {
+ /* The slot is ours by now */
+ Assert(s->active_pid == MyProcPid);
+
+ /*
+ * Well, the slot is not yet ours really unless we check for the
+ * invalidation below.
+ */
+ s->active_pid = 0;
+ if (InvalidateReplicationSlotForInactiveTimeout(s, true, true))
+ {
+ /*
+ * If the slot has been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+
+ /* Might need it for slot clean up on error, so restore it */
+ s->active_pid = MyProcPid;
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot acquire invalidated replication slot \"%s\"",
+ NameStr(MyReplicationSlot->data.name))));
+ }
+ s->active_pid = MyProcPid;

Are we not missing some SpinLockAcquire/Release on the slot's mutex here? (the
places where we set the active_pid).
Hm, yes. But, shall I acquire the mutex, set active_pid to 0 for a
moment just to satisfy Assert(slot->active_pid == 0); in
InvalidateReplicationSlotForInactiveTimeout and
InvalidateSlotForInactiveTimeout? Instead, I just removed the assertions,
because checking replication_slot_inactive_timeout > 0 and
inactive_since > 0 is enough for these functions to decide on inactive
timeout invalidation.
CR10 ===
@@ -1628,6 +1674,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ if (InvalidateReplicationSlotForInactiveTimeout(s, false, false))
+ invalidation_cause = cause;
+ break;

InvalidatePossiblyObsoleteSlot() is not called with such a reason, better to use
an Assert here and in the caller too?
Done.
CR11 ===
+++ b/src/test/recovery/t/050_invalidate_slots.pl

why not using 019_replslot_limit.pl?
I understand that 019_replslot_limit covers wal_removed related
invalidations. But, I don't want to kludge it with a bunch of other
tests. The new tests anyway need a bunch of new nodes and a couple of
helper functions. Any future invalidation mechanisms can be added here
in this new file. Also, having a separate file quickly helps isolate
any test failures that BF animals might report in future. I don't
think a separate test file here hurts anyone unless there's a strong
reason against it.
Please see the attached v30 patch. 0002 is where all of the above
review comments have been addressed.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v30-0001-Maintain-inactive_since-for-synced-slots-correct.patchapplication/x-patch; name=v30-0001-Maintain-inactive_since-for-synced-slots-correct.patchDownload
From b84cf9dc5d20e202e08c372e0aa7850966ed7271 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 30 Mar 2024 20:52:48 +0000
Subject: [PATCH v30 1/2] Maintain inactive_since for synced slots correctly.
The slot's inactive_since isn't currently maintained for
synced slots on the standby. The commit a11f330b55 prevents
updating inactive_since with RecoveryInProgress() check in
RestoreSlotFromDisk(). But, the issue is that
RecoveryInProgress() always returns true in
RestoreSlotFromDisk() as 'xlogctl->SharedRecoveryState' is
always 'RECOVERY_STATE_CRASH' at that time. Because of this,
inactive_since is always NULL on a promoted standby for all
synced slots even after server restart.
The above issue led us to question why we can't just update
inactive_since for synced slots on the standby with the value
received from remote slot on the primary. This is consistent with
any other slot parameter i.e. all of them are synced from the
primary.
This commit does two things:
1) Updates inactive_since for sync slots with the value
received from the primary's slot.
2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACWLctoiH-pSjWnEpR54q4DED6rw_BRJm5pCx86_Y01MoQ%40mail.gmail.com
---
doc/src/sgml/system-views.sgml | 9 +++
src/backend/replication/logical/slotsync.c | 61 ++++++++++++++++++-
src/backend/replication/slot.c | 40 ++++++++----
src/test/perl/PostgreSQL/Test/Cluster.pm | 34 +++++++++++
src/test/recovery/t/019_replslot_limit.pl | 26 +-------
.../t/040_standby_failover_slots_sync.pl | 43 +++++++++++++
6 files changed, 173 insertions(+), 40 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..c8d97ab375 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2530,6 +2530,15 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
The time since the slot has become inactive.
<literal>NULL</literal> if the slot is currently being used.
+ Note that the slots on the standbys that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>), will get the
+ <structfield>inactive_since</structfield> value from the
+ corresponding remote slot on the primary. Also, after the standby
+ starts up, the <structfield>inactive_since</structfield> value
+ (for such synced slots) will remain <literal>NULL</literal> until
+ the next synchronization (see
+ <xref linkend="logicaldecoding-replication-slots-synchronization"/>).
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..9c95a4b062 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -137,9 +137,12 @@ typedef struct RemoteSlot
/* RS_INVAL_NONE if valid, or the reason of invalidation */
ReplicationSlotInvalidationCause invalidated;
+
+ TimestampTz inactive_since; /* time at which the remote slot became inactive */
} RemoteSlot;
static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_since(void);
/*
* If necessary, update the local synced slot's metadata based on the data
@@ -167,6 +170,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
remote_slot->two_phase == slot->data.two_phase &&
remote_slot->failover == slot->data.failover &&
remote_slot->confirmed_lsn == slot->data.confirmed_flush &&
+ remote_slot->inactive_since == slot->inactive_since &&
strcmp(remote_slot->plugin, NameStr(slot->data.plugin)) == 0)
return false;
@@ -182,6 +186,7 @@ update_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
slot->data.confirmed_flush = remote_slot->confirmed_lsn;
slot->data.catalog_xmin = remote_slot->catalog_xmin;
slot->effective_catalog_xmin = remote_slot->catalog_xmin;
+ slot->inactive_since = remote_slot->inactive_since;
SpinLockRelease(&slot->mutex);
if (xmin_changed)
@@ -652,9 +657,9 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
static bool
synchronize_slots(WalReceiverConn *wrconn)
{
-#define SLOTSYNC_COLUMN_COUNT 9
+#define SLOTSYNC_COLUMN_COUNT 10
Oid slotRow[SLOTSYNC_COLUMN_COUNT] = {TEXTOID, TEXTOID, LSNOID,
- LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID};
+ LSNOID, XIDOID, BOOLOID, BOOLOID, TEXTOID, TEXTOID, TIMESTAMPTZOID};
WalRcvExecResult *res;
TupleTableSlot *tupslot;
@@ -663,7 +668,7 @@ synchronize_slots(WalReceiverConn *wrconn)
bool started_tx = false;
const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn,"
" restart_lsn, catalog_xmin, two_phase, failover,"
- " database, invalidation_reason"
+ " database, invalidation_reason, inactive_since"
" FROM pg_catalog.pg_replication_slots"
" WHERE failover and NOT temporary";
@@ -743,6 +748,13 @@ synchronize_slots(WalReceiverConn *wrconn)
remote_slot->invalidated = isnull ? RS_INVAL_NONE :
GetSlotInvalidationCause(TextDatumGetCString(d));
+ /*
+ * It is possible to get null value for inactive_since if the slot is
+ * active on the primary server, so handle accordingly.
+ */
+ d = slot_getattr(tupslot, ++col, &isnull);
+ remote_slot->inactive_since = isnull ? 0 : DatumGetTimestampTz(d);
+
/* Sanity check */
Assert(col == SLOTSYNC_COLUMN_COUNT);
@@ -1296,6 +1308,46 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
Assert(false);
}
+/*
+ * Update the inactive_since property for synced slots.
+ */
+static void
+update_synced_slots_inactive_since(void)
+{
+ TimestampTz now = 0;
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is a synchronized slot */
+ if (s->in_use && s->data.synced)
+ {
+ Assert(SlotIsLogical(s));
+
+ /*
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
+ */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+
+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart.
+ */
+ SpinLockAcquire(&s->mutex);
+ s->inactive_since = now;
+ SpinLockRelease(&s->mutex);
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
+}
+
/*
* Shut down the slot sync worker.
*/
@@ -1309,6 +1361,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1341,6 +1394,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d778c0b921..7dbb44b7b0 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -42,6 +42,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlogrecovery.h"
+#include "access/xlogutils.h"
#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
@@ -655,6 +656,7 @@ ReplicationSlotRelease(void)
char *slotname = NULL; /* keep compiler quiet */
bool is_logical = false; /* keep compiler quiet */
TimestampTz now = 0;
+ bool is_slot_being_synced = false;
Assert(slot != NULL && slot->active_pid != 0);
@@ -690,12 +692,15 @@ ReplicationSlotRelease(void)
}
/*
- * Set the last inactive time after marking the slot inactive. We don't
- * set it for the slots currently being synced from the primary to the
- * standby because such slots are typically inactive as decoding is not
- * allowed on those.
+ * Set the time since the slot has become inactive.
+ *
+ * Note that we don't set it for the slots currently being synced from the
+ * primary to the standby, because such slots typically sync the data from
+ * the remote slot.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
+ if (RecoveryInProgress() && slot->data.synced)
+ is_slot_being_synced = true;
+ else
now = GetCurrentTimestamp();
if (slot->data.persistency == RS_PERSISTENT)
@@ -706,11 +711,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ if (!is_slot_being_synced)
+ slot->inactive_since = now;
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
- else
+ else if (!is_slot_being_synced)
{
SpinLockAcquire(&slot->mutex);
slot->inactive_since = now;
@@ -2369,13 +2375,21 @@ RestoreSlotFromDisk(const char *name)
slot->active_pid = 0;
/*
- * We set the last inactive time after loading the slot from the disk
- * into memory. Whoever acquires the slot i.e. makes the slot active
- * will reset it. We don't set it for the slots currently being synced
- * from the primary to the standby because such slots are typically
- * inactive as decoding is not allowed on those.
+ * Set the time since the slot has become inactive after loading the
+ * slot from the disk into memory. Whoever acquires the slot i.e.
+ * makes the slot active will reset it.
+ *
+ * Note that we don't set it for the slots currently being synced from
+ * the primary to the standby, because such slots typically sync the
+ * data from the remote slot. We use InRecovery flag instead of
+ * RecoveryInProgress() as the latter always returns true at this time
+ * even on primary.
+ *
+ * Note that for synced slots after the standby starts up (i.e. after
+ * the slots are loaded from the disk), the inactive_since will remain
+ * zero until the next slot sync cycle.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
+ if (!(InRecovery && slot->data.synced))
slot->inactive_since = GetCurrentTimestamp();
else
slot->inactive_since = 0;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b08296605c..ddfc3236f3 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3276,6 +3276,40 @@ sub create_logical_slot_on_standby
=pod
+=item $node->get_slot_inactive_since_value(self, slot_name, reference_time)
+
+Get inactive_since column value for a given replication slot validating it
+against optional reference time.
+
+=cut
+
+sub get_slot_inactive_since_value
+{
+ my ($self, $slot_name, $reference_time) = @_;
+ my $name = $self->name;
+
+ my $inactive_since = $self->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
+ );
+
+ # Check that the captured time is sane
+ if (defined $reference_time)
+ {
+ is($self->safe_psql('postgres',
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz >= '$reference_time'::timestamptz;]
+ ),
+ 't',
+ "last inactive time for slot $slot_name is valid on node $name")
+ or die "could not validate captured inactive_since for slot $slot_name";
+ }
+
+ return $inactive_since;
+}
+
+=pod
+
=item $node->advance_wal(num)
Advance WAL of node by given number of segments.
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 3b9a306a8b..c8e5e5054d 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -443,7 +443,7 @@ $primary4->safe_psql(
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the standby below.
my $inactive_since =
- capture_and_validate_slot_inactive_since($primary4, $sb4_slot, $slot_creation_time);
+ $primary4->get_slot_inactive_since_value($sb4_slot, $slot_creation_time);
$standby4->start;
@@ -502,7 +502,7 @@ $publisher4->safe_psql('postgres',
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the subscriber below.
$inactive_since =
- capture_and_validate_slot_inactive_since($publisher4, $lsub4_slot, $slot_creation_time);
+ $publisher4->get_slot_inactive_since_value($lsub4_slot, $slot_creation_time);
$subscriber4->start;
$subscriber4->safe_psql('postgres',
@@ -540,26 +540,4 @@ is( $publisher4->safe_psql(
$publisher4->stop;
$subscriber4->stop;
-# Capture and validate inactive_since of a given slot.
-sub capture_and_validate_slot_inactive_since
-{
- my ($node, $slot_name, $slot_creation_time) = @_;
-
- my $inactive_since = $node->safe_psql('postgres',
- qq(SELECT inactive_since FROM pg_replication_slots
- WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
- );
-
- # Check that the captured time is sane
- is( $node->safe_psql(
- 'postgres',
- qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
- '$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
- ),
- 't',
- "last inactive time for an active slot $slot_name is sane");
-
- return $inactive_since;
-}
-
done_testing();
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index f47bfd78eb..33e3a8dcf0 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -35,6 +35,13 @@ my $subscriber1 = PostgreSQL::Test::Cluster->new('subscriber1');
$subscriber1->init;
$subscriber1->start;
+# Capture the time before the logical failover slot is created on the
+# primary. We later call this publisher as primary anyway.
+my $slot_creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Create a slot on the publisher with failover disabled
$publisher->safe_psql('postgres',
"SELECT 'init' FROM pg_create_logical_replication_slot('lsub1_slot', 'pgoutput', false, false, false);"
@@ -174,6 +181,10 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary
+my $inactive_since_on_primary =
+ $primary->get_slot_inactive_since_value('lsub1_slot', $slot_creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -190,6 +201,18 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Capture the inactive_since of the synced slot on the standby
+my $inactive_since_on_standby =
+ $standby1->get_slot_inactive_since_value('lsub1_slot', $slot_creation_time_on_primary);
+
+# Synced slots on the standby must get the inactive_since from the primary.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz = '$inactive_since_on_standby'::timestamptz;"
+ ),
+ "t",
+ 'synchronized slot has got the inactive_since from the primary');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
@@ -750,8 +773,28 @@ $primary->reload;
$standby1->start;
$primary->wait_for_replay_catchup($standby1);
+# Capture the time before the standby is promoted
+my $promotion_time_on_primary = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
$standby1->promote;
+# Capture the inactive_since of the synced slot after the promotion.
+# Expectation here is that the slot gets its own inactive_since as part of the
+# promotion. We do this check before the slot is enabled on the new primary
+# below, otherwise the slot becomes active, setting inactive_since to NULL.
+my $inactive_since_on_new_primary =
+ $standby1->get_slot_inactive_since_value('lsub1_slot', $promotion_time_on_primary);
+
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_new_primary'::timestamptz > '$inactive_since_on_primary'::timestamptz"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since on the new primary');
+
# Update subscription with the new primary's connection info
my $standby1_conninfo = $standby1->connstr . ' dbname=postgres';
$subscriber1->safe_psql('postgres',
--
2.34.1
v30-0002-Add-inactive_timeout-based-replication-slot-inva.patchapplication/x-patch; name=v30-0002-Add-inactive_timeout-based-replication-slot-inva.patchDownload
From 64bb4f8396595dae62ee07726dfacadb6e87f119 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sun, 31 Mar 2024 04:40:42 +0000
Subject: [PATCH v30 2/2] Add inactive_timeout based replication slot
invalidation.
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set a timeout of say 1
or 2 or 3 days at slot level, after which the inactive slots get
invalidated.
To achieve the above, this commit introduces a GUC that lets users
set an inactive timeout; if a slot stays inactive for that amount
of time, the slot gets invalidated. The invalidation check happens
at various locations so that the condition is caught as early as
possible; these locations include the following:
- Whenever the slot is acquired; the slot acquisition errors
out if the slot is found to be invalidated.
- During checkpoint
Note that this new invalidation mechanism won't kick in for the
slots that are currently being synced from the primary to the
standby.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/config.sgml | 25 ++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 196 +++++++++++-
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 8 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 278 ++++++++++++++++++
13 files changed, 518 insertions(+), 24 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f65c17e5ae..126b461bb1 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4545,6 +4545,31 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidates replication slots that are inactive for longer than the
+ specified amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is the default) disables
+ the timeout mechanism. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+
+ <para>
+ The timeout is measured from the time the slot became
+ inactive (as reported in its
+ <structfield>inactive_since</structfield> value) until it gets
+ used again (i.e., its <structfield>active</structfield> field is set to true).
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index c8d97ab375..5c05fd1c07 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2582,6 +2582,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for the duration specified by
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 9c95a4b062..71b6a254cf 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -321,7 +321,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -531,7 +531,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
* if the slot is not acquired by other processes.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 7dbb44b7b0..7182d89b58 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -108,10 +108,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -159,6 +161,7 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidateSlotForInactiveTimeout(ReplicationSlot *slot);
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
@@ -536,9 +539,14 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * If check_for_timeout_invalidation is true, the slot is checked for
+ * invalidation based on the replication_slot_inactive_timeout GUC and, if it
+ * is found to be invalidated, an error is raised after making the slot ours.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_timeout_invalidation)
{
ReplicationSlot *s;
int active_pid;
@@ -616,6 +624,34 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * Check if the given slot can be invalidated based on its inactive
+ * timeout. If yes, persist the invalidated state to disk and then error
+ * out. We do this only after making the slot ours to avoid anyone else
+ * acquiring it while we check for its invalidation.
+ */
+ if (check_for_timeout_invalidation)
+ {
+ /* The slot is ours by now */
+ Assert(s->active_pid == MyProcPid);
+
+ if (InvalidateReplicationSlotForInactiveTimeout(s, true))
+ {
+ /*
+ * If the slot has been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(MyReplicationSlot->data.name)),
+ errdetail("This slot has been invalidated because it was inactive for more than the time specified by replication_slot_inactive_timeout parameter.")));
+ }
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -790,7 +826,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -813,7 +849,7 @@ ReplicationSlotAlter(const char *name, bool failover)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -989,6 +1025,20 @@ ReplicationSlotDropPtr(ReplicationSlot *slot)
LWLockRelease(ReplicationSlotAllocationLock);
}
+/*
+ * Helper for ReplicationSlotSave
+ */
+static inline void
+SaveGivenReplicationSlot(ReplicationSlot *slot, int elevel)
+{
+ char path[MAXPGPATH];
+
+ Assert(slot != NULL);
+
+ sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+ SaveSlotToPath(slot, path, elevel);
+}
+
/*
* Serialize the currently acquired slot's state from memory to disk, thereby
* guaranteeing the current state will survive a crash.
@@ -996,12 +1046,21 @@ ReplicationSlotDropPtr(ReplicationSlot *slot)
void
ReplicationSlotSave(void)
{
- char path[MAXPGPATH];
+ SaveGivenReplicationSlot(MyReplicationSlot, ERROR);
+}
- Assert(MyReplicationSlot != NULL);
+/*
+ * Helper for ReplicationSlotMarkDirty
+ */
+static inline void
+MarkGivenReplicationSlotDirty(ReplicationSlot *slot)
+{
+ Assert(slot != NULL);
- sprintf(path, "pg_replslot/%s", NameStr(MyReplicationSlot->data.name));
- SaveSlotToPath(MyReplicationSlot, path, ERROR);
+ SpinLockAcquire(&slot->mutex);
+ slot->just_dirtied = true;
+ slot->dirty = true;
+ SpinLockRelease(&slot->mutex);
}
/*
@@ -1014,14 +1073,7 @@ ReplicationSlotSave(void)
void
ReplicationSlotMarkDirty(void)
{
- ReplicationSlot *slot = MyReplicationSlot;
-
- Assert(MyReplicationSlot != NULL);
-
- SpinLockAcquire(&slot->mutex);
- MyReplicationSlot->just_dirtied = true;
- MyReplicationSlot->dirty = true;
- SpinLockRelease(&slot->mutex);
+ MarkGivenReplicationSlotDirty(MyReplicationSlot);
}
/*
@@ -1515,6 +1567,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by replication_slot_inactive_timeout parameter."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1559,6 +1614,13 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ /*
+ * This function isn't expected to be called for inactive timeout based
+ * invalidation. A separate function
+ * InvalidateReplicationSlotForInactiveTimeout is to be used for that.
+ */
+ Assert(cause != RS_INVAL_INACTIVE_TIMEOUT);
+
for (;;)
{
XLogRecPtr restart_lsn;
@@ -1628,6 +1690,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ /* not reachable */
+ Assert(false);
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1781,6 +1847,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1796,6 +1863,13 @@ InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
Assert(cause != RS_INVAL_WAL_REMOVED || oldestSegno > 0);
Assert(cause != RS_INVAL_NONE);
+ /*
+ * This function isn't expected to be called for inactive timeout based
+ * invalidation. A separate function
+ * InvalidateReplicationSlotForInactiveTimeout is to be used for that.
+ */
+ Assert(cause != RS_INVAL_INACTIVE_TIMEOUT);
+
if (max_replication_slots == 0)
return invalidated;
@@ -1832,6 +1906,81 @@ restart:
return invalidated;
}
+/*
+ * Invalidate given slot based on replication_slot_inactive_timeout GUC.
+ *
+ * Returns true if the slot got invalidated.
+ *
+ * NB - this function also runs as part of checkpoint, so avoid raising errors
+ * if possible.
+ */
+bool
+InvalidateReplicationSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool persist_state)
+{
+ if (!InvalidateSlotForInactiveTimeout(slot))
+ return false;
+
+ /* Make sure the invalidated state persists across server restart */
+ MarkGivenReplicationSlotDirty(slot);
+
+ if (persist_state)
+ SaveGivenReplicationSlot(slot, ERROR);
+
+ ReportSlotInvalidation(RS_INVAL_INACTIVE_TIMEOUT, false, 0,
+ slot->data.name, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, InvalidTransactionId);
+
+ return true;
+}
+
+/*
+ * Helper for InvalidateReplicationSlotForInactiveTimeout
+ */
+static bool
+InvalidateSlotForInactiveTimeout(ReplicationSlot *slot)
+{
+ ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+
+ if (replication_slot_inactive_timeout == 0)
+ return false;
+ else if (slot->inactive_since > 0)
+ {
+ TimestampTz now;
+
+ /*
+ * Do not invalidate the slots which are currently being synced from
+ * the primary to the standby.
+ */
+ if (RecoveryInProgress() && slot->data.synced)
+ return false;
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+ SpinLockAcquire(&slot->mutex);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC. We do this with the spinlock
+ * held to avoid race conditions -- for example the inactive_since
+ * could change, or the slot could be dropped.
+ */
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(slot->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
+ slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT;
+ }
+
+ SpinLockRelease(&slot->mutex);
+ LWLockRelease(ReplicationSlotControlLock);
+
+ return (invalidation_cause == RS_INVAL_INACTIVE_TIMEOUT);
+ }
+
+ return false;
+}
+
/*
* Flush all replication slots to disk.
*
@@ -1844,6 +1993,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated = false;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -1867,6 +2017,13 @@ CheckPointReplicationSlots(bool is_shutdown)
/* save the slot to disk, locking is handled in SaveSlotToPath() */
sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
+ /*
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
+ */
+ if (InvalidateReplicationSlotForInactiveTimeout(s, false))
+ invalidated = true;
+
/*
* Slot's data is not flushed each time the confirmed_flush LSN is
* updated as that could lead to frequent writes. However, we decide
@@ -1893,6 +2050,13 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ /* If any slot has been invalidated, recalculate the resource limits */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index da57177c25..677c0bf0a2 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -651,7 +651,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bc40c454de..96eeb8b7d2 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -846,7 +846,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1459,7 +1459,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 92fcd5fa4d..c63f76505f 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2971,6 +2971,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index adcc0257f9..18dd57e589 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -334,6 +334,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7b937d1a0c..8f5e602745 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -230,6 +232,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -245,7 +248,8 @@ extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, bool failover);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_timeout_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(void);
extern void ReplicationSlotSave(void);
@@ -264,6 +268,8 @@ extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
+extern bool InvalidateReplicationSlotForInactiveTimeout(ReplicationSlot *slot,
+ bool persist_state);
extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock);
extern int ReplicationSlotIndex(ReplicationSlot *slot);
extern bool ReplicationSlotName(int index, Name name);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..8e919915f1
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,278 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot as well as logical
+# failover slot on primary due to inactive timeout GUC. Also, check the logical
+# failover slot synced onto the standby doesn't get invalidated on its own,
+# but gets the invalidated state from the remote slot on the primary.
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoint during the test, otherwise, the test can get unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr_1 = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb1_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('lsub1_sync_slot', 'test_decoding', false, false, true);}
+);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot');
+]);
+
+$standby1->start;
+
+my $standby1_logstart = -s $standby1->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Synchronize the primary server slots to the standby.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced'.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot has synced as true on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$primary->reload;
+
+# Wait for the logical failover slot to become inactive on the primary. Note
+# that nobody has acquired that slot yet, so due to inactive timeout setting
+# above it must get invalidated.
+wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart);
+
+# Set timeout on the standby also to check the synced slots don't get
+# invalidated due to timeout on the standby.
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$standby1->reload;
+
+# Now, sync the logical failover slot from the remote slot on the primary.
+# Note that the remote slot has already been invalidated due to inactive
+# timeout. Now, the standby must also see it as invalidated.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Wait for the inactive replication slot to be invalidated.
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for replication slot lsub1_sync_slot invalidation to be synced on standby";
+
+# Synced slot mustn't get invalidated on the standby, it must sync invalidation
+# from the primary. So, we must not see the slot's invalidation message in server
+# log.
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"lsub1_sync_slot\"",
+ $standby1_logstart),
+ 'check that synced slot has not been invalidated on the standby');
+
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive
+wait_for_slot_invalidation($primary, 'sb1_slot', $logstart);
+
+# Testcase end: Invalidate streaming standby's slot as well as logical failover
+# slot on primary due to inactive timeout GUC. Also, check the logical failover
+# slot synced onto the standby doesn't get invalidated on its own, but
+# gets the invalidated state from the remote slot on the primary.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to inactive timeout
+# GUC.
+
+my $publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$subscriber->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart);
+
+# Testcase end: Invalidate logical subscriber's slot due to inactive timeout
+# GUC.
+# =============================================================================
+
+# =============================================================================
+# Start: Helper functions used for this test file
+
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $name = $node->name;
+
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for replication slot to become inactive";
+
+ # Wait for the replication slot info to be updated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for info of replication slot $slot_name to be updated on node $name";
+
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+
+ # Wait for the inactive replication slot to be invalidated.
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for inactive replication slot $slot_name to be invalidated on node $name";
+
+ # Check that the invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot_name', '0/1');
+ ]);
+
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot_name"/,
+ "detected error upon trying to acquire invalidated slot $slot_name on node $name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot_name";
+}
+
+# Check for invalidation of slot in server log.
+sub check_for_slot_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"",
+ $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot_name invalidation has been logged");
+}
+
+# =============================================================================
+# End: Helper functions used for this test file
+
+done_testing();
--
2.34.1
On Fri, Mar 29, 2024 at 9:39 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Commit message states: "why we can't just update inactive_since for
synced slots on the standby with the value received from remote slot
on the primary. This is consistent with any other slot parameter i.e.
all of them are synced from the primary."
The inactive_since is not consistent with other slot parameters which
we copy. We don't perform anything related to those other parameters
like say two_phase phase which can change that property. However, we
do acquire the slot, advance the slot (as per recent discussion [1]),
and release it. Since these operations can impact inactive_since, it
seems to me that inactive_since is not the same as other parameters.
It can have a different value than the primary. Why would anyone want
to know the value of inactive_since from primary after the standby is
promoted?
After thinking about it for a while now, it feels to me that the
synced slots (slots on the standby that are being synced from the
primary) can have their own inactive_since value. Fundamentally,
inactive_since is set to 0 when slot is acquired and set to current
time when slot is released, no matter who acquires and releases it -
be it walsenders for replication, or backends for slot advance, or
backends for slot sync using pg_sync_replication_slots, or backends
for other slot functions, or background sync worker. Remember the
earlier patch was updating inactive_since just for walsenders, but
then the suggestion was to update it unconditionally -
/messages/by-id/CAJpy0uD64X=2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg@mail.gmail.com.
Whoever syncs the slot, *actually* acquires the slot i.e. makes it
theirs, syncs it from the primary, and releases it. IMO, no
differentiation is to be made for synced slots.
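To illustrate the observable behavior (just an illustration using existing
pg_replication_slots columns, nothing new from the patch):
-- While a slot is acquired (active = t), inactive_since reads as NULL;
-- once the slot is released, it shows the release time.
SELECT slot_name, active, active_pid, inactive_since
FROM pg_replication_slots;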
There was a suggestion on using inactive_since of the synced slot on
the standby to know the inactivity of the slot on the primary. If one
wants to do that, they better look at/monitor the primary slot
info/logs/pg_replication_slot/whatever. I really don't see a point in
having two different meanings for a single property of a replication
slot - inactive_since for a regular slot tells since when this slot
has become inactive, and for a synced slot since when the
corresponding remote slot has become inactive. I think this will
confuse users for sure.
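For example, a monitoring query along those lines could be as simple as
(illustration only, using the existing inactive_since column):
-- Slots on the primary that have been lying inactive for more than a day.
SELECT slot_name, slot_type, inactive_since
FROM pg_replication_slots
WHERE NOT active
  AND inactive_since < now() - interval '1 day';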
Also, if inactive_since is being changed on the primary so frequently,
and none of the other parameters are changing, if we copy
inactive_since to the synced slots, then standby will just be doing
*sync* work (mark the slots dirty and save to disk) for updating
inactive_since. I think this is unnecessary behaviour for sure.
Coming to a future patch for inactive timeout based slot invalidation,
we can either allow invalidation without any differentiation for
synced slots or restrict invalidation to avoid more sync work. For
instance, if inactive timeout is kept low on the standby, the sync
worker will be doing more work as it drops and recreates a slot
repeatedly if it keeps getting invalidated. Another thing is that the
standby takes independent invalidation decisions for synced slots.
AFAICS, invalidation due to wal_removal is the only sole reason (out
of all available invalidation reasons) for a synced slot to get
invalidated independently of the primary. Check
/messages/by-id/CAA4eK1JXBwTaDRD_=8t6UB1fhRNjC1C+gH4YdDxj_9U6djLnXw@mail.gmail.com
for the suggestion that we better not differentiate invalidation
decisions for synced slots.
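To make the standby-side picture visible, one could compare how long each
synced slot has been inactive locally with the locally configured timeout
(a sketch assuming 0002 is applied, since it adds the
replication_slot_inactive_timeout GUC):
-- On the standby: local inactivity of synced slots vs. the local setting.
SELECT slot_name, now() - inactive_since AS inactive_for,
       current_setting('replication_slot_inactive_timeout') AS local_timeout
FROM pg_replication_slots
WHERE synced;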
The assumption of letting synced slots have their own inactive_since
not only simplifies the code, but also looks less-confusing and more
meaningful to the user. The only code that we put in on top of the
committed code is to use InRecovery in place of
RecoveryInProgress() in RestoreSlotFromDisk() to fix the issue raised
by Shveta upthread.
Now, the other concern is that calling GetCurrentTimestamp()
could be costly when the values for the slot are not going to be
updated but if that happens we can optimize such that before acquiring
the slot we can have some minimal pre-checks to ensure whether we need
to update the slot or not.
[1] - /messages/by-id/OS0PR01MB571615D35F486080616CA841943A2@OS0PR01MB5716.jpnprd01.prod.outlook.com
A quick test with a function to measure the cost of
GetCurrentTimestamp() [1] on my Ubuntu dev system (an AWS EC2 c5.4xlarge
instance) gives me [2]. It took 0.388 ms, 2.269 ms, 21.144 ms,
209.333 ms, 2091.174 ms, 20908.942 ms for 10K, 100K, 1million,
10million, 100million, 1billion times respectively. Costs might be
different on various systems with different OS, but it gives us a
rough idea.
If we are too much concerned about the cost of GetCurrentTimestamp(),
a possible approach is just don't set inactive_since for slots being
synced on the standby. Just let the first acquisition and release
after the promotion do that job. We can always call this out in the
docs saying "replication slots on the streaming standbys which are
being synced from the primary are not inactive in practice, so the
inactive_since is always NULL for them unless the standby is
promoted".
[1]:
Datum
pg_get_current_timestamp(PG_FUNCTION_ARGS)
{
int loops = PG_GETARG_INT32(0);
TimestampTz ctime;
for (int i = 0; i < loops; i++)
ctime = GetCurrentTimestamp();
PG_RETURN_TIMESTAMPTZ(ctime);
}
[2]:
postgres=# \timing
Timing is on.
postgres=# SELECT pg_get_current_timestamp(1000000000);
pg_get_current_timestamp
-------------------------------
2024-03-30 19:07:57.374797+00
(1 row)
Time: 20908.942 ms (00:20.909)
postgres=# SELECT pg_get_current_timestamp(100000000);
pg_get_current_timestamp
-------------------------------
2024-03-30 19:08:21.038064+00
(1 row)
Time: 2091.174 ms (00:02.091)
postgres=# SELECT pg_get_current_timestamp(10000000);
pg_get_current_timestamp
-------------------------------
2024-03-30 19:08:24.329949+00
(1 row)
Time: 209.333 ms
postgres=# SELECT pg_get_current_timestamp(1000000);
pg_get_current_timestamp
-------------------------------
2024-03-30 19:08:26.978016+00
(1 row)
Time: 21.144 ms
postgres=# SELECT pg_get_current_timestamp(100000);
pg_get_current_timestamp
-------------------------------
2024-03-30 19:08:29.142248+00
(1 row)
Time: 2.269 ms
postgres=# SELECT pg_get_current_timestamp(10000);
pg_get_current_timestamp
------------------------------
2024-03-30 19:08:31.34621+00
(1 row)
Time: 0.388 ms
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Mar 29, 2024 at 6:17 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote:
On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote:
Commit message states: "why we can't just update inactive_since for
synced slots on the standby with the value received from remote slot
on the primary. This is consistent with any other slot parameter i.e.
all of them are synced from the primary."
The inactive_since is not consistent with other slot parameters which
we copy. We don't perform anything related to those other parameters
like say two_phase phase which can change that property. However, we
do acquire the slot, advance the slot (as per recent discussion [1]),
and release it. Since these operations can impact inactive_since, it
seems to me that inactive_since is not the same as other parameters.
It can have a different value than the primary. Why would anyone want
to know the value of inactive_since from primary after the standby is
promoted?
I think it can be useful "before" it is promoted and in case the primary is down.
It is not clear to me what is user going to do by checking the
inactivity time for slots when the corresponding server is down.
Say a failover needs to be done, then it could be useful to know for which
slots the activity needs to be resumed (thinking about external logical decoding
plugin, not about pub/sub here). If one see an inactive slot (since long "enough")
then he can start to reasonate about what to do with it.
I thought the idea was to check such slots and see if they need to be
dropped or enabled again to avoid excessive disk usage, etc.
Yeah that's the case but it does not mean inactive_since can't be useful in other
ways.
Also, say the slot has been invalidated on the primary (due to inactivity timeout),
primary is down and there is a failover. By keeping the inactive_since from
the primary, one could know when the inactivity that lead to the timeout started.
So, this means at promotion, we won't set the current_time for
inactive_since which is not what the currently proposed patch is
doing. Moreover, doing the invalidation on promoted standby based on
inactive_since of the primary node sounds debatable because the
inactive_timeout could be different on the new node (promoted
standby).
Again, more concerned about external logical decoding plugin than pub/sub here.
I agree that tracking the activity time of a synced slot can be useful, why
not creating a dedicated field for that purpose (and keep inactive_since a
perfect "copy" of the primary)?We can have a separate field for this but not sure if it is worth it.
OTOH I'm not sure that erasing this information from the primary is useful. I
think that 2 fields would be the best option and would be less subject of
misinterpretation.
Now, the other concern is that calling GetCurrentTimestamp()
could be costly when the values for the slot are not going to be
updated but if that happens we can optimize such that before acquiring
the slot we can have some minimal pre-checks to ensure whether we need
to update the slot or not.
Right, but for a very active slot it is likely that we call GetCurrentTimestamp()
during almost each sync cycle.
True, but if we have to save a slot to disk each time to persist the
changes (for an active slot) then probably GetCurrentTimestamp()
shouldn't be costly enough to matter.
Right, persisting the changes to disk would be even more costly.
The point I was making is that currently after copying the
remote_node's values, we always persist the slots to disk, so the cost
of current_time shouldn't be much. Now, if the values won't change
then probably there is some cost but in most cases (active slots), the
values will always change. Also, if all the slots are inactive then we
will slow down the speed of sync. We also need to consider if we want
to copy the value of inactive_since from the primary and if that is
the only value changed then shall we persist the slot or not?
--
With Regards,
Amit Kapila.
Hi,
On Mon, Apr 01, 2024 at 09:04:43AM +0530, Amit Kapila wrote:
On Fri, Mar 29, 2024 at 6:17 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote:
On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote:
Commit message states: "why we can't just update inactive_since for
synced slots on the standby with the value received from remote slot
on the primary. This is consistent with any other slot parameter i.e.
all of them are synced from the primary."
The inactive_since is not consistent with other slot parameters which
we copy. We don't perform anything related to those other parameters
like say two_phase phase which can change that property. However, we
do acquire the slot, advance the slot (as per recent discussion [1]),
and release it. Since these operations can impact inactive_since, it
seems to me that inactive_since is not the same as other parameters.
It can have a different value than the primary. Why would anyone want
to know the value of inactive_since from primary after the standby is
promoted?
I think it can be useful "before" it is promoted and in case the primary is down.
It is not clear to me what is user going to do by checking the
inactivity time for slots when the corresponding server is down.
Say a failover needs to be done, then it could be useful to know for which
slots the activity needs to be resumed (thinking about external logical decoding
plugin, not about pub/sub here). If one see an inactive slot (since long "enough")
then he can start to reasonate about what to do with it.
I thought the idea was to check such slots and see if they need to be
dropped or enabled again to avoid excessive disk usage, etc.
Yeah that's the case but it does not mean inactive_since can't be useful in other
ways.
Also, say the slot has been invalidated on the primary (due to inactivity timeout),
primary is down and there is a failover. By keeping the inactive_since from
the primary, one could know when the inactivity that lead to the timeout started.
So, this means at promotion, we won't set the current_time for
inactive_since which is not what the currently proposed patch is
doing.
Yeah, that's why I made the comment T2 in [1].
Moreover, doing the invalidation on promoted standby based on
inactive_since of the primary node sounds debatable because the
inactive_timeout could be different on the new node (promoted
standby).
I think that if the slot is not invalidated before the promotion then we should
erase the value from the primary and use the promotion time.
Again, more concerned about external logical decoding plugin than pub/sub here.
I agree that tracking the activity time of a synced slot can be useful, why
not creating a dedicated field for that purpose (and keep inactive_since a
perfect "copy" of the primary)?We can have a separate field for this but not sure if it is worth it.
OTOH I'm not sure that erasing this information from the primary is useful. I
think that 2 fields would be the best option and would be less subject of
misinterpretation.
Now, the other concern is that calling GetCurrentTimestamp()
could be costly when the values for the slot are not going to be
updated but if that happens we can optimize such that before acquiring
the slot we can have some minimal pre-checks to ensure whether we need
to update the slot or not.
Right, but for a very active slot it is likely that we call GetCurrentTimestamp()
during almost each sync cycle.
True, but if we have to save a slot to disk each time to persist the
changes (for an active slot) then probably GetCurrentTimestamp()
shouldn't be costly enough to matter.
Right, persisting the changes to disk would be even more costly.
The point I was making is that currently after copying the
remote_node's values, we always persist the slots to disk, so the cost
of current_time shouldn't be much.
Oh right, I missed this (was focusing only on inactive_since that we don't persist
to disk IIRC).
BTW, if we are going this way, maybe we could accept a bit less accuracy
and use GetCurrentTransactionStopTimestamp() instead?
Now, if the values won't change
then probably there is some cost but in most cases (active slots), the
values will always change.
Right.
Also, if all the slots are inactive then we
will slow down the speed of sync.
Yes.
We also need to consider if we want
to copy the value of inactive_since from the primary and if that is
the only value changed then shall we persist the slot or not?
Good point, then I don't think we should as inactive_since is not persisted on disk.
[1]: /messages/by-id/ZgU70MjdOfO6l0O0@ip-10-97-1-34.eu-west-3.compute.internal
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Mon, Apr 01, 2024 at 08:47:59AM +0530, Bharath Rupireddy wrote:
On Fri, Mar 29, 2024 at 9:39 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Commit message states: "why we can't just update inactive_since for
synced slots on the standby with the value received from remote slot
on the primary. This is consistent with any other slot parameter i.e.
all of them are synced from the primary."
The inactive_since is not consistent with other slot parameters which
we copy. We don't perform anything related to those other parameters
like say two_phase phase which can change that property. However, we
do acquire the slot, advance the slot (as per recent discussion [1]),
and release it. Since these operations can impact inactive_since, it
seems to me that inactive_since is not the same as other parameters.
It can have a different value than the primary. Why would anyone want
to know the value of inactive_since from primary after the standby is
promoted?
After thinking about it for a while now, it feels to me that the
synced slots (slots on the standby that are being synced from the
primary) can have their own inactive_since value. Fundamentally,
inactive_since is set to 0 when slot is acquired and set to current
time when slot is released, no matter who acquires and releases it -
be it walsenders for replication, or backends for slot advance, or
backends for slot sync using pg_sync_replication_slots, or backends
for other slot functions, or background sync worker. Remember the
earlier patch was updating inactive_since just for walsenders, but
then the suggestion was to update it unconditionally -
/messages/by-id/CAJpy0uD64X=2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg@mail.gmail.com.
Whoever syncs the slot, *actually* acquires the slot i.e. makes it
theirs, syncs it from the primary, and releases it. IMO, no
differentiation is to be made for synced slots.
There was a suggestion on using inactive_since of the synced slot on
the standby to know the inactivity of the slot on the primary. If one
wants to do that, they better look at/monitor the primary slot
info/logs/pg_replication_slot/whatever.
Yeah but the use case was in case the primary is down for whatever reason.
I really don't see a point in
having two different meanings for a single property of a replication
slot - inactive_since for a regular slot tells since when this slot
has become inactive, and for a synced slot since when the
corresponding remote slot has become inactive. I think this will
confuse users for sure.
I'm not sure as we are speaking about "synced" slots. I can also see some confusion
if this value is not "synced".
Also, if inactive_since is being changed on the primary so frequently,
and none of the other parameters are changing, if we copy
inactive_since to the synced slots, then standby will just be doing
*sync* work (mark the slots dirty and save to disk) for updating
inactive_since. I think this is unnecessary behaviour for sure.
Right, I think we should avoid the save slot to disk in that case (question raised
by Amit in [1]).
Coming to a future patch for inactive timeout based slot invalidation,
we can either allow invalidation without any differentiation for
synced slots or restrict invalidation to avoid more sync work. For
instance, if inactive timeout is kept low on the standby, the sync
worker will be doing more work as it drops and recreates a slot
repeatedly if it keeps getting invalidated. Another thing is that the
standby takes independent invalidation decisions for synced slots.
AFAICS, invalidation due to wal_removal is the only sole reason (out
of all available invalidation reasons) for a synced slot to get
invalidated independently of the primary. Check
/messages/by-id/CAA4eK1JXBwTaDRD_=8t6UB1fhRNjC1C+gH4YdDxj_9U6djLnXw@mail.gmail.com
for the suggestion that we better not differentiate invalidation
decisions for synced slots.
Yeah, I think the invalidation decision on the standby is highly linked to
what inactive_since on the standby is: synced from primary or not.
The assumption of letting synced slots have their own inactive_since
not only simplifies the code, but also looks less-confusing and more
meaningful to the user.
I'm not sure at all. But if the majority of us thinks it's the case then let's
go that way.
Now, the other concern is that calling GetCurrentTimestamp()
could be costly when the values for the slot are not going to be
updated but if that happens we can optimize such that before acquiring
the slot we can have some minimal pre-checks to ensure whether we need
to update the slot or not.
Also maybe we could accept a bit less accuracy and use
GetCurrentTransactionStopTimestamp() instead?
If we are too much concerned about the cost of GetCurrentTimestamp(),
a possible approach is just don't set inactive_since for slots being
synced on the standby.
Just let the first acquisition and release
after the promotion do that job. We can always call this out in the
docs saying "replication slots on the streaming standbys which are
being synced from the primary are not inactive in practice, so the
inactive_since is always NULL for them unless the standby is
promoted".
I think that was the initial behavior that led to Robert's remark (see [2]):
"
And I'm suspicious that having an exception for slots being synced is
a bad idea. That makes too much of a judgement about how the user will
use this field. It's usually better to just expose the data, and if
the user needs helps to make sense of that data, then give them that
help separately.
"
[1]: /messages/by-id/CAA4eK1JtKieWMivbswYg5FVVB5FugCftLvQKVsxh=m_8nk04vw@mail.gmail.com
[2]: /messages/by-id/CA+Tgmob_Ta-t2ty8QrKHBGnNLrf4ZYcwhGHGFsuUoFrAEDw4sA@mail.gmail.com
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Sun, Mar 31, 2024 at 10:25:46AM +0530, Bharath Rupireddy wrote:
On Thu, Mar 28, 2024 at 3:13 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
I think in this case it should always reflect the value from the primary (so
that one can understand why it is invalidated).
I'll come back to this as soon as we all agree on inactive_since
behavior for synced slots.
Makes sense. Also if the majority of us thinks it's not needed for inactive_since
to be an exact copy of the primary, then let's go that way.
I think when it is invalidated it should always reflect the value from the
primary (so that one can understand why it is invalidated).
I'll come back to this as soon as we all agree on inactive_since
behavior for synced slots.
Yeah.
T4 ===
Also, it looks like querying pg_replication_slots() does not trigger an
invalidation: I think it should if the slot is not invalidated yet (and matches
the invalidation criteria).
There's a different opinion on this, check comment #3 from
/messages/by-id/CAA4eK1LLj+eaMN-K8oeOjfG+UuzTY=L5PXbcMJURZbFm+_aJSA@mail.gmail.com.
Oh right, I can see Amit's point too. Let's put pg_replication_slots() out of
the game then.
CR6 ===
+static bool
+InvalidateSlotForInactiveTimeout(ReplicationSlot *slot, bool need_locks)
+{
InvalidatePossiblyInactiveSlot() maybe?
I think we will lose the essence i.e. timeout from the suggested
function name, otherwise just the inactive doesn't give a clearer
meaning. I kept it that way unless anyone suggests otherwise.
Right. OTOH I think that "Possibly" adds some nuance (like InvalidatePossiblyObsoleteSlot()
is already doing).
Please see the attached v30 patch. 0002 is where all of the above
review comments have been addressed.
Thanks! FYI, I did not look at the content yet, just replied to the above
comments.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon, Apr 1, 2024 at 12:18 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Fri, Mar 29, 2024 at 9:39 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Commit message states: "why we can't just update inactive_since for
synced slots on the standby with the value received from remote slot
on the primary. This is consistent with any other slot parameter i.e.
all of them are synced from the primary."
The inactive_since is not consistent with other slot parameters which
we copy. We don't perform anything related to those other parameters
like say two_phase which can change that property. However, we
do acquire the slot, advance the slot (as per recent discussion [1]),
and release it. Since these operations can impact inactive_since, it
seems to me that inactive_since is not the same as other parameters.
It can have a different value than the primary. Why would anyone want
to know the value of inactive_since from primary after the standby is
promoted?
After thinking about it for a while now, it feels to me that the
synced slots (slots on the standby that are being synced from the
primary) can have their own inactive_since value. Fundamentally,
inactive_since is set to 0 when the slot is acquired and set to the current
time when the slot is released, no matter who acquires and releases it -
be it walsenders for replication, or backends for slot advance, or
backends for slot sync using pg_sync_replication_slots, or backends
for other slot functions, or background sync worker. Remember the
earlier patch was updating inactive_since just for walsenders, but
then the suggestion was to update it unconditionally -
/messages/by-id/CAJpy0uD64X=2ENmbHaRiWTKeQawr-rbGoy_GdhQQLVXzUSKTMg@mail.gmail.com.
Whoever syncs the slot *actually* acquires the slot i.e. makes it
theirs, syncs it from the primary, and releases it. IMO, no
differentiation is to be made for synced slots.
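In code terms, the semantics I have in mind look roughly like this (a
simplified sketch with hypothetical helper names, not the actual
ReplicationSlotAcquire()/ReplicationSlotRelease() bodies; everything beyond
the slot spinlock is omitted):

#include "postgres.h"

#include "miscadmin.h"			/* MyProcPid */
#include "replication/slot.h"
#include "storage/spin.h"
#include "utils/timestamp.h"

/* On acquisition the slot is in use, so inactive_since is cleared ... */
static void
sketch_on_acquire(ReplicationSlot *s)
{
	SpinLockAcquire(&s->mutex);
	s->active_pid = MyProcPid;
	s->inactive_since = 0;
	SpinLockRelease(&s->mutex);
}

/* ... and on release it is set to the current time, whoever the owner was. */
static void
sketch_on_release(ReplicationSlot *s)
{
	TimestampTz now = GetCurrentTimestamp();

	SpinLockAcquire(&s->mutex);
	s->active_pid = 0;
	s->inactive_since = now;
	SpinLockRelease(&s->mutex);
}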
FWIW, coming to this thread late, I think that the inactive_since
should not be synchronized from the primary. The wall clocks are
different on the primary and the standby so having the primary's
timestamp on the standby can confuse users, especially when there is a
big clock drift. Also, as Amit mentioned, inactive_since seems not to
be consistent with other parameters we copy. The
replication_slot_inactive_timeout feature should work on the standby
independent from the primary, like other slot invalidation mechanisms,
and it should be based on its own local clock.
Coming to a future patch for inactive timeout based slot invalidation,
we can either allow invalidation without any differentiation for
synced slots or restrict invalidation to avoid more sync work. For
instance, if inactive timeout is kept low on the standby, the sync
worker will be doing more work as it drops and recreates a slot
repeatedly if it keeps getting invalidated. Another thing is that the
standby takes independent invalidation decisions for synced slots.
AFAICS, invalidation due to wal_removal is the only reason (out
of all available invalidation reasons) for a synced slot to get
invalidated independently of the primary. Check
/messages/by-id/CAA4eK1JXBwTaDRD_=8t6UB1fhRNjC1C+gH4YdDxj_9U6djLnXw@mail.gmail.com
for the suggestion that we better not differentiate invalidation
decisions for synced slots.
The assumption of letting synced slots have their own inactive_since
not only simplifies the code, but also looks less-confusing and more
meaningful to the user. The only code that we put in on top of the
committed code is to use InRecovery in place of
RecoveryInProgress() in RestoreSlotFromDisk() to fix the issue raised
by Shveta upthread.
If we want to invalidate the synced slots due to the timeout, I think
we need to define what is "inactive" for synced slots.
Suppose that the slotsync worker updates the local (synced) slot's
inactive_since whenever releasing the slot, irrespective of the actual
LSNs (or other slot parameters) having been updated. I think that this
idea cannot handle a slot that is not acquired on the primary. In this
case, the remote slot is inactive but the local slot is regarded as
active. WAL files are piled up on the standby (and on the primary) as
the slot's LSNs don't move forward. I think we want to regard such a
slot as "inactive" also on the standby and invalidate it because of
the timeout.
Now, the other concern is that calling GetCurrentTimestamp()
could be costly when the values for the slot are not going to be
updated but if that happens we can optimize such that before acquiring
the slot we can have some minimal pre-checks to ensure whether we need
to update the slot or not.
If we use such pre-checks, another problem might happen; it cannot
handle a case where the slot is acquired on the primary but its LSNs
don't move forward. Imagine a logical replication conflict happened on
the subscriber, and the logical replication enters the retry loop. In
this case, the remote slot's inactive_since gets updated for every
retry, but it looks inactive from the standby since the slot LSNs
don't change. Therefore, only the local slot could be invalidated due
to the timeout but probably we don't want to regard such a slot as
"inactive".
Another idea I came up with is that the slotsync worker updates the
local slot's inactive_since to the local timestamp only when the
remote slot might have got inactive. If the remote slot is acquired by
someone, the local slot's inactive_since is also NULL. If the remote
slot gets inactive, the slotsync worker sets the local timestamp to
the local slot's inactive_since. Since the remote slot could be
acquired and released before the slotsync worker gets the remote slot
data again, if the remote slot's inactive_since > the local slot's
inactive_since, the slotsync worker updates the local one. IOW, we
detect whether the remote slot was acquired and released since the
last synchronization, by checking the remote slot's inactive_since.
This idea seems to handle these cases I mentioned unless I'm missing
something, but it requires for the slotsync worker to update
inactive_since in a different way than other parameters.
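A rough sketch of that idea follows; it is purely illustrative (locking
omitted) and assumes the worker also fetches the remote slot's
inactive_since, which the sync machinery does not currently do, with
sketch_sync_inactive_since() being a hypothetical name:

#include "postgres.h"

#include "replication/slot.h"
#include "utils/timestamp.h"

/*
 * remote_inactive_since is the value fetched from the primary, 0 meaning the
 * remote slot is currently acquired.  It is only used to detect that the
 * remote slot was released since the last sync cycle; the local field is
 * always set from the standby's own clock.
 */
static void
sketch_sync_inactive_since(TimestampTz remote_inactive_since,
						   ReplicationSlot *local_slot)
{
	if (remote_inactive_since == 0)
	{
		/* remote slot is in use, so mirror that locally */
		local_slot->inactive_since = 0;
	}
	else if (remote_inactive_since > local_slot->inactive_since)
	{
		/*
		 * The remote slot was (acquired and) released after our last sync;
		 * restart the local timer.  The fragile part is that this compares
		 * timestamps taken on two different servers.
		 */
		local_slot->inactive_since = GetCurrentTimestamp();
	}
}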
Or a simple solution is that the slotsync worker updates
inactive_since as it does for non-synced slots, and disables
timeout-based slot invalidation for synced slots.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Hi,
On Tue, Apr 02, 2024 at 12:07:54PM +0900, Masahiko Sawada wrote:
On Mon, Apr 1, 2024 at 12:18 PM Bharath Rupireddy
FWIW, coming to this thread late, I think that the inactive_since
should not be synchronized from the primary. The wall clocks are
different on the primary and the standby so having the primary's
timestamp on the standby can confuse users, especially when there is a
big clock drift. Also, as Amit mentioned, inactive_since seems not to
be consistent with other parameters we copy. The
replication_slot_inactive_timeout feature should work on the standby
independent from the primary, like other slot invalidation mechanisms,
and it should be based on its own local clock.
Thanks for sharing your thoughts! So, it looks like most of us agree not to
sync inactive_since from the primary; I'm fine with that.
If we want to invalidate the synced slots due to the timeout, I think
we need to define what is "inactive" for synced slots.
Suppose that the slotsync worker updates the local (synced) slot's
inactive_since whenever releasing the slot, irrespective of the actual
LSNs (or other slot parameters) having been updated. I think that this
idea cannot handle a slot that is not acquired on the primary. In this
case, the remote slot is inactive but the local slot is regarded as
active. WAL files are piled up on the standby (and on the primary) as
the slot's LSNs don't move forward. I think we want to regard such a
slot as "inactive" also on the standby and invalidate it because of
the timeout.
I think that makes sense to somehow link inactive_since on the standby to
the actual LSNs (or other slot parameters) being updated or not.
Now, the other concern is that calling GetCurrentTimestamp()
could be costly when the values for the slot are not going to be
updated but if that happens we can optimize such that before acquiring
the slot we can have some minimal pre-checks to ensure whether we need
to update the slot or not.
If we use such pre-checks, another problem might happen; it cannot
handle a case where the slot is acquired on the primary but its LSNs
don't move forward. Imagine a logical replication conflict happened on
the subscriber, and the logical replication enters the retry loop. In
this case, the remote slot's inactive_since gets updated for every
retry, but it looks inactive from the standby since the slot LSNs
don't change. Therefore, only the local slot could be invalidated due
to the timeout but probably we don't want to regard such a slot as
"inactive".Another idea I came up with is that the slotsync worker updates the
local slot's inactive_since to the local timestamp only when the
remote slot might have got inactive. If the remote slot is acquired by
someone, the local slot's inactive_since is also NULL. If the remote
slot gets inactive, the slotsync worker sets the local timestamp to
the local slot's inactive_since. Since the remote slot could be
acquired and released before the slotsync worker gets the remote slot
data again, if the remote slot's inactive_since > the local slot's
inactive_since, the slotsync worker updates the local one.
Then I think we would need to be careful about time zone comparison.
IOW, we
detect whether the remote slot was acquired and released since the
last synchronization, by checking the remote slot's inactive_since.
This idea seems to handle these cases I mentioned unless I'm missing
something, but it requires for the slotsync worker to update
inactive_since in a different way than other parameters.
Or a simple solution is that the slotsync worker updates
inactive_since as it does for non-synced slots, and disables
timeout-based slot invalidation for synced slots.
Yeah, I think the main question to help us decide is: do we want to invalidate
"inactive" synced slots locally (in addition to synchronizing the invalidation
from the primary)?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Or a simple solution is that the slotsync worker updates
inactive_since as it does for non-synced slots, and disables
timeout-based slot invalidation for synced slots.
Yeah, I think the main question to help us decide is: do we want to invalidate
"inactive" synced slots locally (in addition to synchronizing the invalidation
from the primary)?
I think this approach looks way simpler than the other one. The other
approach of linking inactive_since on the standby for synced slots to
the actual LSNs (or other slot parameters) being updated or not looks
more complicated, and might not go well with the end user. However,
we need to be able to say why we don't invalidate synced slots due to
inactive timeout unlike the wal_removed invalidation that can happen
right now on the standby for synced slots. This leads us to define
actually what a slot being active means. Is syncing the data from the
remote slot considered as the slot being active?
On the other hand, it may not sound great if we don't invalidate
synced slots due to inactive timeout even though they hold resources
such as WAL and XIDs.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Tue, Apr 02, 2024 at 12:41:35PM +0530, Bharath Rupireddy wrote:
On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Or a simple solution is that the slotsync worker updates
inactive_since as it does for non-synced slots, and disables
timeout-based slot invalidation for synced slots.
Yeah, I think the main question to help us decide is: do we want to invalidate
"inactive" synced slots locally (in addition to synchronizing the invalidation
from the primary)?
I think this approach looks way simpler than the other one. The other
approach of linking inactive_since on the standby for synced slots to
the actual LSNs (or other slot parameters) being updated or not looks
more complicated, and might not go well with the end user. However,
we need to be able to say why we don't invalidate synced slots due to
inactive timeout unlike the wal_removed invalidation that can happen
right now on the standby for synced slots. This leads us to define
actually what a slot being active means. Is syncing the data from the
remote slot considered as the slot being active?
On the other hand, it may not sound great if we don't invalidate
synced slots due to inactive timeout even though they hold resources
such as WAL and XIDs.
Right and the "only" benefit then would be to give an idea as to when the last
sync did occur on the local slot.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hi,
On Tue, Apr 02, 2024 at 12:07:54PM +0900, Masahiko Sawada wrote:
On Mon, Apr 1, 2024 at 12:18 PM Bharath Rupireddy
FWIW, coming to this thread late, I think that the inactive_since
should not be synchronized from the primary. The wall clocks are
different on the primary and the standby so having the primary's
timestamp on the standby can confuse users, especially when there is a
big clock drift. Also, as Amit mentioned, inactive_since seems not to
be consistent with other parameters we copy. The
replication_slot_inactive_timeout feature should work on the standby
independent from the primary, like other slot invalidation mechanisms,
and it should be based on its own local clock.
Thanks for sharing your thoughts! So, it looks like most of us agree not to
sync inactive_since from the primary; I'm fine with that.
+1 on not syncing inactive_since from the primary.
If we want to invalidate the synced slots due to the timeout, I think
we need to define what is "inactive" for synced slots.Suppose that the slotsync worker updates the local (synced) slot's
inactive_since whenever releasing the slot, irrespective of the actual
LSNs (or other slot parameters) having been updated. I think that this
idea cannot handle a slot that is not acquired on the primary. In this
case, the remote slot is inactive but the local slot is regarded as
active. WAL files are piled up on the standby (and on the primary) as
the slot's LSNs don't move forward. I think we want to regard such a
slot as "inactive" also on the standby and invalidate it because of
the timeout.
I think that makes sense to somehow link inactive_since on the standby to
the actual LSNs (or other slot parameters) being updated or not.
Now, the other concern is that calling GetCurrentTimestamp()
could be costly when the values for the slot are not going to be
updated but if that happens we can optimize such that before acquiring
the slot we can have some minimal pre-checks to ensure whether we need
to update the slot or not.
If we use such pre-checks, another problem might happen; it cannot
handle a case where the slot is acquired on the primary but its LSNs
don't move forward. Imagine a logical replication conflict happened on
the subscriber, and the logical replication enters the retry loop. In
this case, the remote slot's inactive_since gets updated for every
retry, but it looks inactive from the standby since the slot LSNs
don't change. Therefore, only the local slot could be invalidated due
to the timeout but probably we don't want to regard such a slot as
"inactive".Another idea I came up with is that the slotsync worker updates the
local slot's inactive_since to the local timestamp only when the
remote slot might have got inactive. If the remote slot is acquired by
someone, the local slot's inactive_since is also NULL. If the remote
slot gets inactive, the slotsync worker sets the local timestamp to
the local slot's inactive_since. Since the remote slot could be
acquired and released before the slotsync worker gets the remote slot
data again, if the remote slot's inactive_since > the local slot's
inactive_since, the slotsync worker updates the local one.
Then I think we would need to be careful about time zone comparison.
Yes. Also we need to consider the case when a user is relying on
pg_sync_replication_slots() and has not enabled the slot-sync worker. In
such a case, if the synced slot's inactive_since is derived from the inactivity
of the remote slot, it might not be updated that frequently (based on when
the user actually runs the SQL function) and thus may be misleading.
OTOH, if inactive_since of synced slots represents its own
inactivity, then it will give correct info even for the case when the
SQL function is run after a long time and slot-sync worker is
disabled.
IOW, we
detect whether the remote slot was acquired and released since the
last synchronization, by checking the remote slot's inactive_since.
This idea seems to handle these cases I mentioned unless I'm missing
something, but it requires for the slotsync worker to update
inactive_since in a different way than other parameters.
Or a simple solution is that the slotsync worker updates
inactive_since as it does for non-synced slots, and disables
timeout-based slot invalidation for synced slots.
I like this idea better, it takes care of such a case too when the
user is relying on sync-function rather than worker and does not want
to get the slots invalidated in between 2 sync function calls.
Yeah, I think the main question to help us decide is: do we want to invalidate
"inactive" synced slots locally (in addition to synchronizing the invalidation
from the primary)?
thanks
Shveta
On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote:
Or a simple solution is that the slotsync worker updates
inactive_since as it does for non-synced slots, and disables
timeout-based slot invalidation for synced slots.
I like this idea better, it takes care of such a case too when the
user is relying on sync-function rather than worker and does not want
to get the slots invalidated in between 2 sync function calls.
Please find the attached v31 patches implementing the above idea:
- synced slots get their own inactive_since just like any other slot
- synced slots don't get invalidated due to inactive timeout because
such slots are not considered active at all, as they don't perform logical
decoding (of course, they will perform decoding in fast_forward mode to fix
the other data loss issue, but they don't generate changes for them to be
called *active* slots)
- synced slots' inactive_since is set to the current timestamp after the
standby gets promoted, to help interpret inactive_since correctly just
like for any other slot.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v31-0001-Allow-synced-slots-to-have-their-own-inactive_si.patch (application/x-patch)
From fd2cb48726dd4e1932f8809dfb36e0fe9f922226 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 3 Apr 2024 05:03:22 +0000
Subject: [PATCH v31 1/2] Allow synced slots to have their own inactive_since.
---
doc/src/sgml/system-views.sgml | 7 +++
src/backend/replication/logical/slotsync.c | 44 +++++++++++++++++
src/backend/replication/slot.c | 23 +++------
src/test/perl/PostgreSQL/Test/Cluster.pm | 34 +++++++++++++
src/test/recovery/t/019_replslot_limit.pl | 26 +---------
.../t/040_standby_failover_slots_sync.pl | 49 +++++++++++++++++++
6 files changed, 144 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..b64274a1fb 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2530,6 +2530,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
The time since the slot has become inactive.
<literal>NULL</literal> if the slot is currently being used.
+ Note that for slots on the standby that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>), the
+ <structfield>inactive_since</structfield> value will get updated
+ after every synchronization (see
+ <xref linkend="logicaldecoding-replication-slots-synchronization"/>)
+ from the corresponding remote slot on the primary.
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..4050bd40f8 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -140,6 +140,7 @@ typedef struct RemoteSlot
} RemoteSlot;
static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_since(void);
/*
* If necessary, update the local synced slot's metadata based on the data
@@ -1296,6 +1297,46 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
Assert(false);
}
+/*
+ * Update the inactive_since property for synced slots.
+ */
+static void
+update_synced_slots_inactive_since(void)
+{
+ TimestampTz now = 0;
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is a synchronized slot */
+ if (s->in_use && s->data.synced)
+ {
+ Assert(SlotIsLogical(s));
+
+ /*
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
+ */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+
+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart.
+ */
+ SpinLockAcquire(&s->mutex);
+ s->inactive_since = now;
+ SpinLockRelease(&s->mutex);
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
+}
+
/*
* Shut down the slot sync worker.
*/
@@ -1309,6 +1350,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1341,6 +1383,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d778c0b921..5549ca9640 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -42,6 +42,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlogrecovery.h"
+#include "access/xlogutils.h"
#include "common/file_utils.h"
#include "common/string.h"
#include "miscadmin.h"
@@ -690,13 +691,10 @@ ReplicationSlotRelease(void)
}
/*
- * Set the last inactive time after marking the slot inactive. We don't
- * set it for the slots currently being synced from the primary to the
- * standby because such slots are typically inactive as decoding is not
- * allowed on those.
+ * Set the time since the slot has become inactive. We get the current
+ * time beforehand to avoid system call overhead while holding the lock.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
- now = GetCurrentTimestamp();
+ now = GetCurrentTimestamp();
if (slot->data.persistency == RS_PERSISTENT)
{
@@ -2369,16 +2367,11 @@ RestoreSlotFromDisk(const char *name)
slot->active_pid = 0;
/*
- * We set the last inactive time after loading the slot from the disk
- * into memory. Whoever acquires the slot i.e. makes the slot active
- * will reset it. We don't set it for the slots currently being synced
- * from the primary to the standby because such slots are typically
- * inactive as decoding is not allowed on those.
+ * Set the time since the slot has become inactive after loading the
+ * slot from the disk into memory. Whoever acquires the slot i.e.
+ * makes the slot active will reset it.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
- slot->inactive_since = GetCurrentTimestamp();
- else
- slot->inactive_since = 0;
+ slot->inactive_since = GetCurrentTimestamp();
restored = true;
break;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b08296605c..ddfc3236f3 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3276,6 +3276,40 @@ sub create_logical_slot_on_standby
=pod
+=item $node->get_slot_inactive_since_value(self, slot_name, reference_time)
+
+Get inactive_since column value for a given replication slot validating it
+against optional reference time.
+
+=cut
+
+sub get_slot_inactive_since_value
+{
+ my ($self, $slot_name, $reference_time) = @_;
+ my $name = $self->name;
+
+ my $inactive_since = $self->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
+ );
+
+ # Check that the captured time is sane
+ if (defined $reference_time)
+ {
+ is($self->safe_psql('postgres',
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz >= '$reference_time'::timestamptz;]
+ ),
+ 't',
+ "last inactive time for slot $slot_name is valid on node $name")
+ or die "could not validate captured inactive_since for slot $slot_name";
+ }
+
+ return $inactive_since;
+}
+
+=pod
+
=item $node->advance_wal(num)
Advance WAL of node by given number of segments.
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 3b9a306a8b..c8e5e5054d 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -443,7 +443,7 @@ $primary4->safe_psql(
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the standby below.
my $inactive_since =
- capture_and_validate_slot_inactive_since($primary4, $sb4_slot, $slot_creation_time);
+ $primary4->get_slot_inactive_since_value($sb4_slot, $slot_creation_time);
$standby4->start;
@@ -502,7 +502,7 @@ $publisher4->safe_psql('postgres',
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the subscriber below.
$inactive_since =
- capture_and_validate_slot_inactive_since($publisher4, $lsub4_slot, $slot_creation_time);
+ $publisher4->get_slot_inactive_since_value($lsub4_slot, $slot_creation_time);
$subscriber4->start;
$subscriber4->safe_psql('postgres',
@@ -540,26 +540,4 @@ is( $publisher4->safe_psql(
$publisher4->stop;
$subscriber4->stop;
-# Capture and validate inactive_since of a given slot.
-sub capture_and_validate_slot_inactive_since
-{
- my ($node, $slot_name, $slot_creation_time) = @_;
-
- my $inactive_since = $node->safe_psql('postgres',
- qq(SELECT inactive_since FROM pg_replication_slots
- WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
- );
-
- # Check that the captured time is sane
- is( $node->safe_psql(
- 'postgres',
- qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
- '$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
- ),
- 't',
- "last inactive time for an active slot $slot_name is sane");
-
- return $inactive_since;
-}
-
done_testing();
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index f47bfd78eb..50be94e629 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -35,6 +35,13 @@ my $subscriber1 = PostgreSQL::Test::Cluster->new('subscriber1');
$subscriber1->init;
$subscriber1->start;
+# Capture the time before the logical failover slot is created on the
+# primary. We refer to this publisher node as the primary later anyway.
+my $slot_creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Create a slot on the publisher with failover disabled
$publisher->safe_psql('postgres',
"SELECT 'init' FROM pg_create_logical_replication_slot('lsub1_slot', 'pgoutput', false, false, false);"
@@ -174,6 +181,10 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary
+my $inactive_since_on_primary =
+ $primary->get_slot_inactive_since_value('lsub1_slot', $slot_creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -181,6 +192,11 @@ $primary->wait_for_replay_catchup($standby1);
# Synchronize the primary server slots to the standby.
$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+my $slot_sync_time = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Confirm that the logical failover slots are created on the standby and are
# flagged as 'synced'
is( $standby1->safe_psql(
@@ -190,6 +206,19 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Capture the inactive_since of the synced slot on the standby
+my $inactive_since_on_standby =
+ $standby1->get_slot_inactive_since_value('lsub1_slot', $slot_creation_time_on_primary);
+
+# Synced slot on the standby must get its own inactive_since.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz < '$inactive_since_on_standby'::timestamptz AND
+ '$inactive_since_on_standby'::timestamptz < '$slot_sync_time'::timestamptz;"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
@@ -750,8 +779,28 @@ $primary->reload;
$standby1->start;
$primary->wait_for_replay_catchup($standby1);
+# Capture the time before the standby is promoted
+my $promotion_time_on_primary = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
$standby1->promote;
+# Capture the inactive_since of the synced slot after the promotion.
+# Expectation here is that the slot gets its own inactive_since as part of the
+# promotion. We do this check before the slot is enabled on the new primary
+# below, otherwise the slot becomes active, setting inactive_since to NULL.
+my $inactive_since_on_new_primary =
+ $standby1->get_slot_inactive_since_value('lsub1_slot', $promotion_time_on_primary);
+
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_new_primary'::timestamptz > '$inactive_since_on_primary'::timestamptz"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since on the new primary after promotion');
+
# Update subscription with the new primary's connection info
my $standby1_conninfo = $standby1->connstr . ' dbname=postgres';
$subscriber1->safe_psql('postgres',
--
2.34.1
v31-0002-Add-inactive_timeout-based-replication-slot-inva.patch (application/x-patch)
From 382a63bd4c48318ec53bdbe292b3ab1ecbaeb341 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 3 Apr 2024 05:32:14 +0000
Subject: [PATCH v31 2/2] Add inactive_timeout based replication slot
invalidation.
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage will vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set a timeout of say 1
or 2 or 3 days at the slot level, after which the inactive slots get
invalidated.
To achieve the above, postgres introduces a GUC allowing users to
set an inactive timeout; once a slot stays inactive for that amount
of time, the slot gets invalidated. The invalidation check happens
at various locations so that the invalidation takes effect as soon
as possible; these locations include the following:
- Whenever the slot is acquired and the slot acquisition errors
out if invalidated.
- During checkpoint
Note that this new invalidation mechanism won't kick in for the
slots that are currently being synced from the primary to the
standby.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/config.sgml | 33 +++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 203 ++++++++++++-
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 8 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/044_invalidate_slots.pl | 279 ++++++++++++++++++
13 files changed, 534 insertions(+), 24 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 624518e0b0..0a5ed00d8d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4547,6 +4547,39 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidates replication slots that are inactive for longer than the
+ specified amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is the default) disables
+ the timeout mechanism. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+
+ <para>
+ The timeout is measured from the time since the slot has become
+ inactive (known from its
+ <structfield>inactive_since</structfield> value) until it gets
+ used (i.e., its <structfield>active</structfield> is set to true).
+ </para>
+
+ <para>
+ Note that this inactive timeout invalidation mechanism is not
+ applicable for slots on the standby that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>) as such synced slots don't actually perform
+ logical decoding.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index b64274a1fb..3d33afa796 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2580,6 +2580,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for the duration specified by the
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 4050bd40f8..08de929970 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -317,7 +317,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -527,7 +527,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
* if the slot is not acquired by other processes.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 5549ca9640..e583d0ee82 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -108,10 +108,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -159,6 +161,7 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidatePossiblyInactiveSlot(ReplicationSlot *slot);
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
@@ -536,9 +539,14 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * If check_for_timeout_invalidation is true, the slot is checked for
+ * invalidation based on the replication_slot_inactive_timeout GUC, and if it
+ * has been invalidated, an error is raised after making the slot ours.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_timeout_invalidation)
{
ReplicationSlot *s;
int active_pid;
@@ -616,6 +624,34 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * Check if the given slot can be invalidated based on its inactive
+ * timeout. If yes, persist the invalidated state to disk and then error
+ * out. We do this only after making the slot ours to avoid anyone else
+ * acquiring it while we check for its invalidation.
+ */
+ if (check_for_timeout_invalidation)
+ {
+ /* The slot is ours by now */
+ Assert(s->active_pid == MyProcPid);
+
+ if (InvalidateInactiveReplicationSlot(s, true))
+ {
+ /*
+ * If the slot has been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(MyReplicationSlot->data.name)),
+ errdetail("This slot has been invalidated because it was inactive for more than the time specified by replication_slot_inactive_timeout parameter.")));
+ }
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -782,7 +818,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -805,7 +841,7 @@ ReplicationSlotAlter(const char *name, bool failover)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -981,6 +1017,20 @@ ReplicationSlotDropPtr(ReplicationSlot *slot)
LWLockRelease(ReplicationSlotAllocationLock);
}
+/*
+ * Helper for ReplicationSlotSave
+ */
+static inline void
+SaveGivenReplicationSlot(ReplicationSlot *slot, int elevel)
+{
+ char path[MAXPGPATH];
+
+ Assert(slot != NULL);
+
+ sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+ SaveSlotToPath(slot, path, elevel);
+}
+
/*
* Serialize the currently acquired slot's state from memory to disk, thereby
* guaranteeing the current state will survive a crash.
@@ -988,12 +1038,21 @@ ReplicationSlotDropPtr(ReplicationSlot *slot)
void
ReplicationSlotSave(void)
{
- char path[MAXPGPATH];
+ SaveGivenReplicationSlot(MyReplicationSlot, ERROR);
+}
- Assert(MyReplicationSlot != NULL);
+/*
+ * Helper for ReplicationSlotMarkDirty
+ */
+static inline void
+MarkGivenReplicationSlotDirty(ReplicationSlot *slot)
+{
+ Assert(slot != NULL);
- sprintf(path, "pg_replslot/%s", NameStr(MyReplicationSlot->data.name));
- SaveSlotToPath(MyReplicationSlot, path, ERROR);
+ SpinLockAcquire(&slot->mutex);
+ slot->just_dirtied = true;
+ slot->dirty = true;
+ SpinLockRelease(&slot->mutex);
}
/*
@@ -1006,14 +1065,7 @@ ReplicationSlotSave(void)
void
ReplicationSlotMarkDirty(void)
{
- ReplicationSlot *slot = MyReplicationSlot;
-
- Assert(MyReplicationSlot != NULL);
-
- SpinLockAcquire(&slot->mutex);
- MyReplicationSlot->just_dirtied = true;
- MyReplicationSlot->dirty = true;
- SpinLockRelease(&slot->mutex);
+ MarkGivenReplicationSlotDirty(MyReplicationSlot);
}
/*
@@ -1507,6 +1559,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by replication_slot_inactive_timeout parameter."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1551,6 +1606,13 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ /*
+ * This function isn't expected to be called for inactive timeout based
+ * invalidation. A separate function InvalidateInactiveReplicationSlot is
+ * to be used for that.
+ */
+ Assert(cause != RS_INVAL_INACTIVE_TIMEOUT);
+
for (;;)
{
XLogRecPtr restart_lsn;
@@ -1620,6 +1682,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ /* not reachable */
+ Assert(false);
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1773,6 +1839,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1788,6 +1855,13 @@ InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
Assert(cause != RS_INVAL_WAL_REMOVED || oldestSegno > 0);
Assert(cause != RS_INVAL_NONE);
+ /*
+ * This function isn't expected to be called for inactive timeout based
+ * invalidation. A separate function InvalidateInactiveReplicationSlot is
+ * to be used for that.
+ */
+ Assert(cause != RS_INVAL_INACTIVE_TIMEOUT);
+
if (max_replication_slots == 0)
return invalidated;
@@ -1824,6 +1898,88 @@ restart:
return invalidated;
}
+/*
+ * Invalidate given slot based on replication_slot_inactive_timeout GUC.
+ *
+ * Returns true if the slot has got invalidated.
+ *
+ * NB - this function also runs as part of checkpoint, so avoid raising errors
+ * if possible.
+ */
+bool
+InvalidateInactiveReplicationSlot(ReplicationSlot *slot, bool persist_state)
+{
+ if (!InvalidatePossiblyInactiveSlot(slot))
+ return false;
+
+ /* Make sure the invalidated state persists across server restart */
+ MarkGivenReplicationSlotDirty(slot);
+
+ if (persist_state)
+ SaveGivenReplicationSlot(slot, ERROR);
+
+ ReportSlotInvalidation(RS_INVAL_INACTIVE_TIMEOUT, false, 0,
+ slot->data.name, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, InvalidTransactionId);
+
+ return true;
+}
+
+/*
+ * Helper for InvalidateInactiveReplicationSlot
+ */
+static bool
+InvalidatePossiblyInactiveSlot(ReplicationSlot *slot)
+{
+ ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+
+ /*
+ * Note that we don't invalidate slot on the standby that's currently
+ * being synced from the primary, because such slots are typically
+ * considered not active as they don't actually perform logical decoding.
+ */
+ if (RecoveryInProgress() && slot->data.synced)
+ return false;
+
+ if (replication_slot_inactive_timeout == 0)
+ return false;
+ else if (slot->inactive_since > 0)
+ {
+ TimestampTz now;
+
+ /*
+ * Do not invalidate the slots which are currently being synced from
+ * the primary to the standby.
+ */
+ if (RecoveryInProgress() && slot->data.synced)
+ return false;
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+ SpinLockAcquire(&slot->mutex);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC. We do this with the spinlock
+ * held to avoid race conditions -- for example the inactive_since
+ * could change, or the slot could be dropped.
+ */
+ now = GetCurrentTimestamp();
+ if (TimestampDifferenceExceeds(slot->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
+ slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT;
+ }
+
+ SpinLockRelease(&slot->mutex);
+ LWLockRelease(ReplicationSlotControlLock);
+
+ return (invalidation_cause == RS_INVAL_INACTIVE_TIMEOUT);
+ }
+
+ return false;
+}
+
/*
* Flush all replication slots to disk.
*
@@ -1836,6 +1992,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated = false;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -1859,6 +2016,13 @@ CheckPointReplicationSlots(bool is_shutdown)
/* save the slot to disk, locking is handled in SaveSlotToPath() */
sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
+ /*
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
+ */
+ if (InvalidateInactiveReplicationSlot(s, false))
+ invalidated = true;
+
/*
* Slot's data is not flushed each time the confirmed_flush LSN is
* updated as that could lead to frequent writes. However, we decide
@@ -1885,6 +2049,13 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ /* If the slot has been invalidated, recalculate the resource limits */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index da57177c25..677c0bf0a2 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -651,7 +651,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bc40c454de..96eeb8b7d2 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -846,7 +846,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1459,7 +1459,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index c12784cbec..4149ff1ffe 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2971,6 +2971,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index baecde2841..2e1ad2eaca 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -335,6 +335,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7b937d1a0c..f0ac324ce9 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -230,6 +232,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -245,7 +248,8 @@ extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, bool failover);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_timeout_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(void);
extern void ReplicationSlotSave(void);
@@ -264,6 +268,8 @@ extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
+extern bool InvalidateInactiveReplicationSlot(ReplicationSlot *slot,
+ bool persist_state);
extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock);
extern int ReplicationSlotIndex(ReplicationSlot *slot);
extern bool ReplicationSlotName(int index, Name name);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 712924c2fa..0437ab5c46 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_wal_replay_wait.pl',
+ 't/044_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_slots.pl b/src/test/recovery/t/044_invalidate_slots.pl
new file mode 100644
index 0000000000..0f49d16ff7
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_slots.pl
@@ -0,0 +1,279 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot as well as logical
+# failover slot on the primary due to the inactive timeout GUC. Also, check that
+# the logical failover slot synced onto the standby doesn't get invalidated on
+# its own, but gets the invalidated state from the remote slot on the primary.
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoint during the test, otherwise, the test can get unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr_1 = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb1_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('lsub1_sync_slot', 'test_decoding', false, false, true);}
+);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot');
+]);
+
+$standby1->start;
+
+my $standby1_logstart = -s $standby1->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Synchronize the primary server slots to the standby.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced'.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot has synced as true on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$primary->reload;
+
+# Wait for the logical failover slot to become inactive on the primary. Note
+# that nobody has acquired that slot yet, so due to inactive timeout setting
+# above it must get invalidated.
+wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart);
+
+# Set timeout on the standby also to check the synced slots don't get
+# invalidated due to timeout on the standby.
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$standby1->reload;
+
+# Now, sync the logical failover slot from the remote slot on the primary.
+# Note that the remote slot has already been invalidated due to inactive
+# timeout. Now, the standby must also see it as invalidated.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Wait for the inactive replication slot to be invalidated.
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for replication slot lsub1_sync_slot invalidation to be synced on standby";
+
+# The synced slot mustn't get invalidated on the standby on its own; it must get
+# the invalidated state from the primary. So, we must not see the slot's
+# invalidation message in the standby's server log.
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"lsub1_sync_slot\"",
+ $standby1_logstart),
+ 'check that synced slot has not been invalidated on the standby');
+
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive and then invalidated
+wait_for_slot_invalidation($primary, 'sb1_slot', $logstart);
+
+# Testcase end: Invalidate streaming standby's slot as well as logical failover
+# slot on primary due to inactive timeout GUC. Also, check the logical failover
+# slot synced on to the standby doesn't invalidate the slot on its own, but
+# gets the invalidated state from the remote slot on the primary.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to inactive timeout
+# GUC.
+
+my $publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$subscriber->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart);
+
+# Testcase end: Invalidate logical subscriber's slot due to inactive timeout
+# GUC.
+# =============================================================================
+
+# =============================================================================
+# Start: Helper functions used for this test file
+
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $name = $node->name;
+
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for replication slot to become inactive";
+
+ # Wait for the replication slot info to be updated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for info of replication slot $slot_name to be updated on node $name";
+
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+
+ # Wait for the inactive replication slot to be invalidated.
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for inactive replication slot $slot_name to be invalidated on node $name";
+
+ # Check that the invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot_name', '0/1');
+ ]);
+
+ ok( $stderr =~
+ /can no longer get changes from replication slot "$slot_name"/,
+ "detected error upon trying to acquire invalidated slot $slot_name on node $name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot_name";
+}
+
+# Check for invalidation of slot in server log.
+sub check_for_slot_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"",
+ $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot_name invalidation has been logged");
+}
+
+# =============================================================================
+# End: Helper functions used for this test file
+
+done_testing();
--
2.34.1
Hi,
On Wed, Apr 03, 2024 at 11:17:41AM +0530, Bharath Rupireddy wrote:
On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote:
Or a simple solution is that the slotsync worker updates
inactive_since as it does for non-synced slots, and disables
timeout-based slot invalidation for synced slots.
I like this idea better, it takes care of such a case too when the
user is relying on sync-function rather than worker and does not want
to get the slots invalidated in between 2 sync function calls.
Please find the attached v31 patches implementing the above idea:
Thanks!
Some comments related to v31-0001:
=== testing the behavior
T1 ===
- synced slots get their own inactive_since just like any other slot
It behaves as described.
T2 ===
- synced slots inactive_since is set to current timestamp after the
standby gets promoted to help inactive_since interpret correctly just
like any other slot.
It behaves as described.
CR1 ===
+ <structfield>inactive_since</structfield> value will get updated
+ after every synchronization
indicates the last synchronization time? (I think that after every synchronization
could lead to confusion).
CR2 ===
+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart.
+ */
It looks to me that this comment is not at the right place because there is
nothing after the comment that indicates that we shutdown the "slot sync machinery".
Maybe a better place is before the function definition and mention that this is
currently called when we shutdown the "slot sync machinery"?
CR3 ===
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
s/avoid system calls overhead while holding the lock/avoid system calls while holding the spinlock/?
CR4 ===
+ * Set the time since the slot has become inactive. We get the current
+ * time beforehand to avoid system call overhead while holding the lock
Same.
CR5 ===
+ # Check that the captured time is sane
+ if (defined $reference_time)
+ {
s/Check that the captured time is sane/Check that the inactive_since is sane/?
Sorry if some of those comments could have been done while I did review v29-0001.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Wed, Apr 03, 2024 at 11:17:41AM +0530, Bharath Rupireddy wrote:
On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote:
Or a simple solution is that the slotsync worker updates
inactive_since as it does for non-synced slots, and disables
timeout-based slot invalidation for synced slots.
I like this idea better, it takes care of such a case too when the
user is relying on sync-function rather than worker and does not want
to get the slots invalidated in between 2 sync function calls.
Please find the attached v31 patches implementing the above idea:
Thanks!
Some comments regarding v31-0002:
=== testing the behavior
T1 ===
- synced slots don't get invalidated due to inactive timeout because
such slots not considered active at all as they don't perform logical
decoding (of course, they will perform in fast_forward mode to fix the
other data loss issue, but they don't generate changes for them to be
called as *active* slots)
It behaves as described. OTOH non synced logical slots on the standby and
physical slots on the standby are invalidated which is what is expected.
T2 ===
In case the slot is invalidated on the primary,
primary:
postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1';
slot_name | inactive_since | invalidation_reason
-----------+-------------------------------+---------------------
s1 | 2024-04-03 06:56:28.075637+00 | inactive_timeout
then on the standby we get:
standby:
postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1';
slot_name | inactive_since | invalidation_reason
-----------+------------------------------+---------------------
s1 | 2024-04-03 07:06:43.37486+00 | inactive_timeout
shouldn't the slot be dropped/recreated instead of updating inactive_since?
=== code
CR1 ===
+ Invalidates replication slots that are inactive for longer the
+ specified amount of time
s/for longer the/for longer than/?
CR2 ===
+ <literal>true</literal>) as such synced slots don't actually perform
+ logical decoding.
We're switching in fast forward logical due to [1], so I'm not sure that's 100%
accurate here. I'm not sure we need to specify a reason.
CR3 ===
+ errdetail("This slot has been invalidated because it was inactive for more than the time specified by replication_slot_inactive_timeout parameter.")));
I think we can remove "parameter" (see for example the error message in
validate_remote_info()) and reduce it a bit, something like?
"This slot has been invalidated because it was inactive for more than replication_slot_inactive_timeout"?
CR4 ===
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by replication_slot_inactive_timeout parameter."));
Same.
CR5 ===
+ /*
+ * This function isn't expected to be called for inactive timeout based
+ * invalidation. A separate function InvalidateInactiveReplicationSlot is
+ * to be used for that.
Do you think it's worth to explain why?
CR6 ===
+ if (replication_slot_inactive_timeout == 0)
+ return false;
+ else if (slot->inactive_since > 0)
"else" is not needed here.
CR7 ===
+ SpinLockAcquire(&slot->mutex);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC. We do this with the spinlock
+ * held to avoid race conditions -- for example the inactive_since
+ * could change, or the slot could be dropped.
+ */
+ now = GetCurrentTimestamp();
We should not call GetCurrentTimestamp() while holding a spinlock.
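For illustration, a minimal sketch of the ordering being suggested, i.e. take the timestamp before the per-slot spinlock and only copy the protected field while holding it (the helper name and the timeout handling below are assumptions, not the patch's code; the GUC is assumed to be in seconds):

	static bool
	slot_inactive_timeout_elapsed(ReplicationSlot *slot, int timeout_secs)
	{
		TimestampTz now;
		TimestampTz inactive_since;

		if (timeout_secs == 0)
			return false;

		/* Get the current time before acquiring the spinlock ... */
		now = GetCurrentTimestamp();

		/* ... and only read the protected field while holding it. */
		SpinLockAcquire(&slot->mutex);
		inactive_since = slot->inactive_since;
		SpinLockRelease(&slot->mutex);

		return inactive_since > 0 &&
			TimestampDifferenceExceeds(inactive_since, now, timeout_secs * 1000);
	}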
CR8 ===
+# Testcase start: Invalidate streaming standby's slot as well as logical
+# failover slot on primary due to inactive timeout GUC. Also, check the logical
s/inactive timeout GUC/replication_slot_inactive_timeout/?
CR9 ===
+# Start: Helper functions used for this test file
+# End: Helper functions used for this test file
I think that's the first TAP test with this comment. Not saying we should not but
why did you feel the need to add those?
[1]: /messages/by-id/OS0PR01MB5716B3942AE49F3F725ACA92943B2@OS0PR01MB5716.jpnprd01.prod.outlook.com
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Apr 3, 2024 at 11:17 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote:
Or a simple solution is that the slotsync worker updates
inactive_since as it does for non-synced slots, and disables
timeout-based slot invalidation for synced slots.I like this idea better, it takes care of such a case too when the
user is relying on sync-function rather than worker and does not want
to get the slots invalidated in between 2 sync function calls.Please find the attached v31 patches implementing the above idea:
Thanks for the patches, please find few comments:
v31-001:
1)
system-views.sgml:
value will get updated after every synchronization from the
corresponding remote slot on the primary.
--This is confusing. It will be good to rephrase it.
2)
update_synced_slots_inactive_since()
--May be, we should mention in the header that this function is called
only during promotion.
3) 040_standby_failover_slots_sync.pl:
We capture inactive_since_on_primary when we do this for the first time at #175
ALTER SUBSCRIPTION regress_mysub1 DISABLE"
But we again recreate the sub and disable it at line #280.
Do you think we shall get inactive_since_on_primary again here, to be
compared with inactive_since_on_new_primary later?
v31-002:
(I had reviewed v29-002 but missed to post comments, I think these
are still applicable)
1) I think replication_slot_inactivity_timeout was recommended here
(instead of replication_slot_inactive_timeout, so please give it a
thought):
/messages/by-id/202403260739.udlp7lxixktx@alvherre.pgsql
2) Commit msg:
a)
"It is often easy for developers to set a timeout of say 1
or 2 or 3 days at slot level, after which the inactive slots get
dropped."
Shall we say invalidated rather than dropped?
b)
"To achieve the above, postgres introduces a GUC allowing users
set inactive timeout and then a slot stays inactive for this much
amount of time it invalidates the slot."
Broken sentence.
<have not reviewed 002 patch in detail yet>
thanks
Shveta
On Wed, Apr 3, 2024 at 12:20 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Wed, Apr 03, 2024 at 11:17:41AM +0530, Bharath Rupireddy wrote:
On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote:
Or a simple solution is that the slotsync worker updates
inactive_since as it does for non-synced slots, and disables
timeout-based slot invalidation for synced slots.
I like this idea better, it takes care of such a case too when the
user is relying on sync-function rather than worker and does not want
to get the slots invalidated in between 2 sync function calls.
Please find the attached v31 patches implementing the above idea:
Thanks!
Some comments related to v31-0001:
=== testing the behavior
T1 ===
- synced slots get their own inactive_since just like any other slot
It behaves as described.
T2 ===
- synced slots inactive_since is set to current timestamp after the
standby gets promoted to help inactive_since interpret correctly just
like any other slot.
It behaves as described.
CR1 ===
+ <structfield>inactive_since</structfield> value will get updated
+ after every synchronization
indicates the last synchronization time? (I think that after every synchronization
could lead to confusion).
+1.
CR2 ===
+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart.
+ */
It looks to me that this comment is not at the right place because there is
nothing after the comment that indicates that we shutdown the "slot sync machinery".
Maybe a better place is before the function definition and mention that this is
currently called when we shutdown the "slot sync machinery"?
Won't it be better to have an assert for SlotSyncCtx->pid? IIRC, we
have some existing issues where we don't ensure that no one is running
sync API before shutdown is complete but I think we can deal with that
separately and here we can still have an Assert.
CR3 ===
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
s/avoid system calls overhead while holding the lock/avoid system calls while holding the spinlock/?
Is it valid to say that there is overhead of this call while holding
spinlock? Because I don't think at the time of promotion we expect any
other concurrent slot activity. The first reason seems good enough.
One other observation:
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -42,6 +42,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlogrecovery.h"
+#include "access/xlogutils.h"
Is there a reason for this inclusion? I don't see any change which
should need this one.
--
With Regards,
Amit Kapila.
On Wed, Apr 3, 2024 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote:
On Wed, Apr 3, 2024 at 11:17 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Apr 3, 2024 at 8:38 AM shveta malik <shveta.malik@gmail.com> wrote:
Or a simple solution is that the slotsync worker updates
inactive_since as it does for non-synced slots, and disables
timeout-based slot invalidation for synced slots.
I like this idea better, it takes care of such a case too when the
user is relying on sync-function rather than worker and does not want
to get the slots invalidated in between 2 sync function calls.
Please find the attached v31 patches implementing the above idea:
Thanks for the patches, please find few comments:
v31-001:
1)
system-views.sgml:
value will get updated after every synchronization from the
corresponding remote slot on the primary.
--This is confusing. It will be good to rephrase it.
2)
update_synced_slots_inactive_since()
--May be, we should mention in the header that this function is called
only during promotion.3) 040_standby_failover_slots_sync.pl:
We capture inactive_since_on_primary when we do this for the first time at #175
ALTER SUBSCRIPTION regress_mysub1 DISABLE"
But we again recreate the sub and disable it at line #280.
Do you think we shall get inactive_since_on_primary again here, to be
compared with inactive_since_on_new_primary later?
I think so.
Few additional comments on tests:
1.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz <
'$inactive_since_on_standby'::timestamptz AND
+ '$inactive_since_on_standby'::timestamptz < '$slot_sync_time'::timestamptz;"
Shall we do <= check as we are doing in the main function
get_slot_inactive_since_value as the time duration is less so it can
be the same as well? Similarly, please check other tests.
2.
+=item $node->get_slot_inactive_since_value(self, slot_name, reference_time)
+
+Get inactive_since column value for a given replication slot validating it
+against optional reference time.
+
+=cut
+
+sub get_slot_inactive_since_value
I see that all callers validate against reference time. It is better
to name it validate_slot_inactive_since rather than using get_* as the
main purpose is to validate the passed value.
--
With Regards,
Amit Kapila.
On Wed, Apr 3, 2024 at 12:20 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Please find the attached v31 patches implementing the above idea:
Some comments related to v31-0001:
=== testing the behavior
T1 ===
- synced slots get their own inactive_since just like any other slot
It behaves as described.
T2 ===
- synced slots inactive_since is set to current timestamp after the
standby gets promoted to help inactive_since interpret correctly just
like any other slot.
It behaves as described.
Thanks for testing.
CR1 ===
+ <structfield>inactive_since</structfield> value will get updated
+ after every synchronization
indicates the last synchronization time? (I think that after every synchronization
could lead to confusion).
Done.
CR2 ===
+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart.
+ */
It looks to me that this comment is not at the right place because there is
nothing after the comment that indicates that we shutdown the "slot sync machinery".
Maybe a better place is before the function definition and mention that this is
currently called when we shutdown the "slot sync machinery"?
Done.
CR3 ===
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
s/avoid system calls overhead while holding the lock/avoid system calls while holding the spinlock/?
Done.
CR4 ===
+ * Set the time since the slot has become inactive. We get the current
+ * time beforehand to avoid system call overhead while holding the lock
Same.
Done.
CR5 ===
+ # Check that the captured time is sane
+ if (defined $reference_time)
+ {
s/Check that the captured time is sane/Check that the inactive_since is sane/?
Sorry if some of those comments could have been done while I did review v29-0001.
Done.
On Wed, Apr 3, 2024 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote:
Thanks for the patches, please find few comments:
v31-001:
1)
system-views.sgml:
value will get updated after every synchronization from the
corresponding remote slot on the primary.
--This is confusing. It will be good to rephrase it.
Done as per Bertrand's suggestion.
2)
update_synced_slots_inactive_since()
--May be, we should mention in the header that this function is called
only during promotion.
Done as per Bertrand's suggestion.
3) 040_standby_failover_slots_sync.pl:
We capture inactive_since_on_primary when we do this for the first time at #175
ALTER SUBSCRIPTION regress_mysub1 DISABLE"
But we again recreate the sub and disable it at line #280.
Do you think we shall get inactive_since_on_primary again here, to be
Do you think we shall get inactive_since_on_primary again here, to be
compared with inactive_since_on_new_primary later?
Hm. Done that. I'm now recapturing both slot_creation_time_on_primary and
inactive_since_on_primary before and after the CREATE SUBSCRIPTION that creates
the slot again on the primary/publisher.
On Wed, Apr 3, 2024 at 3:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
CR2 ===
+ /*
+ * Set the time since the slot has become inactive after shutting
+ * down slot sync machinery. This helps correctly interpret the
+ * time if the standby gets promoted without a restart.
+ */
It looks to me that this comment is not at the right place because there is
nothing after the comment that indicates that we shutdown the "slot sync machinery".
Maybe a better place is before the function definition and mention that this is
currently called when we shutdown the "slot sync machinery"?
Won't it be better to have an assert for SlotSyncCtx->pid? IIRC, we
have some existing issues where we don't ensure that no one is running
sync API before shutdown is complete but I think we can deal with that
separately and here we can still have an Assert.
That can work to ensure the slot sync worker isn't running as
SlotSyncCtx->pid gets updated only for the slot sync worker. I added
this assertion for now.
We need to ensure (in a separate patch and thread) there is no backend
acquiring it and performing sync while the slot sync worker is
shutting down. Otherwise, some of the slots can get resynced and some
are not while we are shutting down the slot sync worker as part of the
standby promotion which might leave the slots in an inconsistent
state.
CR3 ===
+ * We get the current time beforehand and only once to avoid
+ * system calls overhead while holding the lock.
s/avoid system calls overhead while holding the lock/avoid system calls while holding the spinlock/?
Is it valid to say that there is overhead of this call while holding
Is it valid to say that there is overhead of this call while holding
spinlock? Because I don't think at the time of promotion we expect any
other concurrent slot activity. The first reason seems good enough.
No slot activity, but why does GetCurrentTimestamp need to be called every
time in the loop?
One other observation:
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -42,6 +42,7 @@
#include "access/transam.h"
#include "access/xlog_internal.h"
#include "access/xlogrecovery.h"
+#include "access/xlogutils.h"
Is there a reason for this inclusion? I don't see any change which
should need this one.
Not anymore. It was earlier needed for using the InRecovery flag in
the previous approach.
On Wed, Apr 3, 2024 at 4:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
3) 040_standby_failover_slots_sync.pl:
We capture inactive_since_on_primary when we do this for the first time at #175
ALTER SUBSCRIPTION regress_mysub1 DISABLE"
But we again recreate the sub and disable it at line #280.
Do you think we shall get inactive_since_on_primary again here, to be
compared with inactive_since_on_new_primary later?
I think so.
Modified this to recapture the times before and after the slot gets recreated.
Few additional comments on tests:
1.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz < '$inactive_since_on_standby'::timestamptz AND
+ '$inactive_since_on_standby'::timestamptz < '$slot_sync_time'::timestamptz;"
Shall we do <= check as we are doing in the main function
get_slot_inactive_since_value as the time duration is less so it can
be the same as well? Similarly, please check other tests.
I get you. The tests may be so fast that losing a bit of precision
might cause them to fail. So, I've added an equality check for all the
tests.
2.
+=item $node->get_slot_inactive_since_value(self, slot_name, reference_time)
+
+Get inactive_since column value for a given replication slot validating it
+against optional reference time.
+
+=cut
+
+sub get_slot_inactive_since_value
I see that all callers validate against reference time. It is better
to name it validate_slot_inactive_since rather than using get_* as the
main purpose is to validate the passed value.
Existing callers yes. Also, I've removed the reference time as an
optional parameter.
Per an offlist chat with Amit, I've added the following note in
synchronize_one_slot:
@@ -584,6 +585,11 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid
remote_dbid)
* overwriting 'invalidated' flag to remote_slot's value. See
* InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
* if the slot is not acquired by other processes.
+ *
+ * XXX: If it ever turns out that slot acquire/release is costly for
+ * cases when none of the slot property is changed then we can do a
+ * pre-check to ensure that at least one of the slot property is
+ * changed before acquiring the slot.
*/
ReplicationSlotAcquire(remote_slot->name, true);
Please find the attached v32-0001 patch with the above review comments
addressed. I'm working on review comments for 0002.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v32-0001-Allow-synced-slots-to-have-their-own-inactive_si.patch
From 63230aab91d5447a384a5c9d1723675f3b0ac4de Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 3 Apr 2024 11:40:11 +0000
Subject: [PATCH v32] Allow synced slots to have their own inactive_since.
The slot's inactive_since isn't currently maintained for
synced slots on the standby. The commit a11f330b55 prevents
updating inactive_since with RecoveryInProgress() check in
RestoreSlotFromDisk(). But, the issue is that
RecoveryInProgress() always returns true in
RestoreSlotFromDisk() as 'xlogctl->SharedRecoveryState' is
always 'RECOVERY_STATE_CRASH' at that time. Because of this,
inactive_since is always NULL on a promoted standby for all
synced slots even after server restart.
Above issue led us to a question as to why we can't just let
standby maintain its own inactive_since for synced slots. This is
consistent with any regular slots and also indicates the last
synchronization time of the slot. This approach simplifies things
when compared to just copying inactive_since received from the
remote slot on the primary (for instance, there can exist clock
drift between primary and standby so just copying inactive_since
from the primary slot to the standby sync slot may not represent
the correct value).
This commit does two things:
1) Maintains inactive_since for sync slots whenever the slot is
released just like any other regular slot.
2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACWLctoiH-pSjWnEpR54q4DED6rw_BRJm5pCx86_Y01MoQ%40mail.gmail.com
---
doc/src/sgml/system-views.sgml | 7 +++
src/backend/replication/logical/slotsync.c | 51 +++++++++++++++
src/backend/replication/slot.c | 22 +++----
src/test/perl/PostgreSQL/Test/Cluster.pm | 31 ++++++++++
src/test/recovery/t/019_replslot_limit.pl | 30 ++-------
.../t/040_standby_failover_slots_sync.pl | 62 +++++++++++++++++++
6 files changed, 162 insertions(+), 41 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..7ed617170f 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2530,6 +2530,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
The time since the slot has become inactive.
<literal>NULL</literal> if the slot is currently being used.
+ Note that for slots on the standby that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>), the
+ <structfield>inactive_since</structfield> indicates the last
+ synchronization (see
+ <xref linkend="logicaldecoding-replication-slots-synchronization"/>)
+ time.
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 9ac847b780..755bf40a9a 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -150,6 +150,7 @@ typedef struct RemoteSlot
} RemoteSlot;
static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_since(void);
/*
* If necessary, update the local synced slot's metadata based on the data
@@ -584,6 +585,11 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* overwriting 'invalidated' flag to remote_slot's value. See
* InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
* if the slot is not acquired by other processes.
+ *
+ * XXX: If it ever turns out that slot acquire/release is costly for
+ * cases when none of the slot property is changed then we can do a
+ * pre-check to ensure that at least one of the slot property is
+ * changed before acquiring the slot.
*/
ReplicationSlotAcquire(remote_slot->name, true);
@@ -1355,6 +1361,48 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
Assert(false);
}
+/*
+ * Update the inactive_since property for synced slots.
+ *
+ * Note that this function is currently called when we shutdown the slot sync
+ * machinery. This helps correctly interpret the inactive_since if the standby
+ * gets promoted without a restart.
+ */
+static void
+update_synced_slots_inactive_since(void)
+{
+ TimestampTz now = 0;
+
+ /* The slot sync worker mustn't be running by now */
+ Assert(SlotSyncCtx->pid == InvalidPid);
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is a synchronized slot */
+ if (s->in_use && s->data.synced)
+ {
+ Assert(SlotIsLogical(s));
+
+ /*
+ * We get the current time beforehand and only once to avoid
+ * system calls while holding the spinlock.
+ */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+
+ SpinLockAcquire(&s->mutex);
+ s->inactive_since = now;
+ SpinLockRelease(&s->mutex);
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
+}
+
/*
* Shut down the slot sync worker.
*/
@@ -1368,6 +1416,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1400,6 +1449,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d778c0b921..3bddaae022 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -690,13 +690,10 @@ ReplicationSlotRelease(void)
}
/*
- * Set the last inactive time after marking the slot inactive. We don't
- * set it for the slots currently being synced from the primary to the
- * standby because such slots are typically inactive as decoding is not
- * allowed on those.
+ * Set the time since the slot has become inactive. We get the current
+ * time beforehand to avoid system call while holding the spinlock.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
- now = GetCurrentTimestamp();
+ now = GetCurrentTimestamp();
if (slot->data.persistency == RS_PERSISTENT)
{
@@ -2369,16 +2366,11 @@ RestoreSlotFromDisk(const char *name)
slot->active_pid = 0;
/*
- * We set the last inactive time after loading the slot from the disk
- * into memory. Whoever acquires the slot i.e. makes the slot active
- * will reset it. We don't set it for the slots currently being synced
- * from the primary to the standby because such slots are typically
- * inactive as decoding is not allowed on those.
+ * Set the time since the slot has become inactive after loading the
+ * slot from the disk into memory. Whoever acquires the slot i.e.
+ * makes the slot active will reset it.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
- slot->inactive_since = GetCurrentTimestamp();
- else
- slot->inactive_since = 0;
+ slot->inactive_since = GetCurrentTimestamp();
restored = true;
break;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b08296605c..d68db6a6f9 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3276,6 +3276,37 @@ sub create_logical_slot_on_standby
=pod
+=item $node->validate_slot_inactive_since(self, slot_name, reference_time)
+
+Validate inactive_since value of a given replication slot against the reference
+time and return it.
+
+=cut
+
+sub validate_slot_inactive_since
+{
+ my ($self, $slot_name, $reference_time) = @_;
+ my $name = $self->name;
+
+ my $inactive_since = $self->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
+ );
+
+ # Check that the inactive_since is sane
+ is($self->safe_psql('postgres',
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz >= '$reference_time'::timestamptz;]
+ ),
+ 't',
+ "last inactive time for slot $slot_name is valid on node $name")
+ or die "could not validate captured inactive_since for slot $slot_name";
+
+ return $inactive_since;
+}
+
+=pod
+
=item $node->advance_wal(num)
Advance WAL of node by given number of segments.
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 3b9a306a8b..712141a33b 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -443,7 +443,7 @@ $primary4->safe_psql(
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the standby below.
my $inactive_since =
- capture_and_validate_slot_inactive_since($primary4, $sb4_slot, $slot_creation_time);
+ $primary4->validate_slot_inactive_since($sb4_slot, $slot_creation_time);
$standby4->start;
@@ -467,7 +467,7 @@ $primary4->restart;
is( $primary4->safe_psql(
'postgres',
- qq[SELECT inactive_since > '$inactive_since'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND inactive_since IS NOT NULL;]
+ qq[SELECT inactive_since >= '$inactive_since'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND inactive_since IS NOT NULL;]
),
't',
'last inactive time for an inactive physical slot is updated correctly');
@@ -502,7 +502,7 @@ $publisher4->safe_psql('postgres',
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the subscriber below.
$inactive_since =
- capture_and_validate_slot_inactive_since($publisher4, $lsub4_slot, $slot_creation_time);
+ $publisher4->validate_slot_inactive_since($lsub4_slot, $slot_creation_time);
$subscriber4->start;
$subscriber4->safe_psql('postgres',
@@ -529,7 +529,7 @@ $publisher4->restart;
is( $publisher4->safe_psql(
'postgres',
- qq[SELECT inactive_since > '$inactive_since'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND inactive_since IS NOT NULL;]
+ qq[SELECT inactive_since >= '$inactive_since'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND inactive_since IS NOT NULL;]
),
't',
'last inactive time for an inactive logical slot is updated correctly');
@@ -540,26 +540,4 @@ is( $publisher4->safe_psql(
$publisher4->stop;
$subscriber4->stop;
-# Capture and validate inactive_since of a given slot.
-sub capture_and_validate_slot_inactive_since
-{
- my ($node, $slot_name, $slot_creation_time) = @_;
-
- my $inactive_since = $node->safe_psql('postgres',
- qq(SELECT inactive_since FROM pg_replication_slots
- WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
- );
-
- # Check that the captured time is sane
- is( $node->safe_psql(
- 'postgres',
- qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
- '$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
- ),
- 't',
- "last inactive time for an active slot $slot_name is sane");
-
- return $inactive_since;
-}
-
done_testing();
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 869e3d2e91..a8be8ac7fc 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -35,6 +35,13 @@ my $subscriber1 = PostgreSQL::Test::Cluster->new('subscriber1');
$subscriber1->init;
$subscriber1->start;
+# Capture the time before the logical failover slot is created on the
+# primary. We later call this publisher as primary anyway.
+my $slot_creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Create a slot on the publisher with failover disabled
$publisher->safe_psql('postgres',
"SELECT 'init' FROM pg_create_logical_replication_slot('lsub1_slot', 'pgoutput', false, false, false);"
@@ -174,6 +181,11 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary. Note that the slot
+# is not yet active.
+my $inactive_since_on_primary =
+ $primary->validate_slot_inactive_since('lsub1_slot', $slot_creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -181,6 +193,11 @@ $primary->wait_for_replay_catchup($standby1);
# Synchronize the primary server slots to the standby.
$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+my $slot_sync_time = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Confirm that the logical failover slots are created on the standby and are
# flagged as 'synced'
is( $standby1->safe_psql(
@@ -190,6 +207,19 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Capture the inactive_since of the synced slot on the standby
+my $inactive_since_on_standby =
+ $standby1->validate_slot_inactive_since('lsub1_slot', $slot_creation_time_on_primary);
+
+# Synced slot on the standby must get its own inactive_since.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND
+ '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
@@ -237,6 +267,13 @@ is( $standby1->safe_psql(
$standby1->append_conf('postgresql.conf', 'max_slot_wal_keep_size = -1');
$standby1->reload;
+# Capture the time before the logical failover slot is created on the primary.
+# Note that the subscription creates the slot again on the primary.
+$slot_creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# To ensure that restart_lsn has moved to a recent WAL position, we re-create
# the subscription and the logical slot.
$subscriber1->safe_psql(
@@ -257,6 +294,11 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary. Note that the slot
+# is not yet active but has been dropped and recreated.
+$inactive_since_on_primary =
+ $primary->validate_slot_inactive_since('lsub1_slot', $slot_creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -808,8 +850,28 @@ $primary->reload;
$standby1->start;
$primary->wait_for_replay_catchup($standby1);
+# Capture the time before the standby is promoted
+my $promotion_time_on_primary = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
$standby1->promote;
+# Capture the inactive_since of the synced slot after the promotion.
+# Expectation here is that the slot gets its own inactive_since as part of the
+# promotion. We do this check before the slot is enabled on the new primary
+# below, otherwise the slot gets active setting inactive_since to NULL.
+my $inactive_since_on_new_primary =
+ $standby1->validate_slot_inactive_since('lsub1_slot', $promotion_time_on_primary);
+
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_new_primary'::timestamptz >= '$inactive_since_on_primary'::timestamptz"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since on the new primary after promotion');
+
# Update subscription with the new primary's connection info
my $standby1_conninfo = $standby1->connstr . ' dbname=postgres';
$subscriber1->safe_psql('postgres',
--
2.34.1
Hi,
On Wed, Apr 03, 2024 at 05:12:12PM +0530, Bharath Rupireddy wrote:
On Wed, Apr 3, 2024 at 4:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz < '$inactive_since_on_standby'::timestamptz AND
+ '$inactive_since_on_standby'::timestamptz < '$slot_sync_time'::timestamptz;"
Shall we do <= check as we are doing in the main function
get_slot_inactive_since_value as the time duration is less so it can
be the same as well? Similarly, please check other tests.
I get you. The tests may be so fast that losing a bit of precision
might cause them to fail. So, I've added an equality check for all the
tests.
Please find the attached v32-0001 patch with the above review comments
addressed.
Thanks!
Just one comment on v32-0001:
+# Synced slot on the standby must get its own inactive_since.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND
+ '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since');
+
By using <= we are not testing that it must get its own inactive_since (as we
allow them to be equal in the test). I think we should just add some usleep()
where appropriate and deny equality during the tests on inactive_since.
Except for the above, v32-0001 LGTM.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Apr 3, 2024 at 6:46 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Just one comment on v32-0001:
+# Synced slot on the standby must get its own inactive_since.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND
+ '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since');
+
By using <= we are not testing that it must get its own inactive_since (as we
allow them to be equal in the test). I think we should just add some usleep()
where appropriate and deny equality during the tests on inactive_since.
Thanks. It looks like we can ignore the equality in all of the
inactive_since comparisons. IIUC, all the TAP tests do run with
primary and standbys on the single BF animals. And, it looks like
assigning the inactive_since timestamps to perl variables is giving
the microseconds precision level
(./tmp_check/log/regress_log_040_standby_failover_slots_sync:inactive_since
2024-04-03 14:30:09.691648+00). FWIW, we already have some TAP and SQL
tests relying on stats_reset timestamps without equality. So, I've
left the equality out of the inactive_since tests.
Except for the above, v32-0001 LGTM.
Thanks. Please see the attached v33-0001 patch after removing equality
on inactive_since TAP tests.
On Wed, Apr 3, 2024 at 1:47 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Some comments regarding v31-0002:
=== testing the behavior
T1 ===
- synced slots don't get invalidated due to inactive timeout because
such slots not considered active at all as they don't perform logical
decoding (of course, they will perform in fast_forward mode to fix the
other data loss issue, but they don't generate changes for them to be
called as *active* slots)
It behaves as described. OTOH non synced logical slots on the standby and
physical slots on the standby are invalidated which is what is expected.
Right.
T2 ===
In case the slot is invalidated on the primary,
primary:
postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1';
slot_name | inactive_since | invalidation_reason
-----------+-------------------------------+---------------------
s1 | 2024-04-03 06:56:28.075637+00 | inactive_timeout
then on the standby we get:
standby:
postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1';
slot_name | inactive_since | invalidation_reason
-----------+------------------------------+---------------------
s1 | 2024-04-03 07:06:43.37486+00 | inactive_timeout
shouldn't the slot be dropped/recreated instead of updating inactive_since?
The sync slots that are invalidated on the primary aren't dropped and
recreated on the standby. There's no point in doing so because
invalidated slots on the primary can't be made useful. However, I
found that the synced slot is acquired and released unnecessarily
after the invalidation_reason is synced from the primary. I added a
skip check in synchronize_one_slot to skip acquiring and releasing the
slot if it's locally found inactive. With this, inactive_since won't
get updated for invalidated sync slots on the standby as we don't
acquire and release the slot.
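For reference, the kind of pre-check described above could look roughly like the following sketch (the lookup and field names are assumptions based on the snippets quoted in this thread, not the exact patch code):

	ReplicationSlot *slot;

	/* Look up the local copy of the slot, if it exists. */
	slot = SearchNamedReplicationSlot(remote_slot->name, true);

	/*
	 * If the local slot is already invalidated, skip acquiring and
	 * releasing it so that its inactive_since is left untouched.
	 */
	if (slot && slot->data.invalidated != RS_INVAL_NONE)
		return false;

	ReplicationSlotAcquire(remote_slot->name, true);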
=== code
CR1 ===
+ Invalidates replication slots that are inactive for longer the
+ specified amount of time
s/for longer the/for longer than/?
Fixed.
CR2 ===
+ <literal>true</literal>) as such synced slots don't actually perform
+ logical decoding.
We're switching in fast forward logical due to [1], so I'm not sure that's 100%
accurate here. I'm not sure we need to specify a reason.
Fixed.
CR3 ===
+ errdetail("This slot has been invalidated because it was inactive for more than the time specified by replication_slot_inactive_timeout parameter.")));
I think we can remove "parameter" (see for example the error message in
validate_remote_info()) and reduce it a bit, something like?
"This slot has been invalidated because it was inactive for more than replication_slot_inactive_timeout"?
Done.
CR4 ===
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than the time specified by replication_slot_inactive_timeout parameter."));
Same.
Done. Changed it to "The slot has been inactive for more than
replication_slot_inactive_timeout."
CR5 ===
+ /*
+ * This function isn't expected to be called for inactive timeout based
+ * invalidation. A separate function InvalidateInactiveReplicationSlot is
+ * to be used for that.
Do you think it's worth to explain why?
Hm, I just wanted to point out the actual function here. I modified it
to something like the following, if others feel we don't need that, I
can remove it.
/*
* Use InvalidateInactiveReplicationSlot for inactive timeout based
* invalidation.
*/
CR6 ===
+ if (replication_slot_inactive_timeout == 0)
+ return false;
+ else if (slot->inactive_since > 0)
"else" is not needed here.
Nothing wrong there, but removed.
CR7 ===
+ SpinLockAcquire(&slot->mutex);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC. We do this with the spinlock
+ * held to avoid race conditions -- for example the inactive_since
+ * could change, or the slot could be dropped.
+ */
+ now = GetCurrentTimestamp();
We should not call GetCurrentTimestamp() while holding a spinlock.
I didn't want the captured timestamp to also include the wait time for
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED). I've now moved the
GetCurrentTimestamp() call up, before the spinlock but after the LWLockAcquire.
CR8 ===
+# Testcase start: Invalidate streaming standby's slot as well as logical
+# failover slot on primary due to inactive timeout GUC. Also, check the logical
s/inactive timeout GUC/replication_slot_inactive_timeout/?
Done.
CR9 ===
+# Start: Helper functions used for this test file
+# End: Helper functions used for this test file
I think that's the first TAP test with this comment. Not saying we should not but
why did you feel the need to add those?
Hm. Removed.
[1]: /messages/by-id/OS0PR01MB5716B3942AE49F3F725ACA92943B2@OS0PR01MB5716.jpnprd01.prod.outlook.com
On Wed, Apr 3, 2024 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote:
v31-002:
(I had reviewed v29-002 but missed to post comments, I think these
are still applicable)1) I think replication_slot_inactivity_timeout was recommended here
(instead of replication_slot_inactive_timeout, so please give it a
thought):
/messages/by-id/202403260739.udlp7lxixktx@alvherre.pgsql
Yeah. It's synonymous with inactive_since. If others have an opinion
to have replication_slot_inactivity_timeout, I'm fine with it.
2) Commit msg:
a)
"It is often easy for developers to set a timeout of say 1
or 2 or 3 days at slot level, after which the inactive slots get
dropped."Shall we say invalidated rather than dropped?
Right. Done that.
b)
"To achieve the above, postgres introduces a GUC allowing users
set inactive timeout and then a slot stays inactive for this much
amount of time it invalidates the slot."
Broken sentence.
Reworded it a bit.
Please find the attached v33 patches.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v33-0001-Allow-synced-slots-to-have-their-own-inactive_si.patch
From 03a9e02df871dc86846b79a62a2aaf00e7152f14 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 3 Apr 2024 14:38:38 +0000
Subject: [PATCH v33 1/2] Allow synced slots to have their own inactive_since.
The slot's inactive_since isn't currently maintained for
synced slots on the standby. The commit a11f330b55 prevents
updating inactive_since with RecoveryInProgress() check in
RestoreSlotFromDisk(). But, the issue is that
RecoveryInProgress() always returns true in
RestoreSlotFromDisk() as 'xlogctl->SharedRecoveryState' is
always 'RECOVERY_STATE_CRASH' at that time. Because of this,
inactive_since is always NULL on a promoted standby for all
synced slots even after server restart.
Above issue led us to a question as to why we can't just let
standby maintain its own inactive_since for synced slots. This is
consistent with any regular slots and also indicates the last
synchronization time of the slot. This approach simplifies things
when compared to just copying inactive_since received from the
remote slot on the primary (for instance, there can exist clock
drift between primary and standby so just copying inactive_since
from the primary slot to the standby sync slot may not represent
the correct value).
This commit does two things:
1) Maintains inactive_since for sync slots whenever the slot is
released just like any other regular slot.
2) Ensures the value is set to current timestamp during the
shutdown of slot sync machinery to help correctly interpret the
time if the standby gets promoted without a restart.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACWLctoiH-pSjWnEpR54q4DED6rw_BRJm5pCx86_Y01MoQ%40mail.gmail.com
---
doc/src/sgml/system-views.sgml | 7 +++
src/backend/replication/logical/slotsync.c | 51 +++++++++++++++
src/backend/replication/slot.c | 22 +++----
src/test/perl/PostgreSQL/Test/Cluster.pm | 31 ++++++++++
src/test/recovery/t/019_replslot_limit.pl | 26 +-------
.../t/040_standby_failover_slots_sync.pl | 62 +++++++++++++++++++
6 files changed, 160 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..7ed617170f 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2530,6 +2530,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
The time since the slot has become inactive.
<literal>NULL</literal> if the slot is currently being used.
+ Note that for slots on the standby that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>), the
+ <structfield>inactive_since</structfield> indicates the last
+ synchronization (see
+ <xref linkend="logicaldecoding-replication-slots-synchronization"/>)
+ time.
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 9ac847b780..755bf40a9a 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -150,6 +150,7 @@ typedef struct RemoteSlot
} RemoteSlot;
static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_since(void);
/*
* If necessary, update the local synced slot's metadata based on the data
@@ -584,6 +585,11 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* overwriting 'invalidated' flag to remote_slot's value. See
* InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
* if the slot is not acquired by other processes.
+ *
+ * XXX: If it ever turns out that slot acquire/release is costly for
+ * cases when none of the slot property is changed then we can do a
+ * pre-check to ensure that at least one of the slot property is
+ * changed before acquiring the slot.
*/
ReplicationSlotAcquire(remote_slot->name, true);
@@ -1355,6 +1361,48 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
Assert(false);
}
+/*
+ * Update the inactive_since property for synced slots.
+ *
+ * Note that this function is currently called when we shutdown the slot sync
+ * machinery. This helps correctly interpret the inactive_since if the standby
+ * gets promoted without a restart.
+ */
+static void
+update_synced_slots_inactive_since(void)
+{
+ TimestampTz now = 0;
+
+ /* The slot sync worker mustn't be running by now */
+ Assert(SlotSyncCtx->pid == InvalidPid);
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is a synchronized slot */
+ if (s->in_use && s->data.synced)
+ {
+ Assert(SlotIsLogical(s));
+
+ /*
+ * We get the current time beforehand and only once to avoid
+ * system calls while holding the spinlock.
+ */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+
+ SpinLockAcquire(&s->mutex);
+ s->inactive_since = now;
+ SpinLockRelease(&s->mutex);
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
+}
+
/*
* Shut down the slot sync worker.
*/
@@ -1368,6 +1416,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1400,6 +1449,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d778c0b921..3bddaae022 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -690,13 +690,10 @@ ReplicationSlotRelease(void)
}
/*
- * Set the last inactive time after marking the slot inactive. We don't
- * set it for the slots currently being synced from the primary to the
- * standby because such slots are typically inactive as decoding is not
- * allowed on those.
+ * Set the time since the slot has become inactive. We get the current
+ * time beforehand to avoid system call while holding the spinlock.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
- now = GetCurrentTimestamp();
+ now = GetCurrentTimestamp();
if (slot->data.persistency == RS_PERSISTENT)
{
@@ -2369,16 +2366,11 @@ RestoreSlotFromDisk(const char *name)
slot->active_pid = 0;
/*
- * We set the last inactive time after loading the slot from the disk
- * into memory. Whoever acquires the slot i.e. makes the slot active
- * will reset it. We don't set it for the slots currently being synced
- * from the primary to the standby because such slots are typically
- * inactive as decoding is not allowed on those.
+ * Set the time since the slot has become inactive after loading the
+ * slot from the disk into memory. Whoever acquires the slot i.e.
+ * makes the slot active will reset it.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
- slot->inactive_since = GetCurrentTimestamp();
- else
- slot->inactive_since = 0;
+ slot->inactive_since = GetCurrentTimestamp();
restored = true;
break;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b08296605c..54e1008ae5 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3276,6 +3276,37 @@ sub create_logical_slot_on_standby
=pod
+=item $node->validate_slot_inactive_since(self, slot_name, reference_time)
+
+Validate inactive_since value of a given replication slot against the reference
+time and return it.
+
+=cut
+
+sub validate_slot_inactive_since
+{
+ my ($self, $slot_name, $reference_time) = @_;
+ my $name = $self->name;
+
+ my $inactive_since = $self->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
+ );
+
+ # Check that the inactive_since is sane
+ is($self->safe_psql('postgres',
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz > '$reference_time'::timestamptz;]
+ ),
+ 't',
+ "last inactive time for slot $slot_name is valid on node $name")
+ or die "could not validate captured inactive_since for slot $slot_name";
+
+ return $inactive_since;
+}
+
+=pod
+
=item $node->advance_wal(num)
Advance WAL of node by given number of segments.
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 3b9a306a8b..96b60cedbb 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -443,7 +443,7 @@ $primary4->safe_psql(
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the standby below.
my $inactive_since =
- capture_and_validate_slot_inactive_since($primary4, $sb4_slot, $slot_creation_time);
+ $primary4->validate_slot_inactive_since($sb4_slot, $slot_creation_time);
$standby4->start;
@@ -502,7 +502,7 @@ $publisher4->safe_psql('postgres',
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the subscriber below.
$inactive_since =
- capture_and_validate_slot_inactive_since($publisher4, $lsub4_slot, $slot_creation_time);
+ $publisher4->validate_slot_inactive_since($lsub4_slot, $slot_creation_time);
$subscriber4->start;
$subscriber4->safe_psql('postgres',
@@ -540,26 +540,4 @@ is( $publisher4->safe_psql(
$publisher4->stop;
$subscriber4->stop;
-# Capture and validate inactive_since of a given slot.
-sub capture_and_validate_slot_inactive_since
-{
- my ($node, $slot_name, $slot_creation_time) = @_;
-
- my $inactive_since = $node->safe_psql('postgres',
- qq(SELECT inactive_since FROM pg_replication_slots
- WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
- );
-
- # Check that the captured time is sane
- is( $node->safe_psql(
- 'postgres',
- qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
- '$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
- ),
- 't',
- "last inactive time for an active slot $slot_name is sane");
-
- return $inactive_since;
-}
-
done_testing();
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 869e3d2e91..e7d92e3276 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -35,6 +35,13 @@ my $subscriber1 = PostgreSQL::Test::Cluster->new('subscriber1');
$subscriber1->init;
$subscriber1->start;
+# Capture the time before the logical failover slot is created on the
+# primary. We refer to this publisher as the primary hereafter.
+my $slot_creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Create a slot on the publisher with failover disabled
$publisher->safe_psql('postgres',
"SELECT 'init' FROM pg_create_logical_replication_slot('lsub1_slot', 'pgoutput', false, false, false);"
@@ -174,6 +181,11 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary. Note that the slot
+# is not yet active.
+my $inactive_since_on_primary =
+ $primary->validate_slot_inactive_since('lsub1_slot', $slot_creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -181,6 +193,11 @@ $primary->wait_for_replay_catchup($standby1);
# Synchronize the primary server slots to the standby.
$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+my $slot_sync_time = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Confirm that the logical failover slots are created on the standby and are
# flagged as 'synced'
is( $standby1->safe_psql(
@@ -190,6 +207,19 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Capture the inactive_since of the synced slot on the standby
+my $inactive_since_on_standby =
+ $standby1->validate_slot_inactive_since('lsub1_slot', $slot_creation_time_on_primary);
+
+# Synced slot on the standby must get its own inactive_since.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz < '$inactive_since_on_standby'::timestamptz AND
+ '$inactive_since_on_standby'::timestamptz < '$slot_sync_time'::timestamptz;"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
@@ -237,6 +267,13 @@ is( $standby1->safe_psql(
$standby1->append_conf('postgresql.conf', 'max_slot_wal_keep_size = -1');
$standby1->reload;
+# Capture the time before the logical failover slot is created on the primary.
+# Note that the subscription creates the slot again on the primary.
+$slot_creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# To ensure that restart_lsn has moved to a recent WAL position, we re-create
# the subscription and the logical slot.
$subscriber1->safe_psql(
@@ -257,6 +294,11 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary. Note that the slot
+# is not yet active but has been dropped and recreated.
+$inactive_since_on_primary =
+ $primary->validate_slot_inactive_since('lsub1_slot', $slot_creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -808,8 +850,28 @@ $primary->reload;
$standby1->start;
$primary->wait_for_replay_catchup($standby1);
+# Capture the time before the standby is promoted
+my $promotion_time_on_primary = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
$standby1->promote;
+# Capture the inactive_since of the synced slot after the promotion.
+# Expectation here is that the slot gets its own inactive_since as part of the
+# promotion. We do this check before the slot is enabled on the new primary
+# below, otherwise the slot becomes active, setting inactive_since to NULL.
+my $inactive_since_on_new_primary =
+ $standby1->validate_slot_inactive_since('lsub1_slot', $promotion_time_on_primary);
+
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_new_primary'::timestamptz > '$inactive_since_on_primary'::timestamptz"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since on the new primary after promotion');
+
# Update subscription with the new primary's connection info
my $standby1_conninfo = $standby1->connstr . ' dbname=postgres';
$subscriber1->safe_psql('postgres',
--
2.34.1
v33-0002-Add-inactive_timeout-based-replication-slot-inva.patch (application/x-patch)
From abe5113e8431e0a691ef61c04288dea6a0f4a2ac Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 3 Apr 2024 14:39:36 +0000
Subject: [PATCH v33 2/2] Add inactive_timeout based replication slot
invalidation.
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set a timeout of, say,
1, 2, or 3 days at the slot level, after which the inactive slots
get invalidated.
To achieve the above, postgres introduces a GUC allowing users to
set an inactive timeout. Replication slots that are inactive for
longer than the specified amount of time get invalidated.
The invalidation check happens at various locations to keep the
invalidation status as current as possible; these locations include:
- Whenever the slot is acquired (the slot acquisition errors out
if the slot is already invalidated)
- During checkpoint
Note that this new invalidation mechanism won't kick in for the
slots that are currently being synced from the primary to the
standby.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/config.sgml | 32 ++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 11 +-
src/backend/replication/slot.c | 208 ++++++++++++-
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 8 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/044_invalidate_slots.pl | 274 ++++++++++++++++++
13 files changed, 540 insertions(+), 24 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 624518e0b0..626eac7125 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4547,6 +4547,38 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidates replication slots that are inactive for longer than the
+ specified amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is the default) disables
+ the timeout mechanism. This parameter can only be set in
+ the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+
+ <para>
+ The timeout is measured from the time the slot became inactive
+ (as recorded in its
+ <structfield>inactive_since</structfield> value) until the slot is
+ used again (i.e., its <structfield>active</structfield> field is set to true).
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>).
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 7ed617170f..063638beda 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2580,6 +2580,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for longer than the duration specified by the
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 755bf40a9a..080edc0d74 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -362,7 +362,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -575,6 +575,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
" name slot \"%s\" already exists on the standby",
remote_slot->name));
+ /*
+ * Skip the sync if the local slot is already invalidated. We do this
+ * beforehand to save on slot acquire and release.
+ */
+ if (slot->data.invalidated != RS_INVAL_NONE)
+ return false;
+
/*
* The slot has been synchronized before.
*
@@ -591,7 +598,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot property is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 3bddaae022..e5ee934b24 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -158,6 +160,7 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidatePossiblyInactiveSlot(ReplicationSlot *slot);
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
@@ -535,9 +538,14 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * If check_for_timeout_invalidation is true, the slot is checked for
+ * invalidation based on the replication_slot_inactive_timeout GUC, and if it
+ * has been invalidated, an error is raised after making the slot ours.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_timeout_invalidation)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +623,34 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * Check if the given slot can be invalidated based on its inactive
+ * timeout. If yes, persist the invalidated state to disk and then error
+ * out. We do this only after making the slot ours to avoid anyone else
+ * acquiring it while we check for its invalidation.
+ */
+ if (check_for_timeout_invalidation)
+ {
+ /* The slot is ours by now */
+ Assert(s->active_pid == MyProcPid);
+
+ if (InvalidateInactiveReplicationSlot(s, true))
+ {
+ /*
+ * If the slot has been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(MyReplicationSlot->data.name)),
+ errdetail("This slot has been invalidated because it was inactive for more than replication_slot_inactive_timeout.")));
+ }
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -781,7 +817,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -804,7 +840,7 @@ ReplicationSlotAlter(const char *name, bool failover)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -980,6 +1016,20 @@ ReplicationSlotDropPtr(ReplicationSlot *slot)
LWLockRelease(ReplicationSlotAllocationLock);
}
+/*
+ * Helper for ReplicationSlotSave
+ */
+static inline void
+SaveGivenReplicationSlot(ReplicationSlot *slot, int elevel)
+{
+ char path[MAXPGPATH];
+
+ Assert(slot != NULL);
+
+ sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+ SaveSlotToPath(slot, path, elevel);
+}
+
/*
* Serialize the currently acquired slot's state from memory to disk, thereby
* guaranteeing the current state will survive a crash.
@@ -987,12 +1037,21 @@ ReplicationSlotDropPtr(ReplicationSlot *slot)
void
ReplicationSlotSave(void)
{
- char path[MAXPGPATH];
+ SaveGivenReplicationSlot(MyReplicationSlot, ERROR);
+}
- Assert(MyReplicationSlot != NULL);
+/*
+ * Helper for ReplicationSlotMarkDirty
+ */
+static inline void
+MarkGivenReplicationSlotDirty(ReplicationSlot *slot)
+{
+ Assert(slot != NULL);
- sprintf(path, "pg_replslot/%s", NameStr(MyReplicationSlot->data.name));
- SaveSlotToPath(MyReplicationSlot, path, ERROR);
+ SpinLockAcquire(&slot->mutex);
+ slot->just_dirtied = true;
+ slot->dirty = true;
+ SpinLockRelease(&slot->mutex);
}
/*
@@ -1005,14 +1064,7 @@ ReplicationSlotSave(void)
void
ReplicationSlotMarkDirty(void)
{
- ReplicationSlot *slot = MyReplicationSlot;
-
- Assert(MyReplicationSlot != NULL);
-
- SpinLockAcquire(&slot->mutex);
- MyReplicationSlot->just_dirtied = true;
- MyReplicationSlot->dirty = true;
- SpinLockRelease(&slot->mutex);
+ MarkGivenReplicationSlotDirty(MyReplicationSlot);
}
/*
@@ -1506,6 +1558,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than replication_slot_inactive_timeout."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1550,6 +1605,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ /*
+ * Use InvalidateInactiveReplicationSlot for inactive timeout based
+ * invalidation.
+ */
+ Assert(cause != RS_INVAL_INACTIVE_TIMEOUT);
+
for (;;)
{
XLogRecPtr restart_lsn;
@@ -1619,6 +1680,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ /* not reachable */
+ Assert(false);
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1772,6 +1837,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1787,6 +1853,12 @@ InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
Assert(cause != RS_INVAL_WAL_REMOVED || oldestSegno > 0);
Assert(cause != RS_INVAL_NONE);
+ /*
+ * Use InvalidateInactiveReplicationSlot for inactive timeout based
+ * invalidation.
+ */
+ Assert(cause != RS_INVAL_INACTIVE_TIMEOUT);
+
if (max_replication_slots == 0)
return invalidated;
@@ -1823,6 +1895,95 @@ restart:
return invalidated;
}
+/*
+ * Invalidate given slot based on replication_slot_inactive_timeout GUC.
+ *
+ * Returns true if the slot has got invalidated.
+ *
+ * NB - this function also runs as part of checkpoint, so avoid raising errors
+ * if possible.
+ */
+bool
+InvalidateInactiveReplicationSlot(ReplicationSlot *slot, bool persist_state)
+{
+ if (!InvalidatePossiblyInactiveSlot(slot))
+ return false;
+
+ /* Make sure the invalidated state persists across server restart */
+ MarkGivenReplicationSlotDirty(slot);
+
+ if (persist_state)
+ SaveGivenReplicationSlot(slot, ERROR);
+
+ ReportSlotInvalidation(RS_INVAL_INACTIVE_TIMEOUT, false, 0,
+ slot->data.name, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, InvalidTransactionId);
+
+ return true;
+}
+
+/*
+ * Helper for InvalidateInactiveReplicationSlot
+ */
+static bool
+InvalidatePossiblyInactiveSlot(ReplicationSlot *slot)
+{
+ ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+
+ /*
+ * Note that we don't invalidate slots on the standby that are currently
+ * being synced from the primary, because such slots are typically
+ * considered not active as they don't actually perform logical decoding.
+ */
+ if (RecoveryInProgress() && slot->data.synced)
+ return false;
+
+ if (replication_slot_inactive_timeout == 0)
+ return false;
+
+ if (slot->inactive_since > 0)
+ {
+ TimestampTz now;
+
+ /*
+ * Do not invalidate the slots which are currently being synced from
+ * the primary to the standby.
+ */
+ if (RecoveryInProgress() && slot->data.synced)
+ return false;
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+
+ SpinLockAcquire(&slot->mutex);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC. We do this with the spinlock
+ * held to avoid race conditions -- for example the inactive_since
+ * could change, or the slot could be dropped.
+ */
+ if (TimestampDifferenceExceeds(slot->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
+ slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT;
+ }
+
+ SpinLockRelease(&slot->mutex);
+ LWLockRelease(ReplicationSlotControlLock);
+
+ return (invalidation_cause == RS_INVAL_INACTIVE_TIMEOUT);
+ }
+
+ return false;
+}
+
/*
* Flush all replication slots to disk.
*
@@ -1835,6 +1996,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated = false;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -1858,6 +2020,13 @@ CheckPointReplicationSlots(bool is_shutdown)
/* save the slot to disk, locking is handled in SaveSlotToPath() */
sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
+ /*
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
+ */
+ if (InvalidateInactiveReplicationSlot(s, false))
+ invalidated = true;
+
/*
* Slot's data is not flushed each time the confirmed_flush LSN is
* updated as that could lead to frequent writes. However, we decide
@@ -1884,6 +2053,13 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ /* If the slot has been invalidated, recalculate the resource limits */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index dd6c1d5a7e..9ad3e55704 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -539,7 +539,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bc40c454de..96eeb8b7d2 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -846,7 +846,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1459,7 +1459,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index c12784cbec..4149ff1ffe 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2971,6 +2971,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index baecde2841..2e1ad2eaca 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -335,6 +335,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7b937d1a0c..f0ac324ce9 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -230,6 +232,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -245,7 +248,8 @@ extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, bool failover);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_timeout_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(void);
extern void ReplicationSlotSave(void);
@@ -264,6 +268,8 @@ extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
+extern bool InvalidateInactiveReplicationSlot(ReplicationSlot *slot,
+ bool persist_state);
extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock);
extern int ReplicationSlotIndex(ReplicationSlot *slot);
extern bool ReplicationSlotName(int index, Name name);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 712924c2fa..0437ab5c46 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_wal_replay_wait.pl',
+ 't/044_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_slots.pl b/src/test/recovery/t/044_invalidate_slots.pl
new file mode 100644
index 0000000000..8f7967a253
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_slots.pl
@@ -0,0 +1,274 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot as well as logical
+# failover slot on primary due to replication_slot_inactive_timeout. Also,
+# check the logical failover slot synced on to the standby doesn't invalidate
+# the slot on its own, but gets the invalidated state from the remote slot on
+# the primary.
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoint during the test, otherwise, the test can get unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr_1 = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb1_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('lsub1_sync_slot', 'test_decoding', false, false, true);}
+);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot');
+]);
+
+$standby1->start;
+
+my $standby1_logstart = -s $standby1->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Synchronize the primary server slots to the standby.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced'.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot has synced as true on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$primary->reload;
+
+# Wait for the logical failover slot to become inactive on the primary. Note
+# that nobody has acquired that slot yet, so due to
+# replication_slot_inactive_timeout setting above it must get invalidated.
+wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart);
+
+# Set timeout on the standby also to check the synced slots don't get
+# invalidated due to timeout on the standby.
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$standby1->reload;
+
+# Now, sync the logical failover slot from the remote slot on the primary.
+# Note that the remote slot has already been invalidated due to inactive
+# timeout. Now, the standby must also see it as invalidated.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Wait for the inactive replication slot to be invalidated.
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for replication slot lsub1_sync_slot invalidation to be synced on standby";
+
+# Synced slot mustn't get invalidated on the standby; it must sync the
+# invalidation state from the primary. So, we must not see the slot's
+# invalidation message in the server log.
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"lsub1_sync_slot\"",
+ $standby1_logstart),
+ 'check that synced slot has not been invalidated on the standby');
+
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive
+wait_for_slot_invalidation($primary, 'sb1_slot', $logstart);
+
+# Testcase end: Invalidate streaming standby's slot as well as logical failover
+# slot on primary due to replication_slot_inactive_timeout. Also, check the
+# logical failover slot synced on to the standby doesn't invalidate the slot on
+# its own, but gets the invalidated state from the remote slot on the primary.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+
+my $publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$subscriber->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart);
+
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+# =============================================================================
+
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $name = $node->name;
+
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for replication slot to become inactive";
+
+ # Wait for the replication slot info to be updated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for info of replication slot $slot_name to be updated on node $name";
+
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+
+ # Wait for the inactive replication slot to be invalidated.
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for inactive replication slot $slot_name to be invalidated on node $name";
+
+ # Check that the invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot_name', '0/1');
+ ]);
+
+ ok( $stderr =~
+ /can no longer get changes from replication slot "$slot_name"/,
+ "detected error upon trying to acquire invalidated slot $slot_name on node $name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot_name";
+}
+
+# Check for invalidation of slot in server log.
+sub check_for_slot_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"",
+ $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot_name invalidation has been logged");
+}
+
+done_testing();
--
2.34.1
Hi,
On Wed, Apr 03, 2024 at 08:28:04PM +0530, Bharath Rupireddy wrote:
On Wed, Apr 3, 2024 at 6:46 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Just one comment on v32-0001:
+# Synced slot on the standby must get its own inactive_since.
+is( $standby1->safe_psql(
+        'postgres',
+        "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND
+                '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;"
+    ),
+    "t",
+    'synchronized slot has got its own inactive_since');
+
By using <= we are not testing that it must get its own inactive_since (as we
allow them to be equal in the test). I think we should just add some usleep()
where appropriate and deny equality during the tests on inactive_since.

Except for the above, v32-0001 LGTM.
Thanks. Please see the attached v33-0001 patch after removing equality
on inactive_since TAP tests.
Thanks! v33-0001 LGTM.
On Wed, Apr 3, 2024 at 1:47 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Some comments regarding v31-0002:
T2 ===
In case the slot is invalidated on the primary,
primary:
postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1';
slot_name | inactive_since | invalidation_reason
-----------+-------------------------------+---------------------
s1 | 2024-04-03 06:56:28.075637+00 | inactive_timeout

then on the standby we get:
standby:
postgres=# select slot_name, inactive_since, invalidation_reason from pg_replication_slots where slot_name = 's1';
slot_name | inactive_since | invalidation_reason
-----------+------------------------------+---------------------
s1 | 2024-04-03 07:06:43.37486+00 | inactive_timeout

shouldn't the slot be dropped/recreated instead of updating inactive_since?
The sync slots that are invalidated on the primary aren't dropped and
recreated on the standby.
Yeah, right (I was confused with synced slots that are invalidated locally).
However, I
found that the synced slot is acquired and released unnecessarily
after the invalidation_reason is synced from the primary. I added a
skip check in synchronize_one_slot to skip acquiring and releasing the
slot if it's locally found inactive. With this, inactive_since won't
get updated for invalidated sync slots on the standby as we don't
acquire and release the slot.
CR1 ===
Yeah, I can see:
@@ -575,6 +575,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
" name slot \"%s\" already exists on the standby",
remote_slot->name));
+ /*
+ * Skip the sync if the local slot is already invalidated. We do this
+ * beforehand to save on slot acquire and release.
+ */
+ if (slot->data.invalidated != RS_INVAL_NONE)
+ return false;
Thanks to the drop_local_obsolete_slots() call I think we are not missing the case
where the slot has been invalidated on the primary, invalidation reason has been
synced on the standby, and later the slot is dropped/recreated manually on the
primary (then it should be dropped/recreated on the standby too).
Also it seems we are not missing the case where a sync slot is invalidated
locally due to wal removal (it should be dropped/recreated).
CR5 ===
+ /*
+ * This function isn't expected to be called for inactive timeout based
+ * invalidation. A separate function InvalidateInactiveReplicationSlot is
+ * to be used for that.

Do you think it's worth explaining why?
Hm, I just wanted to point out the actual function here. I modified it
to something like the following, if others feel we don't need that, I
can remove it.
Sorry if I was not clear, but I meant to say "Do you think it's worth explaining
why we decided to create a dedicated function?" (currently we "just" explain that
we created one).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Apr 3, 2024 at 11:58 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Please find the attached v33 patches.
@@ -1368,6 +1416,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1400,6 +1449,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
}
Why do we want to update all synced slots' inactive_since values at
shutdown in spite of updating the value every time when releasing the
slot? It seems to contradict the fact that inactive_since is updated
when releasing or restoring the slot.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
@@ -1368,6 +1416,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);

@@ -1400,6 +1449,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
}

Why do we want to update all synced slots' inactive_since values at
shutdown in spite of updating the value every time when releasing the
slot? It seems to contradict the fact that inactive_since is updated
when releasing or restoring the slot.
It is to get the inactive_since right for the case where the standby
is promoted without a restart, similar to when a standby is promoted
with a restart, in which case inactive_since is set to the current
time in RestoreSlotFromDisk.
Imagine the slot is synced last time at time t1 and then a few hours
passed, the standby is promoted without a restart. If we don't set
inactive_since to current time in this case in ShutDownSlotSync, the
inactive timeout invalidation mechanism can kick in immediately.
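To make that concrete, here is a minimal, hypothetical check (not part of the
patch set) that one could run on the just-promoted standby, assuming a synced
failover slot named 'lsub1_sync_slot':

    SELECT slot_name, inactive_since, invalidation_reason
    FROM pg_replication_slots
    WHERE slot_name = 'lsub1_sync_slot';

With update_synced_slots_inactive_since() called from ShutDownSlotSync() during
promotion, inactive_since here reflects the promotion time rather than the last
sync time, so the slot does not immediately become a candidate for
inactive_timeout invalidation on the new primary.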
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Apr 3, 2024 at 8:28 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Apr 3, 2024 at 6:46 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Just one comment on v32-0001:
+# Synced slot on the standby must get its own inactive_since.
+is( $standby1->safe_psql(
+        'postgres',
+        "SELECT '$inactive_since_on_primary'::timestamptz <= '$inactive_since_on_standby'::timestamptz AND
+                '$inactive_since_on_standby'::timestamptz <= '$slot_sync_time'::timestamptz;"
+    ),
+    "t",
+    'synchronized slot has got its own inactive_since');
+
By using <= we are not testing that it must get its own inactive_since (as we
allow them to be equal in the test). I think we should just add some usleep()
where appropriate and deny equality during the tests on inactive_since.

Thanks. It looks like we can ignore the equality in all of the
inactive_since comparisons. IIUC, all the TAP tests do run with
primary and standbys on the single BF animals. And, it looks like
assigning the inactive_since timestamps to perl variables is giving
the microseconds precision level
(./tmp_check/log/regress_log_040_standby_failover_slots_sync:inactive_since
2024-04-03 14:30:09.691648+00). FWIW, we already have some TAP and SQL
tests relying on stats_reset timestamps without equality. So, I've
left the equality for the inactive_since tests.

Except for the above, v32-0001 LGTM.
Thanks. Please see the attached v33-0001 patch after removing equality
on inactive_since TAP tests.
The v33-0001 looks good to me. I have made minor changes in the
comments/commit message and removed one part of the test which was a
bit confusing and didn't seem to add much value. Let me know what you
think of the attached?
--
With Regards,
Amit Kapila.
Attachments:
v34-0001-Allow-synced-slots-to-have-their-inactive_since.patch (application/octet-stream)
From fd3aef5c67fb1ba361b9d1a70a47ffadf125b8d0 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 3 Apr 2024 14:38:38 +0000
Subject: [PATCH v34] Allow synced slots to have their inactive_since.
This commit does two things:
1) Maintains inactive_since for sync slots whenever the slot is released
just like any other regular slot.
2) Ensures the value is set to the current timestamp during the promotion
of standby to help correctly interpret the time after promotion. Whoever
acquires the slot i.e. makes the slot active will reset it to NULL.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik, Masahiko Sawada
Discussion: https://postgr.es/m/CAA4eK1KrPGwfZV9LYGidjxHeW+rxJ=E2ThjXvwRGLO=iLNuo=Q@mail.gmail.com
Discussion: https://postgr.es/m/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://postgr.es/m/CA+Tgmob_Ta-t2ty8QrKHBGnNLrf4ZYcwhGHGFsuUoFrAEDw4sA@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 7 +++
src/backend/replication/logical/slotsync.c | 48 ++++++++++++++++
src/backend/replication/slot.c | 22 +++-----
src/test/perl/PostgreSQL/Test/Cluster.pm | 31 ++++++++++
src/test/recovery/t/019_replslot_limit.pl | 26 +--------
.../t/040_standby_failover_slots_sync.pl | 56 +++++++++++++++++++
6 files changed, 151 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..7ed617170f 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2530,6 +2530,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
The time since the slot has become inactive.
<literal>NULL</literal> if the slot is currently being used.
+ Note that for slots on the standby that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>), the
+ <structfield>inactive_since</structfield> indicates the last
+ synchronization (see
+ <xref linkend="logicaldecoding-replication-slots-synchronization"/>)
+ time.
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 9ac847b780..ea770944e9 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -150,6 +150,7 @@ typedef struct RemoteSlot
} RemoteSlot;
static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_since(void);
/*
* If necessary, update the local synced slot's metadata based on the data
@@ -584,6 +585,11 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* overwriting 'invalidated' flag to remote_slot's value. See
* InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
* if the slot is not acquired by other processes.
+ *
+ * XXX: If it ever turns out that slot acquire/release is costly for
+ * cases when none of the slot properties is changed then we can do a
+ * pre-check to ensure that at least one of the slot properties is
+ * changed before acquiring the slot.
*/
ReplicationSlotAcquire(remote_slot->name, true);
@@ -1355,6 +1361,45 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
Assert(false);
}
+/*
+ * Update the inactive_since property for synced slots.
+ *
+ * Note that this function is currently called when we shutdown the slot sync
+ * machinery. This helps correctly interpret the inactive_since if the standby
+ * gets promoted without a restart.
+ */
+static void
+update_synced_slots_inactive_since(void)
+{
+ TimestampTz now = 0;
+
+ /* The slot sync worker mustn't be running by now */
+ Assert(SlotSyncCtx->pid == InvalidPid);
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is a synchronized slot */
+ if (s->in_use && s->data.synced)
+ {
+ Assert(SlotIsLogical(s));
+
+ /* Use the same inactive_since time for all the slots. */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+
+ SpinLockAcquire(&s->mutex);
+ s->inactive_since = now;
+ SpinLockRelease(&s->mutex);
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
+}
+
/*
* Shut down the slot sync worker.
*/
@@ -1368,6 +1413,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1400,6 +1446,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d778c0b921..3bddaae022 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -690,13 +690,10 @@ ReplicationSlotRelease(void)
}
/*
- * Set the last inactive time after marking the slot inactive. We don't
- * set it for the slots currently being synced from the primary to the
- * standby because such slots are typically inactive as decoding is not
- * allowed on those.
+ * Set the time since the slot has become inactive. We get the current
+ * time beforehand to avoid system call while holding the spinlock.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
- now = GetCurrentTimestamp();
+ now = GetCurrentTimestamp();
if (slot->data.persistency == RS_PERSISTENT)
{
@@ -2369,16 +2366,11 @@ RestoreSlotFromDisk(const char *name)
slot->active_pid = 0;
/*
- * We set the last inactive time after loading the slot from the disk
- * into memory. Whoever acquires the slot i.e. makes the slot active
- * will reset it. We don't set it for the slots currently being synced
- * from the primary to the standby because such slots are typically
- * inactive as decoding is not allowed on those.
+ * Set the time since the slot has become inactive after loading the
+ * slot from the disk into memory. Whoever acquires the slot i.e.
+ * makes the slot active will reset it.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
- slot->inactive_since = GetCurrentTimestamp();
- else
- slot->inactive_since = 0;
+ slot->inactive_since = GetCurrentTimestamp();
restored = true;
break;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b08296605c..54e1008ae5 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3276,6 +3276,37 @@ sub create_logical_slot_on_standby
=pod
+=item $node->validate_slot_inactive_since(self, slot_name, reference_time)
+
+Validate inactive_since value of a given replication slot against the reference
+time and return it.
+
+=cut
+
+sub validate_slot_inactive_since
+{
+ my ($self, $slot_name, $reference_time) = @_;
+ my $name = $self->name;
+
+ my $inactive_since = $self->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
+ );
+
+ # Check that the inactive_since is sane
+ is($self->safe_psql('postgres',
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz > '$reference_time'::timestamptz;]
+ ),
+ 't',
+ "last inactive time for slot $slot_name is valid on node $name")
+ or die "could not validate captured inactive_since for slot $slot_name";
+
+ return $inactive_since;
+}
+
+=pod
+
=item $node->advance_wal(num)
Advance WAL of node by given number of segments.
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 3b9a306a8b..96b60cedbb 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -443,7 +443,7 @@ $primary4->safe_psql(
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the standby below.
my $inactive_since =
- capture_and_validate_slot_inactive_since($primary4, $sb4_slot, $slot_creation_time);
+ $primary4->validate_slot_inactive_since($sb4_slot, $slot_creation_time);
$standby4->start;
@@ -502,7 +502,7 @@ $publisher4->safe_psql('postgres',
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the subscriber below.
$inactive_since =
- capture_and_validate_slot_inactive_since($publisher4, $lsub4_slot, $slot_creation_time);
+ $publisher4->validate_slot_inactive_since($lsub4_slot, $slot_creation_time);
$subscriber4->start;
$subscriber4->safe_psql('postgres',
@@ -540,26 +540,4 @@ is( $publisher4->safe_psql(
$publisher4->stop;
$subscriber4->stop;
-# Capture and validate inactive_since of a given slot.
-sub capture_and_validate_slot_inactive_since
-{
- my ($node, $slot_name, $slot_creation_time) = @_;
-
- my $inactive_since = $node->safe_psql('postgres',
- qq(SELECT inactive_since FROM pg_replication_slots
- WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
- );
-
- # Check that the captured time is sane
- is( $node->safe_psql(
- 'postgres',
- qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
- '$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
- ),
- 't',
- "last inactive time for an active slot $slot_name is sane");
-
- return $inactive_since;
-}
-
done_testing();
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 869e3d2e91..ea05ea2769 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -35,6 +35,13 @@ my $subscriber1 = PostgreSQL::Test::Cluster->new('subscriber1');
$subscriber1->init;
$subscriber1->start;
+# Capture the time before the logical failover slot is created on the
+# primary. We later call this publisher as primary anyway.
+my $slot_creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Create a slot on the publisher with failover disabled
$publisher->safe_psql('postgres',
"SELECT 'init' FROM pg_create_logical_replication_slot('lsub1_slot', 'pgoutput', false, false, false);"
@@ -174,6 +181,11 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary. Note that the slot
+# will be inactive since the corresponding subscription is disabled..
+my $inactive_since_on_primary =
+ $primary->validate_slot_inactive_since('lsub1_slot', $slot_creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -190,6 +202,18 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Capture the inactive_since of the synced slot on the standby
+my $inactive_since_on_standby =
+ $standby1->validate_slot_inactive_since('lsub1_slot', $slot_creation_time_on_primary);
+
+# Synced slot on the standby must get its own inactive_since.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz < '$inactive_since_on_standby'::timestamptz;"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
@@ -237,6 +261,13 @@ is( $standby1->safe_psql(
$standby1->append_conf('postgresql.conf', 'max_slot_wal_keep_size = -1');
$standby1->reload;
+# Capture the time before the logical failover slot is created on the primary.
+# Note that the subscription creates the slot again on the primary.
+$slot_creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# To ensure that restart_lsn has moved to a recent WAL position, we re-create
# the subscription and the logical slot.
$subscriber1->safe_psql(
@@ -257,6 +288,11 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary. Note that the slot
+# will be inactive since the corresponding subscription is disabled.
+$inactive_since_on_primary =
+ $primary->validate_slot_inactive_since('lsub1_slot', $slot_creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -808,8 +844,28 @@ $primary->reload;
$standby1->start;
$primary->wait_for_replay_catchup($standby1);
+# Capture the time before the standby is promoted
+my $promotion_time_on_primary = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
$standby1->promote;
+# Capture the inactive_since of the synced slot after the promotion.
+# The expectation here is that the slot gets its inactive_since as part of the
+# promotion. We do this check before the slot is enabled on the new primary
+# below, otherwise, the slot gets active setting inactive_since to NULL.
+my $inactive_since_on_new_primary =
+ $standby1->validate_slot_inactive_since('lsub1_slot', $promotion_time_on_primary);
+
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_new_primary'::timestamptz > '$inactive_since_on_primary'::timestamptz"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since on the new primary after promotion');
+
# Update subscription with the new primary's connection info
my $standby1_conninfo = $standby1->connstr . ' dbname=postgres';
$subscriber1->safe_psql('postgres',
--
2.28.0.windows.1
On Thu, Apr 4, 2024 at 10:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
The v33-0001 looks good to me. I have made minor changes in the
comments/commit message and removed one part of the test which was a
bit confusing and didn't seem to add much value. Let me know what you
think of the attached?
Thanks for the changes. v34-0001 LGTM.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Apr 4, 2024 at 1:34 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
@@ -1368,6 +1416,7 @@ ShutDownSlotSync(void)
 if (SlotSyncCtx->pid == InvalidPid)
 {
 SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
 return;
 }
 SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1400,6 +1449,8 @@ ShutDownSlotSync(void)
 }
 SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
 }

Why do we want to update all synced slots' inactive_since values at
shutdown in spite of updating the value every time when releasing the
slot? It seems to contradict the fact that inactive_since is updated
when releasing or restoring the slot.

It is to get the inactive_since right for the cases where the standby
is promoted without a restart similar to when a standby is promoted
with restart in which case the inactive_since is set to current time
in RestoreSlotFromDisk.

Imagine the slot is synced last time at time t1 and then a few hours
passed, the standby is promoted without a restart. If we don't set
inactive_since to current time in this case in ShutDownSlotSync, the
inactive timeout invalidation mechanism can kick in immediately.
Thank you for the explanation! I understood the needs.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
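To make the promotion-without-restart scenario discussed above concrete, here
is a rough SQL sketch (not from the patch itself) of what one would expect to
observe on a standby with the change applied; the slot name is borrowed from
the 040 test, and the behaviour described in the comments is an assumption
based on the explanation above:

    -- On the standby, a synced slot's inactive_since reflects the last sync time.
    SELECT slot_name, synced, inactive_since
    FROM pg_replication_slots
    WHERE slot_name = 'lsub1_slot';

    -- Promote the standby without restarting the server.
    SELECT pg_promote();

    -- After promotion, and before anything acquires the slot, inactive_since
    -- has been moved forward to around the promotion time (via ShutDownSlotSync),
    -- so any configured inactive timeout is measured from promotion rather than
    -- from a sync that may have happened hours earlier.
    SELECT slot_name, inactive_since
    FROM pg_replication_slots
    WHERE slot_name = 'lsub1_slot';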
On Thu, Apr 4, 2024 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Apr 4, 2024 at 1:34 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

@@ -1368,6 +1416,7 @@ ShutDownSlotSync(void)
 if (SlotSyncCtx->pid == InvalidPid)
 {
 SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
 return;
 }
 SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1400,6 +1449,8 @@ ShutDownSlotSync(void)
 }
 SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
 }

Why do we want to update all synced slots' inactive_since values at
shutdown in spite of updating the value every time when releasing the
slot? It seems to contradict the fact that inactive_since is updated
when releasing or restoring the slot.

It is to get the inactive_since right for the cases where the standby
is promoted without a restart similar to when a standby is promoted
with restart in which case the inactive_since is set to current time
in RestoreSlotFromDisk.

Imagine the slot is synced last time at time t1 and then a few hours
passed, the standby is promoted without a restart. If we don't set
inactive_since to current time in this case in ShutDownSlotSync, the
inactive timeout invalidation mechanism can kick in immediately.

Thank you for the explanation! I understood the needs.
Do you want to review the v34_0001* further or shall I proceed with
the commit of the same?
--
With Regards,
Amit Kapila.
On Thu, Apr 4, 2024 at 5:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Apr 4, 2024 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Thu, Apr 4, 2024 at 1:34 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Thu, Apr 4, 2024 at 9:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

@@ -1368,6 +1416,7 @@ ShutDownSlotSync(void)
 if (SlotSyncCtx->pid == InvalidPid)
 {
 SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
 return;
 }
 SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1400,6 +1449,8 @@ ShutDownSlotSync(void)
 }
 SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
 }

Why do we want to update all synced slots' inactive_since values at
shutdown in spite of updating the value every time when releasing the
slot? It seems to contradict the fact that inactive_since is updated
when releasing or restoring the slot.

It is to get the inactive_since right for the cases where the standby
is promoted without a restart similar to when a standby is promoted
with restart in which case the inactive_since is set to current time
in RestoreSlotFromDisk.

Imagine the slot is synced last time at time t1 and then a few hours
passed, the standby is promoted without a restart. If we don't set
inactive_since to current time in this case in ShutDownSlotSync, the
inactive timeout invalidation mechanism can kick in immediately.

Thank you for the explanation! I understood the needs.
Do you want to review the v34_0001* further or shall I proceed with
the commit of the same?
Thanks for asking. The v34-0001 patch looks good to me.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Thu, Apr 4, 2024 at 11:12 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Thu, Apr 4, 2024 at 10:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
The v33-0001 looks good to me. I have made minor changes in the
comments/commit message and removed one part of the test which was a
bit confusing and didn't seem to add much value. Let me know what you
think of the attached?

Thanks for the changes. v34-0001 LGTM.
I was doing a final review before pushing 0001 and found that
'inactive_since' could be set twice during startup after promotion,
once while restoring slots and then via ShutDownSlotSync(). The reason
is that ShutDownSlotSync() will be invoked in normal startup on
primary though it won't do anything apart from setting inactive_since
if we have synced slots. I think you need to check 'StandbyMode' in
update_synced_slots_inactive_since() and return if the same is not
set. We can't use 'InRecovery' flag as that will be set even during
crash recovery.
Can you please test this once unless you don't agree with the above theory?
--
With Regards,
Amit Kapila.
On Thu, Apr 4, 2024 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Thanks for the changes. v34-0001 LGTM.
I was doing a final review before pushing 0001 and found that
'inactive_since' could be set twice during startup after promotion,
once while restoring slots and then via ShutDownSlotSync(). The reason
is that ShutDownSlotSync() will be invoked in normal startup on
primary though it won't do anything apart from setting inactive_since
if we have synced slots. I think you need to check 'StandbyMode' in
update_synced_slots_inactive_since() and return if the same is not
set. We can't use 'InRecovery' flag as that will be set even during
crash recovery.

Can you please test this once unless you don't agree with the above theory?
Nice catch. I've verified that update_synced_slots_inactive_since is
called even for normal server startups/crash recovery. I've added a
check to exit if the StandbyMode isn't set.
Please find the attached v35 patch.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v35-0001-Allow-synced-slots-to-have-their-inactive_since.patchapplication/x-patch; name=v35-0001-Allow-synced-slots-to-have-their-inactive_since.patchDownload
From 03a08bd5ab3a305d199d9427491be37d1a1fd61b Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 4 Apr 2024 12:06:17 +0000
Subject: [PATCH v35] Allow synced slots to have their inactive_since.
This commit does two things:
1) Maintains inactive_since for sync slots whenever the slot is released
just like any other regular slot.
2) Ensures the value is set to the current timestamp during the promotion
of standby to help correctly interpret the time after promotion. Whoever
acquires the slot i.e. makes the slot active will reset it to NULL.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik, Masahiko Sawada
Discussion: https://postgr.es/m/CAA4eK1KrPGwfZV9LYGidjxHeW+rxJ=E2ThjXvwRGLO=iLNuo=Q@mail.gmail.com
Discussion: https://postgr.es/m/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://postgr.es/m/CA+Tgmob_Ta-t2ty8QrKHBGnNLrf4ZYcwhGHGFsuUoFrAEDw4sA@mail.gmail.com
---
doc/src/sgml/system-views.sgml | 7 +++
src/backend/replication/logical/slotsync.c | 51 +++++++++++++++++
src/backend/replication/slot.c | 22 +++-----
src/test/perl/PostgreSQL/Test/Cluster.pm | 31 ++++++++++
src/test/recovery/t/019_replslot_limit.pl | 26 +--------
.../t/040_standby_failover_slots_sync.pl | 56 +++++++++++++++++++
6 files changed, 154 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 3c8dca8ca3..7ed617170f 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2530,6 +2530,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<para>
The time since the slot has become inactive.
<literal>NULL</literal> if the slot is currently being used.
+ Note that for slots on the standby that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>), the
+ <structfield>inactive_since</structfield> indicates the last
+ synchronization (see
+ <xref linkend="logicaldecoding-replication-slots-synchronization"/>)
+ time.
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 9ac847b780..75e67b966d 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -150,6 +150,7 @@ typedef struct RemoteSlot
} RemoteSlot;
static void slotsync_failure_callback(int code, Datum arg);
+static void update_synced_slots_inactive_since(void);
/*
* If necessary, update the local synced slot's metadata based on the data
@@ -584,6 +585,11 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* overwriting 'invalidated' flag to remote_slot's value. See
* InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
* if the slot is not acquired by other processes.
+ *
+ * XXX: If it ever turns out that slot acquire/release is costly for
+ * cases when none of the slot properties is changed then we can do a
+ * pre-check to ensure that at least one of the slot properties is
+ * changed before acquiring the slot.
*/
ReplicationSlotAcquire(remote_slot->name, true);
@@ -1355,6 +1361,48 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
Assert(false);
}
+/*
+ * Update the inactive_since property for synced slots.
+ *
+ * Note that this function is currently called when we shutdown the slot sync
+ * machinery during standby promotion. This helps correctly interpret the
+ * inactive_since if the standby gets promoted without a restart.
+ */
+static void
+update_synced_slots_inactive_since(void)
+{
+ TimestampTz now = 0;
+
+ if (!StandbyMode)
+ return;
+
+ /* The slot sync worker mustn't be running by now */
+ Assert(SlotSyncCtx->pid == InvalidPid);
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+ /* Check if it is a synchronized slot */
+ if (s->in_use && s->data.synced)
+ {
+ Assert(SlotIsLogical(s));
+
+ /* Use the same inactive_since time for all the slots. */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+
+ SpinLockAcquire(&s->mutex);
+ s->inactive_since = now;
+ SpinLockRelease(&s->mutex);
+ }
+ }
+
+ LWLockRelease(ReplicationSlotControlLock);
+}
+
/*
* Shut down the slot sync worker.
*/
@@ -1368,6 +1416,7 @@ ShutDownSlotSync(void)
if (SlotSyncCtx->pid == InvalidPid)
{
SpinLockRelease(&SlotSyncCtx->mutex);
+ update_synced_slots_inactive_since();
return;
}
SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1400,6 +1449,8 @@ ShutDownSlotSync(void)
}
SpinLockRelease(&SlotSyncCtx->mutex);
+
+ update_synced_slots_inactive_since();
}
/*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d778c0b921..3bddaae022 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -690,13 +690,10 @@ ReplicationSlotRelease(void)
}
/*
- * Set the last inactive time after marking the slot inactive. We don't
- * set it for the slots currently being synced from the primary to the
- * standby because such slots are typically inactive as decoding is not
- * allowed on those.
+ * Set the time since the slot has become inactive. We get the current
+ * time beforehand to avoid system call while holding the spinlock.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
- now = GetCurrentTimestamp();
+ now = GetCurrentTimestamp();
if (slot->data.persistency == RS_PERSISTENT)
{
@@ -2369,16 +2366,11 @@ RestoreSlotFromDisk(const char *name)
slot->active_pid = 0;
/*
- * We set the last inactive time after loading the slot from the disk
- * into memory. Whoever acquires the slot i.e. makes the slot active
- * will reset it. We don't set it for the slots currently being synced
- * from the primary to the standby because such slots are typically
- * inactive as decoding is not allowed on those.
+ * Set the time since the slot has become inactive after loading the
+ * slot from the disk into memory. Whoever acquires the slot i.e.
+ * makes the slot active will reset it.
*/
- if (!(RecoveryInProgress() && slot->data.synced))
- slot->inactive_since = GetCurrentTimestamp();
- else
- slot->inactive_since = 0;
+ slot->inactive_since = GetCurrentTimestamp();
restored = true;
break;
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index b08296605c..54e1008ae5 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3276,6 +3276,37 @@ sub create_logical_slot_on_standby
=pod
+=item $node->validate_slot_inactive_since(self, slot_name, reference_time)
+
+Validate inactive_since value of a given replication slot against the reference
+time and return it.
+
+=cut
+
+sub validate_slot_inactive_since
+{
+ my ($self, $slot_name, $reference_time) = @_;
+ my $name = $self->name;
+
+ my $inactive_since = $self->safe_psql('postgres',
+ qq(SELECT inactive_since FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
+ );
+
+ # Check that the inactive_since is sane
+ is($self->safe_psql('postgres',
+ qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+ '$inactive_since'::timestamptz > '$reference_time'::timestamptz;]
+ ),
+ 't',
+ "last inactive time for slot $slot_name is valid on node $name")
+ or die "could not validate captured inactive_since for slot $slot_name";
+
+ return $inactive_since;
+}
+
+=pod
+
=item $node->advance_wal(num)
Advance WAL of node by given number of segments.
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 3b9a306a8b..96b60cedbb 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -443,7 +443,7 @@ $primary4->safe_psql(
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the standby below.
my $inactive_since =
- capture_and_validate_slot_inactive_since($primary4, $sb4_slot, $slot_creation_time);
+ $primary4->validate_slot_inactive_since($sb4_slot, $slot_creation_time);
$standby4->start;
@@ -502,7 +502,7 @@ $publisher4->safe_psql('postgres',
# Get inactive_since value after the slot's creation. Note that the slot is
# still inactive till it's used by the subscriber below.
$inactive_since =
- capture_and_validate_slot_inactive_since($publisher4, $lsub4_slot, $slot_creation_time);
+ $publisher4->validate_slot_inactive_since($lsub4_slot, $slot_creation_time);
$subscriber4->start;
$subscriber4->safe_psql('postgres',
@@ -540,26 +540,4 @@ is( $publisher4->safe_psql(
$publisher4->stop;
$subscriber4->stop;
-# Capture and validate inactive_since of a given slot.
-sub capture_and_validate_slot_inactive_since
-{
- my ($node, $slot_name, $slot_creation_time) = @_;
-
- my $inactive_since = $node->safe_psql('postgres',
- qq(SELECT inactive_since FROM pg_replication_slots
- WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
- );
-
- # Check that the captured time is sane
- is( $node->safe_psql(
- 'postgres',
- qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
- '$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
- ),
- 't',
- "last inactive time for an active slot $slot_name is sane");
-
- return $inactive_since;
-}
-
done_testing();
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index 869e3d2e91..ea05ea2769 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -35,6 +35,13 @@ my $subscriber1 = PostgreSQL::Test::Cluster->new('subscriber1');
$subscriber1->init;
$subscriber1->start;
+# Capture the time before the logical failover slot is created on the
+# primary. We later call this publisher as primary anyway.
+my $slot_creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# Create a slot on the publisher with failover disabled
$publisher->safe_psql('postgres',
"SELECT 'init' FROM pg_create_logical_replication_slot('lsub1_slot', 'pgoutput', false, false, false);"
@@ -174,6 +181,11 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary. Note that the slot
+# will be inactive since the corresponding subscription is disabled..
+my $inactive_since_on_primary =
+ $primary->validate_slot_inactive_since('lsub1_slot', $slot_creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -190,6 +202,18 @@ is( $standby1->safe_psql(
"t",
'logical slots have synced as true on standby');
+# Capture the inactive_since of the synced slot on the standby
+my $inactive_since_on_standby =
+ $standby1->validate_slot_inactive_since('lsub1_slot', $slot_creation_time_on_primary);
+
+# Synced slot on the standby must get its own inactive_since.
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_primary'::timestamptz < '$inactive_since_on_standby'::timestamptz;"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since');
+
##################################################
# Test that the synchronized slot will be dropped if the corresponding remote
# slot on the primary server has been dropped.
@@ -237,6 +261,13 @@ is( $standby1->safe_psql(
$standby1->append_conf('postgresql.conf', 'max_slot_wal_keep_size = -1');
$standby1->reload;
+# Capture the time before the logical failover slot is created on the primary.
+# Note that the subscription creates the slot again on the primary.
+$slot_creation_time_on_primary = $publisher->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
# To ensure that restart_lsn has moved to a recent WAL position, we re-create
# the subscription and the logical slot.
$subscriber1->safe_psql(
@@ -257,6 +288,11 @@ $primary->poll_query_until(
"SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'lsub1_slot' AND active = 'f'",
1);
+# Capture the inactive_since of the slot from the primary. Note that the slot
+# will be inactive since the corresponding subscription is disabled.
+$inactive_since_on_primary =
+ $primary->validate_slot_inactive_since('lsub1_slot', $slot_creation_time_on_primary);
+
# Wait for the standby to catch up so that the standby is not lagging behind
# the subscriber.
$primary->wait_for_replay_catchup($standby1);
@@ -808,8 +844,28 @@ $primary->reload;
$standby1->start;
$primary->wait_for_replay_catchup($standby1);
+# Capture the time before the standby is promoted
+my $promotion_time_on_primary = $standby1->safe_psql(
+ 'postgres', qq[
+ SELECT current_timestamp;
+]);
+
$standby1->promote;
+# Capture the inactive_since of the synced slot after the promotion.
+# The expectation here is that the slot gets its inactive_since as part of the
+# promotion. We do this check before the slot is enabled on the new primary
+# below, otherwise, the slot gets active setting inactive_since to NULL.
+my $inactive_since_on_new_primary =
+ $standby1->validate_slot_inactive_since('lsub1_slot', $promotion_time_on_primary);
+
+is( $standby1->safe_psql(
+ 'postgres',
+ "SELECT '$inactive_since_on_new_primary'::timestamptz > '$inactive_since_on_primary'::timestamptz"
+ ),
+ "t",
+ 'synchronized slot has got its own inactive_since on the new primary after promotion');
+
# Update subscription with the new primary's connection info
my $standby1_conninfo = $standby1->connstr . ' dbname=postgres';
$subscriber1->safe_psql('postgres',
--
2.34.1
On Thu, Apr 4, 2024 at 5:53 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Thu, Apr 4, 2024 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Thanks for the changes. v34-0001 LGTM.
I was doing a final review before pushing 0001 and found that
'inactive_since' could be set twice during startup after promotion,
once while restoring slots and then via ShutDownSlotSync(). The reason
is that ShutDownSlotSync() will be invoked in normal startup on
primary though it won't do anything apart from setting inactive_since
if we have synced slots. I think you need to check 'StandbyMode' in
update_synced_slots_inactive_since() and return if the same is not
set. We can't use 'InRecovery' flag as that will be set even during
crash recovery.

Can you please test this once unless you don't agree with the above theory?
Nice catch. I've verified that update_synced_slots_inactive_since is
called even for normal server startups/crash recovery. I've added a
check to exit if the StandbyMode isn't set.

Please find the attached v35 patch.
Thanks for the patch. Tested it, works well. A few cosmetic changes are needed:
in 040 test file:
1)
# Capture the inactive_since of the slot from the primary. Note that the slot
# will be inactive since the corresponding subscription is disabled..
There are two dots ('..') at the end; replace them with one.
2)
# Synced slot on the standby must get its own inactive_since.
The trailing '.' is not needed in a single-line comment (to be consistent with
neighbouring comments).
3)
update_synced_slots_inactive_since():
if (!StandbyMode)
return;
It will be good to add comments here.
thanks
Shveta
On Wed, Apr 3, 2024 at 9:57 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
shouldn't the slot be dropped/recreated instead of updating inactive_since?
The sync slots that are invalidated on the primary aren't dropped and
recreated on the standby.

Yeah, right (I was confused with synced slots that are invalidated locally).
However, I
found that the synced slot is acquired and released unnecessarily
after the invalidation_reason is synced from the primary. I added a
skip check in synchronize_one_slot to skip acquiring and releasing the
slot if it's locally found inactive. With this, inactive_since won't
get updated for invalidated sync slots on the standby as we don't
acquire and release the slot.

CR1 ===
Yeah, I can see:
@@ -575,6 +575,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
" name slot \"%s\" already exists on the standby",
 remote_slot->name));
+ /*
+ * Skip the sync if the local slot is already invalidated. We do this
+ * beforehand to save on slot acquire and release.
+ */
+ if (slot->data.invalidated != RS_INVAL_NONE)
+ return false;

Thanks to the drop_local_obsolete_slots() call I think we are not missing the case
where the slot has been invalidated on the primary, invalidation reason has been
synced on the standby and later the slot is dropped/recreated manually on the
primary (then it should be dropped/recreated on the standby too).

Also it seems we are not missing the case where a sync slot is invalidated
locally due to wal removal (it should be dropped/recreated).
Right.
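As a quick way to see this behaviour end to end once the patch is in place,
here is a sketch along the lines of the 044 test it adds (slot name taken from
that test; the expected outcome in the comments is an assumption based on the
discussion above): after the remote slot has been invalidated on the primary,
a sync on the standby carries the invalidation state over instead of the
standby invalidating or dropping the synced slot on its own.

    -- On the standby:
    SELECT pg_sync_replication_slots();

    SELECT slot_name, synced, invalidation_reason
    FROM pg_replication_slots
    WHERE slot_name = 'lsub1_sync_slot';
    -- Expected: synced = t and invalidation_reason = 'inactive_timeout',
    -- copied from the remote slot, with no local "invalidating obsolete
    -- replication slot" message in the standby's server log.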
CR5 ===
+ /*
+ * This function isn't expected to be called for inactive timeout based
+ * invalidation. A separate function InvalidateInactiveReplicationSlot is
+ * to be used for that.

Do you think it's worth to explain why?
Hm, I just wanted to point out the actual function here. I modified it
to something like the following, if others feel we don't need that, I
can remove it.

Sorry if I was not clear but I meant to say "Do you think it's worth to explain
why we decided to create a dedicated function"? (currently we "just" explain that
we created one).
We added a new function (InvalidateInactiveReplicationSlot) to
invalidate a slot based on inactive timeout because:
1) we do the inactive timeout invalidation at slot level, as opposed to
InvalidateObsoleteReplicationSlots which loops over all the slots,
2) InvalidatePossiblyObsoleteSlot releases the lock in some cases and
carries a lot of code that is unneeded for the inactive timeout
invalidation check,
3) we want some control over saving the slot to disk, because we hook the
inactive timeout invalidation into the loop that checkpoints the slot
info to disk in CheckPointReplicationSlots.
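For context, here is a rough sketch (the timeout value and query are only
examples, not part of the patch) of how this mechanism is expected to surface
to users once the patch is applied: the timeout is a plain GUC, and invalidated
slots become visible through pg_replication_slots.

    -- Invalidate replication slots that stay inactive for more than an hour.
    ALTER SYSTEM SET replication_slot_inactive_timeout = '1h';
    SELECT pg_reload_conf();

    -- After the next checkpoint (or the next attempt to acquire the slot),
    -- an affected slot is reported as invalidated:
    SELECT slot_name, inactive_since, invalidation_reason
    FROM pg_replication_slots
    WHERE invalidation_reason = 'inactive_timeout';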
I've added a comment atop InvalidateInactiveReplicationSlot.
Please find the attached v36 patch.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v36-0001-Add-inactive_timeout-based-replication-slot-inva.patchapplication/octet-stream; name=v36-0001-Add-inactive_timeout-based-replication-slot-inva.patchDownload
From 1410136f2d7a5b6cf0b52dba63ded96f177046a7 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 5 Apr 2024 05:48:14 +0000
Subject: [PATCH v36] Add inactive_timeout based replication slot invalidation.
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky. Because the amount of WAL a
customer generates, and their allocated storage will vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set a timeout of say 1
or 2 or 3 days at slot level, after which the inactive slots get
invalidated.
To achieve the above, postgres introduces a GUC allowing users
set inactive timeout. The replication slots that are inactive
for longer than specified amount of time get invalidated.
The invalidation check happens at various locations to help being
as latest as possible, these locations include the following:
- Whenever the slot is acquired and the slot acquisition errors
out if invalidated.
- During checkpoint
Note that this new invalidation mechanism won't kick-in for the
slots that are currently being synced from the primary to the
standby.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/config.sgml | 32 ++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 11 +-
src/backend/replication/slot.c | 215 +++++++++++++-
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 8 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/044_invalidate_slots.pl | 274 ++++++++++++++++++
13 files changed, 547 insertions(+), 24 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 624518e0b0..626eac7125 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4547,6 +4547,38 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidates replication slots that are inactive for longer than
+ specified amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is default) disables
+ the timeout mechanism. This parameter can only be set in
+ the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+
+ <para>
+ The timeout is measured from the time since the slot has become
+ inactive (known from its
+ <structfield>inactive_since</structfield> value) until it gets
+ used (i.e., its <structfield>active</structfield> is set to true).
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>).
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 7ed617170f..063638beda 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2580,6 +2580,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for the duration specified by
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index d18e2c7342..e92f559539 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -362,7 +362,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -575,6 +575,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
" name slot \"%s\" already exists on the standby",
remote_slot->name));
+ /*
+ * Skip the sync if the local slot is already invalidated. We do this
+ * beforehand to avoid slot acquire and release.
+ */
+ if (slot->data.invalidated != RS_INVAL_NONE)
+ return false;
+
/*
* The slot has been synchronized before.
*
@@ -591,7 +598,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 3bddaae022..7006c2e2fd 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -158,6 +160,7 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidatePossiblyInactiveSlot(ReplicationSlot *slot);
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
@@ -535,9 +538,14 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * If check_for_timeout_invalidation is true, the slot is checked for
+ * invalidation based on replication_slot_inactive_timeout GUC, and an error is
+ * raised after making the slot ours.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_timeout_invalidation)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +623,34 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * Check if the given slot can be invalidated based on its inactive
+ * timeout. If yes, persist the invalidated state to disk and then error
+ * out. We do this only after making the slot ours to avoid anyone else
+ * acquiring it while we check for its invalidation.
+ */
+ if (check_for_timeout_invalidation)
+ {
+ /* The slot is ours by now */
+ Assert(s->active_pid == MyProcPid);
+
+ if (InvalidateInactiveReplicationSlot(s, true))
+ {
+ /*
+ * If the slot has been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(MyReplicationSlot->data.name)),
+ errdetail("This slot has been invalidated because it was inactive for more than replication_slot_inactive_timeout.")));
+ }
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -781,7 +817,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -804,7 +840,7 @@ ReplicationSlotAlter(const char *name, bool failover)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -980,6 +1016,20 @@ ReplicationSlotDropPtr(ReplicationSlot *slot)
LWLockRelease(ReplicationSlotAllocationLock);
}
+/*
+ * Helper for ReplicationSlotSave
+ */
+static inline void
+SaveGivenReplicationSlot(ReplicationSlot *slot, int elevel)
+{
+ char path[MAXPGPATH];
+
+ Assert(slot != NULL);
+
+ sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+ SaveSlotToPath(slot, path, elevel);
+}
+
/*
* Serialize the currently acquired slot's state from memory to disk, thereby
* guaranteeing the current state will survive a crash.
@@ -987,12 +1037,21 @@ ReplicationSlotDropPtr(ReplicationSlot *slot)
void
ReplicationSlotSave(void)
{
- char path[MAXPGPATH];
+ SaveGivenReplicationSlot(MyReplicationSlot, ERROR);
+}
- Assert(MyReplicationSlot != NULL);
+/*
+ * Helper for ReplicationSlotMarkDirty
+ */
+static inline void
+MarkGivenReplicationSlotDirty(ReplicationSlot *slot)
+{
+ Assert(slot != NULL);
- sprintf(path, "pg_replslot/%s", NameStr(MyReplicationSlot->data.name));
- SaveSlotToPath(MyReplicationSlot, path, ERROR);
+ SpinLockAcquire(&slot->mutex);
+ slot->just_dirtied = true;
+ slot->dirty = true;
+ SpinLockRelease(&slot->mutex);
}
/*
@@ -1005,14 +1064,7 @@ ReplicationSlotSave(void)
void
ReplicationSlotMarkDirty(void)
{
- ReplicationSlot *slot = MyReplicationSlot;
-
- Assert(MyReplicationSlot != NULL);
-
- SpinLockAcquire(&slot->mutex);
- MyReplicationSlot->just_dirtied = true;
- MyReplicationSlot->dirty = true;
- SpinLockRelease(&slot->mutex);
+ MarkGivenReplicationSlotDirty(MyReplicationSlot);
}
/*
@@ -1506,6 +1558,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than replication_slot_inactive_timeout."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1550,6 +1605,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ /*
+ * Use InvalidateInactiveReplicationSlot for inactive timeout based
+ * invalidation.
+ */
+ Assert(cause != RS_INVAL_INACTIVE_TIMEOUT);
+
for (;;)
{
XLogRecPtr restart_lsn;
@@ -1619,6 +1680,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ /* not reachable */
+ Assert(false);
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1787,6 +1852,12 @@ InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
Assert(cause != RS_INVAL_WAL_REMOVED || oldestSegno > 0);
Assert(cause != RS_INVAL_NONE);
+ /*
+ * Use InvalidateInactiveReplicationSlot for inactive timeout based
+ * invalidation.
+ */
+ Assert(cause != RS_INVAL_INACTIVE_TIMEOUT);
+
if (max_replication_slots == 0)
return invalidated;
@@ -1823,6 +1894,103 @@ restart:
return invalidated;
}
+/*
+ * Invalidate given slot based on replication_slot_inactive_timeout GUC.
+ *
+ * Returns true if the slot has got invalidated.
+ *
+ * Whether the given slot needs to be invalidated depends on the cause:
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
+ *
+ * NB - this function also runs as part of checkpoint, so avoid raising errors
+ * if possible.
+ *
+ * Note that having a new function for RS_INVAL_INACTIVE_TIMEOUT cause instead
+ * of using InvalidateObsoleteReplicationSlots provides flexibility in calling
+ * it at slot-level opportunistically, and choosing to persist the slot info to
+ * disk.
+ */
+bool
+InvalidateInactiveReplicationSlot(ReplicationSlot *slot, bool persist_state)
+{
+ if (!InvalidatePossiblyInactiveSlot(slot))
+ return false;
+
+ /* Make sure the invalidated state persists across server restart */
+ MarkGivenReplicationSlotDirty(slot);
+
+ if (persist_state)
+ SaveGivenReplicationSlot(slot, ERROR);
+
+ ReportSlotInvalidation(RS_INVAL_INACTIVE_TIMEOUT, false, 0,
+ slot->data.name, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, InvalidTransactionId);
+
+ return true;
+}
+
+/*
+ * Helper for InvalidateInactiveReplicationSlot
+ */
+static bool
+InvalidatePossiblyInactiveSlot(ReplicationSlot *slot)
+{
+ ReplicationSlotInvalidationCause inavidation_cause = RS_INVAL_NONE;
+
+ /*
+ * Note that we don't invalidate slot on the standby that's currently
+ * being synced from the primary, because such slots are typically
+ * considered not active as they don't actually perform logical decoding.
+ */
+ if (RecoveryInProgress() && slot->data.synced)
+ return false;
+
+ if (replication_slot_inactive_timeout == 0)
+ return false;
+
+ if (slot->inactive_since > 0)
+ {
+ TimestampTz now;
+
+ /*
+ * Do not invalidate the slots which are currently being synced from
+ * the primary to the standby.
+ */
+ if (RecoveryInProgress() && slot->data.synced)
+ return false;
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+
+ SpinLockAcquire(&slot->mutex);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC. We do this with the spinlock
+ * held to avoid race conditions -- for example the inactive_since
+ * could change, or the slot could be dropped.
+ */
+ if (TimestampDifferenceExceeds(slot->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ inavidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
+ slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT;
+ }
+
+ SpinLockRelease(&slot->mutex);
+ LWLockRelease(ReplicationSlotControlLock);
+
+ return (inavidation_cause == RS_INVAL_INACTIVE_TIMEOUT);
+ }
+
+ return false;
+}
+
/*
* Flush all replication slots to disk.
*
@@ -1835,6 +2003,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated = false;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -1858,6 +2027,13 @@ CheckPointReplicationSlots(bool is_shutdown)
/* save the slot to disk, locking is handled in SaveSlotToPath() */
sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
+ /*
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
+ */
+ if (InvalidateInactiveReplicationSlot(s, false))
+ invalidated = true;
+
/*
* Slot's data is not flushed each time the confirmed_flush LSN is
* updated as that could lead to frequent writes. However, we decide
@@ -1884,6 +2060,13 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ /* If the slot has been invalidated, recalculate the resource limits */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index dd6c1d5a7e..9ad3e55704 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -539,7 +539,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bc40c454de..96eeb8b7d2 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -846,7 +846,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1459,7 +1459,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index c12784cbec..4149ff1ffe 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2971,6 +2971,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index baecde2841..2e1ad2eaca 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -335,6 +335,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7b937d1a0c..f0ac324ce9 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -230,6 +232,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -245,7 +248,8 @@ extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, bool failover);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_timeout_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(void);
extern void ReplicationSlotSave(void);
@@ -264,6 +268,8 @@ extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
+extern bool InvalidateInactiveReplicationSlot(ReplicationSlot *slot,
+ bool persist_state);
extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock);
extern int ReplicationSlotIndex(ReplicationSlot *slot);
extern bool ReplicationSlotName(int index, Name name);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 712924c2fa..0437ab5c46 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_wal_replay_wait.pl',
+ 't/044_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_slots.pl b/src/test/recovery/t/044_invalidate_slots.pl
new file mode 100644
index 0000000000..6f1cccea55
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_slots.pl
@@ -0,0 +1,274 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot as well as logical
+# failover slot on primary due to replication_slot_inactive_timeout. Also,
+# check the logical failover slot synced on to the standby doesn't invalidate
+# the slot on its own, but gets the invalidated state from the remote slot on
+# the primary.
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoint during the test, otherwise, the test can get unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr_1 = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb1_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('lsub1_sync_slot', 'test_decoding', false, false, true);}
+);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot');
+]);
+
+$standby1->start;
+
+my $standby1_logstart = -s $standby1->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Synchronize the primary server slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced'.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot has synced as true on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$primary->reload;
+
+# Wait for the logical failover slot to become inactive on the primary. Note
+# that nobody has acquired that slot yet, so due to
+# replication_slot_inactive_timeout setting above it must get invalidated.
+wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart);
+
+# Set timeout on the standby also to check the synced slots don't get
+# invalidated due to timeout on the standby.
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$standby1->reload;
+
+# Now, sync the logical failover slot from the remote slot on the primary.
+# Note that the remote slot has already been invalidated due to inactive
+# timeout. Now, the standby must also see it as invalidated.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Wait for the inactive replication slot to be invalidated.
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for replication slot lsub1_sync_slot invalidation to be synced on standby";
+
+# Synced slot mustn't get invalidated on the standby, it must sync invalidation
+# from the primary. So, we must not see the slot's invalidation message in server
+# log.
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"lsub1_sync_slot\"",
+ $standby1_logstart),
+ 'check that synced slot has not been invalidated on the standby');
+
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive
+wait_for_slot_invalidation($primary, 'sb1_slot', $logstart);
+
+# Testcase end: Invalidate streaming standby's slot as well as logical failover
+# slot on primary due to replication_slot_inactive_timeout. Also, check the
+# logical failover slot synced on to the standby doesn't invalidate the slot on
+# its own, but gets the invalidated state from the remote slot on the primary.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+
+my $publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$subscriber->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart);
+
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+# =============================================================================
+
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $name = $node->name;
+
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for replication slot to become inactive";
+
+ # Wait for the replication slot info to be updated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for info of replication slot $slot_name to be updated on node $name";
+
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+
+ # Wait for the inactive replication slot to be invalidated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for inactive replication slot $slot_name to be invalidated on node $name";
+
+ # Check that the invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot_name', '0/1');
+ ]);
+
+ ok( $stderr =~
+ /can no longer get changes from replication slot "$slot_name"/,
+ "detected error upon trying to acquire invalidated slot $slot_name on node $name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot_name";
+}
+
+# Check for invalidation of slot in server log
+sub check_for_slot_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"",
+ $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot_name invalidation has been logged");
+}
+
+done_testing();
--
2.34.1
Hi,
On Fri, Apr 05, 2024 at 11:21:43AM +0530, Bharath Rupireddy wrote:
On Wed, Apr 3, 2024 at 9:57 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Please find the attached v36 patch.
Thanks!
A few comments:
1 ===
+ <para>
+ The timeout is measured from the time since the slot has become
+ inactive (known from its
+ <structfield>inactive_since</structfield> value) until it gets
+ used (i.e., its <structfield>active</structfield> is set to true).
+ </para>
That's right except when it's invalidated during the checkpoint (as the slot
is not acquired in CheckPointReplicationSlots()).
So, what about adding: "or a checkpoint occurs"? That would also explain that
the invalidation could occur during checkpoint.
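To illustrate from the user's side (a rough sketch; inactive_since and
invalidation_reason are the pg_replication_slots columns added by this patch
series, and the slot name is just the one used in the TAP test):

-- the timeout counts from the moment the slot went inactive
SELECT slot_name, active, inactive_since, invalidation_reason
FROM pg_replication_slots WHERE slot_name = 'lsub1_sync_slot';

-- once replication_slot_inactive_timeout has elapsed, either acquiring the
-- slot or a checkpoint performs the invalidation check
CHECKPOINT;
SELECT slot_name, invalidation_reason
FROM pg_replication_slots WHERE slot_name = 'lsub1_sync_slot';  -- inactive_timeout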
2 ===
+ /* If the slot has been invalidated, recalculate the resource limits */
+ if (invalidated)
+ {
/If the slot/If a slot/?
3 ===
+ * NB - this function also runs as part of checkpoint, so avoid raising errors
s/NB - this/NB: This function/? (that looks more consistent with other comments
in the code)
4 ===
+ * Note that having a new function for RS_INVAL_INACTIVE_TIMEOUT cause instead
I understand it's "the RS_INVAL_INACTIVE_TIMEOUT cause" but reading "cause instead"
looks weird to me. Maybe it would make sense to reword this a bit.
5 ===
+ * considered not active as they don't actually perform logical decoding.
Not sure that's 100% accurate as we switched in fast forward logical
in 2ec005b4e2.
"as they perform only fast forward logical decoding (or not at all)", maybe?
6 ===
+ if (RecoveryInProgress() && slot->data.synced)
+ return false;
+
+ if (replication_slot_inactive_timeout == 0)
+ return false;
What about just using one if? It's more a matter of taste but it also probably
reduces the object file size a bit for non optimized build.
7 ===
+ /*
+ * Do not invalidate the slots which are currently being synced from
+ * the primary to the standby.
+ */
+ if (RecoveryInProgress() && slot->data.synced)
+ return false;
I think we don't need this check as the exact same one is done just before.
8 ===
+sub check_for_slot_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
Wouldn't it be better to wait for the replication_slot_inactive_timeout time beforehand,
instead of triggering all those checkpoints? (It could be passed as an extra arg
to wait_for_slot_invalidation().)
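Roughly, in SQL terms (a sketch assuming the 2s timeout used in the test), the
idea would be:

-- let replication_slot_inactive_timeout (2s here) elapse first, so a single
-- checkpoint is enough to invalidate the slot
SELECT pg_sleep(3);
CHECKPOINT;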
9 ===
# Synced slot mustn't get invalidated on the standby, it must sync invalidation
# from the primary. So, we must not see the slot's invalidation message in server
# log.
ok( !$standby1->log_contains(
"invalidating obsolete replication slot \"lsub1_sync_slot\"",
$standby1_logstart),
'check that synced slot has not been invalidated on the standby');
Would it make sense to trigger a checkpoint on the standby before this test?
I mean, I think that without a checkpoint on the standby we should not see the
invalidation in the log anyway.
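Something like this on the standby, as a sketch (slot and column names as used
in the test):

-- force the invalidation check to actually run on the standby ...
CHECKPOINT;
-- ... and confirm the synced slot only carries the invalidation state that
-- was obtained from the primary
SELECT slot_name, synced, invalidation_reason
FROM pg_replication_slots WHERE slot_name = 'lsub1_sync_slot';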
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Apr 5, 2024 at 1:14 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Please find the attached v36 patch.
A few comments:
1 ===
+ <para>
+ The timeout is measured from the time since the slot has become
+ inactive (known from its
+ <structfield>inactive_since</structfield> value) until it gets
+ used (i.e., its <structfield>active</structfield> is set to true).
+ </para>
That's right except when it's invalidated during the checkpoint (as the slot
is not acquired in CheckPointReplicationSlots()).
So, what about adding: "or a checkpoint occurs"? That would also explain that
the invalidation could occur during checkpoint.
Reworded.
2 ===
+ /* If the slot has been invalidated, recalculate the resource limits */
+ if (invalidated)
+ {
/If the slot/If a slot/?
Modified it to be like elsewhere.
3 ===
+ * NB - this function also runs as part of checkpoint, so avoid raising errors
s/NB - this/NB: This function/? (that looks more consistent with other comments
in the code)
Done.
4 ===
+ * Note that having a new function for RS_INVAL_INACTIVE_TIMEOUT cause instead
I understand it's "the RS_INVAL_INACTIVE_TIMEOUT cause" but reading "cause instead"
looks weird to me. Maybe it would make sense to reword this a bit.
Reworded.
5 ===
+ * considered not active as they don't actually perform logical decoding.
Not sure that's 100% accurate as we switched in fast forward logical
in 2ec005b4e2.
"as they perform only fast forward logical decoding (or not at all)", maybe?
Changed it to "as they don't perform logical decoding to produce the
changes". In fast_forward mode no changes are produced.
6 ===
+ if (RecoveryInProgress() && slot->data.synced)
+ return false;
+
+ if (replication_slot_inactive_timeout == 0)
+ return false;
What about just using one if? It's more a matter of taste but it also probably
reduces the object file size a bit for non optimized build.
Changed.
7 ===
+ /*
+ * Do not invalidate the slots which are currently being synced from
+ * the primary to the standby.
+ */
+ if (RecoveryInProgress() && slot->data.synced)
+ return false;
I think we don't need this check as the exact same one is done just before.
Right. Removed.
8 ===
+sub check_for_slot_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
Wouldn't it be better to wait for the replication_slot_inactive_timeout time beforehand,
instead of triggering all those checkpoints? (It could be passed as an extra arg
to wait_for_slot_invalidation().)
Done.
9 ===
# Synced slot mustn't get invalidated on the standby, it must sync invalidation
# from the primary. So, we must not see the slot's invalidation message in server
# log.
ok( !$standby1->log_contains(
"invalidating obsolete replication slot \"lsub1_sync_slot\"",
$standby1_logstart),
'check that synced slot has not been invalidated on the standby');
Would it make sense to trigger a checkpoint on the standby before this test?
I mean, I think that without a checkpoint on the standby we should not see the
invalidation in the log anyway.
Done.
Please find the attached v37 patch for further review.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v37-0001-Add-inactive_timeout-based-replication-slot-inva.patch (application/octet-stream)
From 648d6f55d4b1593d0f09ab3ee8dcc91d57bf4961 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 6 Apr 2024 06:22:28 +0000
Subject: [PATCH v37] Add inactive_timeout based replication slot invalidation.
Until now, postgres has had the ability to invalidate inactive
replication slots based on the amount of WAL (set via the
max_slot_wal_keep_size GUC) that would be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set a timeout of, say,
1, 2, or 3 days at the slot level, after which inactive slots get
invalidated.
To achieve the above, postgres introduces a GUC allowing users to
set an inactive timeout. Replication slots that are inactive for
longer than the specified amount of time get invalidated.
The invalidation check happens at various locations so that the
invalidation is detected as promptly as possible; these locations
include the following:
- Whenever the slot is acquired; the slot acquisition errors
out if the slot has been invalidated.
- During checkpoint
Note that this new invalidation mechanism won't kick in for slots
that are currently being synced from the primary to the standby.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/config.sgml | 33 ++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 11 +-
src/backend/replication/slot.c | 211 ++++++++++++-
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 8 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/044_invalidate_slots.pl | 283 ++++++++++++++++++
13 files changed, 553 insertions(+), 24 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 624518e0b0..42f7b15aa7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4547,6 +4547,39 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidates replication slots that are inactive for longer than the
+ specified amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is the default) disables
+ the timeout mechanism. This parameter can only be set in
+ the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+
+ <para>
+ This invalidation check happens either when the slot is acquired
+ for use or during a checkpoint. The time since the slot became
+ inactive is known from its
+ <structfield>inactive_since</structfield> value, from which the
+ timeout is measured.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>).
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 7ed617170f..063638beda 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2580,6 +2580,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for the duration specified by
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index d18e2c7342..e92f559539 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -362,7 +362,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -575,6 +575,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
" name slot \"%s\" already exists on the standby",
remote_slot->name));
+ /*
+ * Skip the sync if the local slot is already invalidated. We do this
+ * beforehand to avoid slot acquire and release.
+ */
+ if (slot->data.invalidated != RS_INVAL_NONE)
+ return false;
+
/*
* The slot has been synchronized before.
*
@@ -591,7 +598,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 3bddaae022..4a43b29d89 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -158,6 +160,7 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidatePossiblyInactiveSlot(ReplicationSlot *slot);
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
@@ -535,9 +538,14 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * If check_for_timeout_invalidation is true, the slot is checked for
+ * invalidation based on replication_slot_inactive_timeout GUC, and an error is
+ * raised after making the slot ours.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_timeout_invalidation)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +623,34 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * Check if the given slot can be invalidated based on its inactive
+ * timeout. If yes, persist the invalidated state to disk and then error
+ * out. We do this only after making the slot ours to avoid anyone else
+ * acquiring it while we check for its invalidation.
+ */
+ if (check_for_timeout_invalidation)
+ {
+ /* The slot is ours by now */
+ Assert(s->active_pid == MyProcPid);
+
+ if (InvalidateInactiveReplicationSlot(s, true))
+ {
+ /*
+ * If the slot has been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(MyReplicationSlot->data.name)),
+ errdetail("This slot has been invalidated because it was inactive for more than replication_slot_inactive_timeout.")));
+ }
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -781,7 +817,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -804,7 +840,7 @@ ReplicationSlotAlter(const char *name, bool failover)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -980,6 +1016,20 @@ ReplicationSlotDropPtr(ReplicationSlot *slot)
LWLockRelease(ReplicationSlotAllocationLock);
}
+/*
+ * Helper for ReplicationSlotSave
+ */
+static inline void
+SaveGivenReplicationSlot(ReplicationSlot *slot, int elevel)
+{
+ char path[MAXPGPATH];
+
+ Assert(slot != NULL);
+
+ sprintf(path, "pg_replslot/%s", NameStr(slot->data.name));
+ SaveSlotToPath(slot, path, elevel);
+}
+
/*
* Serialize the currently acquired slot's state from memory to disk, thereby
* guaranteeing the current state will survive a crash.
@@ -987,12 +1037,21 @@ ReplicationSlotDropPtr(ReplicationSlot *slot)
void
ReplicationSlotSave(void)
{
- char path[MAXPGPATH];
+ SaveGivenReplicationSlot(MyReplicationSlot, ERROR);
+}
- Assert(MyReplicationSlot != NULL);
+/*
+ * Helper for ReplicationSlotMarkDirty
+ */
+static inline void
+MarkGivenReplicationSlotDirty(ReplicationSlot *slot)
+{
+ Assert(slot != NULL);
- sprintf(path, "pg_replslot/%s", NameStr(MyReplicationSlot->data.name));
- SaveSlotToPath(MyReplicationSlot, path, ERROR);
+ SpinLockAcquire(&slot->mutex);
+ slot->just_dirtied = true;
+ slot->dirty = true;
+ SpinLockRelease(&slot->mutex);
}
/*
@@ -1005,14 +1064,7 @@ ReplicationSlotSave(void)
void
ReplicationSlotMarkDirty(void)
{
- ReplicationSlot *slot = MyReplicationSlot;
-
- Assert(MyReplicationSlot != NULL);
-
- SpinLockAcquire(&slot->mutex);
- MyReplicationSlot->just_dirtied = true;
- MyReplicationSlot->dirty = true;
- SpinLockRelease(&slot->mutex);
+ MarkGivenReplicationSlotDirty(MyReplicationSlot);
}
/*
@@ -1506,6 +1558,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than replication_slot_inactive_timeout."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1550,6 +1605,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ /*
+ * Use InvalidateInactiveReplicationSlot for inactive timeout based
+ * invalidation.
+ */
+ Assert(cause != RS_INVAL_INACTIVE_TIMEOUT);
+
for (;;)
{
XLogRecPtr restart_lsn;
@@ -1619,6 +1680,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ /* not reachable */
+ Assert(false);
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1787,6 +1852,12 @@ InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
Assert(cause != RS_INVAL_WAL_REMOVED || oldestSegno > 0);
Assert(cause != RS_INVAL_NONE);
+ /*
+ * Use InvalidateInactiveReplicationSlot for inactive timeout based
+ * invalidation.
+ */
+ Assert(cause != RS_INVAL_INACTIVE_TIMEOUT);
+
if (max_replication_slots == 0)
return invalidated;
@@ -1823,6 +1894,97 @@ restart:
return invalidated;
}
+/*
+ * Invalidate given slot based on replication_slot_inactive_timeout GUC.
+ *
+ * Returns true if the slot has got invalidated.
+ *
+ * Whether the given slot needs to be invalidated depends on the cause:
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout occurs
+ *
+ * NB: This function also runs as part of checkpoint, so avoid raising errors
+ * if possible.
+ *
+ * Note that having a new function for inactive timeout invalidation mechanism
+ * instead of using InvalidateObsoleteReplicationSlots provides flexibility in
+ * calling it at slot-level opportunistically, and choosing whether or not to
+ * persist the slot info to disk.
+ */
+bool
+InvalidateInactiveReplicationSlot(ReplicationSlot *slot, bool persist_state)
+{
+ if (!InvalidatePossiblyInactiveSlot(slot))
+ return false;
+
+ /* Make sure the invalidated state persists across server restart */
+ MarkGivenReplicationSlotDirty(slot);
+
+ if (persist_state)
+ SaveGivenReplicationSlot(slot, ERROR);
+
+ ReportSlotInvalidation(RS_INVAL_INACTIVE_TIMEOUT, false, 0,
+ slot->data.name, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, InvalidTransactionId);
+
+ return true;
+}
+
+/*
+ * Helper for InvalidateInactiveReplicationSlot
+ */
+static bool
+InvalidatePossiblyInactiveSlot(ReplicationSlot *slot)
+{
+ ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+
+ /*
+ * Quick exit if inactive timeout invalidation mechanism is disabled or
+ * the slot on standby is currently being synced from the primary.
+ *
+ * Note that we don't invalidate synced slots because they are typically
+ * considered not active, as they don't perform logical decoding to produce
+ * the changes.
+ */
+ if (replication_slot_inactive_timeout == 0 ||
+ (RecoveryInProgress() && slot->data.synced))
+ return false;
+
+ if (slot->inactive_since > 0)
+ {
+ TimestampTz now;
+
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+
+ SpinLockAcquire(&slot->mutex);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC. We do this with the spinlock
+ * held to avoid race conditions -- for example the inactive_since
+ * could change, or the slot could be dropped.
+ */
+ if (TimestampDifferenceExceeds(slot->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
+ slot->data.invalidated = RS_INVAL_INACTIVE_TIMEOUT;
+ }
+
+ SpinLockRelease(&slot->mutex);
+ LWLockRelease(ReplicationSlotControlLock);
+
+ return (invalidation_cause == RS_INVAL_INACTIVE_TIMEOUT);
+ }
+
+ return false;
+}
+
/*
* Flush all replication slots to disk.
*
@@ -1835,6 +1997,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated = false;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -1858,6 +2021,13 @@ CheckPointReplicationSlots(bool is_shutdown)
/* save the slot to disk, locking is handled in SaveSlotToPath() */
sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
+ /*
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
+ */
+ if (InvalidateInactiveReplicationSlot(s, false))
+ invalidated = true;
+
/*
* Slot's data is not flushed each time the confirmed_flush LSN is
* updated as that could lead to frequent writes. However, we decide
@@ -1884,6 +2054,15 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ /*
+ * If any slots have been invalidated, recalculate the resource limits.
+ */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index dd6c1d5a7e..9ad3e55704 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -539,7 +539,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bc40c454de..96eeb8b7d2 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -846,7 +846,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1459,7 +1459,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index c12784cbec..4149ff1ffe 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2971,6 +2971,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index baecde2841..2e1ad2eaca 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -335,6 +335,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7b937d1a0c..f0ac324ce9 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -230,6 +232,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -245,7 +248,8 @@ extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, bool failover);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_timeout_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(void);
extern void ReplicationSlotSave(void);
@@ -264,6 +268,8 @@ extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
+extern bool InvalidateInactiveReplicationSlot(ReplicationSlot *slot,
+ bool persist_state);
extern ReplicationSlot *SearchNamedReplicationSlot(const char *name, bool need_lock);
extern int ReplicationSlotIndex(ReplicationSlot *slot);
extern bool ReplicationSlotName(int index, Name name);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 712924c2fa..0437ab5c46 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_wal_replay_wait.pl',
+ 't/044_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_slots.pl b/src/test/recovery/t/044_invalidate_slots.pl
new file mode 100644
index 0000000000..1e30ea05ef
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_slots.pl
@@ -0,0 +1,283 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot as well as logical
+# failover slot on primary due to replication_slot_inactive_timeout. Also,
+# check the logical failover slot synced on to the standby doesn't invalidate
+# the slot on its own, but gets the invalidated state from the remote slot on
+# the primary.
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoint during the test, otherwise, the test can get unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr_1 = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb1_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('lsub1_sync_slot', 'test_decoding', false, false, true);}
+);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot');
+]);
+
+$standby1->start;
+
+my $standby1_logstart = -s $standby1->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Synchronize the primary server slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced'.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot has synced as true on standby');
+
+my $logstart = -s $primary->logfile;
+my $inactive_timeout = 2;
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$primary->reload;
+
+# Wait for the logical failover slot to become inactive on the primary. Note
+# that nobody has acquired that slot yet, so due to
+# replication_slot_inactive_timeout setting above it must get invalidated.
+wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart,
+ $inactive_timeout);
+
+# Set timeout on the standby also to check the synced slots don't get
+# invalidated due to timeout on the standby.
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$standby1->reload;
+
+# Now, sync the logical failover slot from the remote slot on the primary.
+# Note that the remote slot has already been invalidated due to inactive
+# timeout. Now, the standby must also see it as invalidated.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Wait for the inactive replication slot to be invalidated.
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for replication slot lsub1_sync_slot invalidation to be synced on standby";
+
+# Synced slot mustn't get invalidated on the standby even after a checkpoint,
+# it must sync invalidation from the primary. So, we must not see the slot's
+# invalidation message in server log.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"lsub1_sync_slot\"",
+ $standby1_logstart),
+ 'check that synced slot has not been invalidated on the standby');
+
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive
+wait_for_slot_invalidation($primary, 'sb1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end: Invalidate streaming standby's slot as well as logical failover
+# slot on primary due to replication_slot_inactive_timeout. Also, check the
+# logical failover slot synced on to the standby doesn't invalidate the slot on
+# its own, but gets the invalidated state from the remote slot on the primary.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+
+my $publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$subscriber->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+# =============================================================================
+
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset, $inactive_timeout) = @_;
+ my $name = $node->name;
+
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for replication slot to become inactive";
+
+ # Wait for the replication slot info to be updated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for info of replication slot $slot_name to be updated on node $name";
+
+ # Sleep for at least $inactive_timeout so that the slot can be invalidated
+ # without needing multiple checkpoints.
+ sleep($inactive_timeout);
+
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+
+ # Wait for the inactive replication slot to be invalidated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for inactive replication slot $slot_name to be invalidated on node $name";
+
+ # Check that the invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot_name', '0/1');
+ ]);
+
+ ok( $stderr =~
+ /can no longer get changes from replication slot "$slot_name"/,
+ "detected error upon trying to acquire invalidated slot $slot_name on node $name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot_name";
+}
+
+# Check for invalidation of slot in server log
+sub check_for_slot_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"",
+ $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot_name invalidation has been logged");
+}
+
+done_testing();
--
2.34.1
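For anyone wanting to try the v37 behaviour quickly, a minimal usage sketch
(GUC and view columns as introduced above; the slot name and timings are just
examples):

-- enable timeout-based invalidation (sighup GUC)
ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
SELECT pg_reload_conf();

-- create a logical slot and leave it unused past the timeout
SELECT pg_create_logical_replication_slot('demo_slot', 'test_decoding');
SELECT pg_sleep(3);

-- a checkpoint (or an attempt to acquire the slot) performs the check
CHECKPOINT;

-- the slot is marked invalidated rather than dropped
SELECT slot_name, inactive_since, invalidation_reason
FROM pg_replication_slots WHERE slot_name = 'demo_slot';  -- inactive_timeout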
On Sat, Apr 6, 2024 at 11:55 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Why is the handling w.r.t. active_pid in InvalidatePossiblyInactiveSlot()
not similar to InvalidatePossiblyObsoleteSlot()? Won't we need to
ensure that there is no other active slot user? Is it sufficient to
check inactive_since for that? If so, we need some comments to
explain it.
Can we avoid introducing the new functions like
SaveGivenReplicationSlot() and MarkGivenReplicationSlotDirty(), if we
do the required work in the caller?
--
With Regards,
Amit Kapila.
On Sat, Apr 6, 2024 at 12:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Why is the handling w.r.t. active_pid in InvalidatePossiblyInactiveSlot()
not similar to InvalidatePossiblyObsoleteSlot()? Won't we need to
ensure that there is no other active slot user? Is it sufficient to
check inactive_since for that? If so, we need some comments to
explain it.
I removed the separate functions and, with minimal changes, I've now
placed the RS_INVAL_INACTIVE_TIMEOUT logic into
InvalidatePossiblyObsoleteSlot and used it in
CheckPointReplicationSlots as well.
Can we avoid introducing the new functions like
SaveGivenReplicationSlot() and MarkGivenReplicationSlotDirty(), if we
do the required work in the caller?
Hm. Removed them now.
Please see the attached v38 patch.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v38-0001-Add-inactive_timeout-based-replication-slot-inva.patch (application/octet-stream)
From a2c66c711da7118eae48446c87e5595f85c8e45a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 6 Apr 2024 11:13:02 +0000
Subject: [PATCH v38] Add inactive_timeout based replication slot invalidation.
Until now, postgres has had the ability to invalidate inactive
replication slots based on the amount of WAL (set via the
max_slot_wal_keep_size GUC) that would be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set a timeout of, say,
1, 2, or 3 days at the slot level, after which inactive slots get
invalidated.
To achieve the above, postgres introduces a GUC allowing users to
set an inactive timeout. Replication slots that are inactive for
longer than the specified amount of time get invalidated.
The invalidation check happens at various locations so that the
invalidation is detected as promptly as possible; these locations
include the following:
- Whenever the slot is acquired; the slot acquisition errors
out if the slot has been invalidated.
- During checkpoint
Note that this new invalidation mechanism won't kick in for slots
that are currently being synced from the primary to the standby.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
---
doc/src/sgml/config.sgml | 33 ++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 11 +-
src/backend/replication/slot.c | 174 ++++++++++-
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 6 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/044_invalidate_slots.pl | 283 ++++++++++++++++++
13 files changed, 523 insertions(+), 15 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 624518e0b0..42f7b15aa7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4547,6 +4547,39 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidates replication slots that are inactive for longer than the
+ specified amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is the default) disables
+ the timeout mechanism. This parameter can only be set in
+ the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+
+ <para>
+ This invalidation check happens either when the slot is acquired
+ for use or during a checkpoint. The time since the slot became
+ inactive is known from its
+ <structfield>inactive_since</structfield> value, from which the
+ timeout is measured.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>).
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 7ed617170f..063638beda 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2580,6 +2580,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for the duration specified by
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index d18e2c7342..e92f559539 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -362,7 +362,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -575,6 +575,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
" name slot \"%s\" already exists on the standby",
remote_slot->name));
+ /*
+ * Skip the sync if the local slot is already invalidated. We do this
+ * beforehand to avoid slot acquire and release.
+ */
+ if (slot->data.invalidated != RS_INVAL_NONE)
+ return false;
+
/*
* The slot has been synchronized before.
*
@@ -591,7 +598,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 3bddaae022..767d818706 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -159,6 +161,14 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
+ ReplicationSlot *s,
+ XLogRecPtr oldestLSN,
+ Oid dboid,
+ TransactionId snapshotConflictHorizon,
+ bool need_lock,
+ bool *invalidated);
+
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
@@ -535,12 +545,18 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * If check_for_timeout_invalidation is true, the slot is checked for
+ * invalidation based on replication_slot_inactive_timeout GUC, and an error is
+ * raised after making the slot ours.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_timeout_invalidation)
{
ReplicationSlot *s;
int active_pid;
+ bool released_lock = false;
Assert(name != NULL);
@@ -583,7 +599,60 @@ retry:
}
else
active_pid = MyProcPid;
- LWLockRelease(ReplicationSlotControlLock);
+
+ /*
+ * When the slot is not active in another process, check if the given
+ * slot is to be invalidated based on inactive timeout before we acquire
+ * it.
+ */
+ if (active_pid == MyProcPid &&
+ check_for_timeout_invalidation)
+ {
+ bool invalidated = false;
+
+ released_lock = InvalidatePossiblyObsoleteSlot(RS_INVAL_INACTIVE_TIMEOUT,
+ s, 0, InvalidOid,
+ InvalidTransactionId,
+ false,
+ &invalidated);
+
+ /*
+ * If the slot has been invalidated, recalculate the resource limits.
+ */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
+
+ /*
+ * If the slot has been invalidated now or previously, error out as
+ * there's no point in acquiring the slot.
+ *
+ * XXX: We might need to check for all invalidations and error out
+ * here.
+ */
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ /*
+ * Release the lock if it hasn't been released yet, and make this
+ * slot ours to keep the cleanup path on error happy.
+ */
+ if (!released_lock)
+ LWLockRelease(ReplicationSlotControlLock);
+
+ MyReplicationSlot = s;
+
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive for more than replication_slot_inactive_timeout.")));
+ }
+ }
+
+ if (!released_lock)
+ LWLockRelease(ReplicationSlotControlLock);
/*
* If we found the slot but it's already active in another process, we
@@ -781,7 +850,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -804,7 +873,7 @@ ReplicationSlotAlter(const char *name, bool failover)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1506,6 +1575,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfoString(&err_detail, _("The slot has been inactive for more than replication_slot_inactive_timeout."));
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1540,7 +1612,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReplicationSlot *s,
XLogRecPtr oldestLSN,
Oid dboid, TransactionId snapshotConflictHorizon,
- bool *invalidated)
+ bool need_lock, bool *invalidated)
{
int last_signaled_pid = 0;
bool released_lock = false;
@@ -1550,12 +1622,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ if (need_lock)
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
for (;;)
{
XLogRecPtr restart_lsn;
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1566,6 +1642,18 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT &&
+ (replication_slot_inactive_timeout > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced)))
+ {
+ /*
+ * We get the current time beforehand to avoid a system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1619,6 +1707,41 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ {
+ /*
+ * Quick exit if inactive timeout invalidation
+ * mechanism is disabled or slot is currently being
+ * used or the slot on standby is currently being
+ * synced from the primary.
+ *
+ * Note that we don't invalidate synced slots because
+ * they are typically considered not active, as they
+ * don't perform logical decoding to produce the
+ * changes.
+ */
+ if (replication_slot_inactive_timeout == 0 ||
+ s->inactive_since == 0 ||
+ (RecoveryInProgress() && s->data.synced))
+ break;
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = cause;
+
+ /*
+ * Invalidation due to inactive timeout implies that
+ * no one is using the slot.
+ */
+ Assert(s->active_pid == 0);
+ }
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1756,6 +1879,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
}
}
+ /*
+ * Release the lock if we acquired it at the beginning and have not yet
+ * released it.
+ */
+ if (need_lock && !released_lock)
+ {
+ LWLockRelease(ReplicationSlotControlLock);
+ released_lock = true;
+ }
+
Assert(released_lock == !LWLockHeldByMe(ReplicationSlotControlLock));
return released_lock;
@@ -1772,6 +1905,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1803,7 +1937,7 @@ restart:
if (InvalidatePossiblyObsoleteSlot(cause, s, oldestLSN, dboid,
snapshotConflictHorizon,
- &invalidated))
+ false, &invalidated))
{
/* if the lock was released, start from scratch */
goto restart;
@@ -1835,6 +1969,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated = false;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -1851,10 +1986,26 @@ CheckPointReplicationSlots(bool is_shutdown)
{
ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
char path[MAXPGPATH];
+ bool released_lock PG_USED_FOR_ASSERTS_ONLY = false;
if (!s->in_use)
continue;
+ /*
+ * Here's an opportunity to invalidate inactive replication slots
+ * based on timeout, so let's do it.
+ */
+ released_lock = InvalidatePossiblyObsoleteSlot(RS_INVAL_INACTIVE_TIMEOUT, s,
+ 0, InvalidOid,
+ InvalidTransactionId,
+ true, &invalidated);
+
+ /*
+ * InvalidatePossiblyObsoleteSlot must have released the lock, since
+ * we explicitly asked it to acquire the lock itself.
+ */
+ Assert(released_lock);
+
/* save the slot to disk, locking is handled in SaveSlotToPath() */
sprintf(path, "pg_replslot/%s", NameStr(s->data.name));
@@ -1884,6 +2035,15 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ /*
+ * If any slots have been invalidated, recalculate the resource limits.
+ */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index dd6c1d5a7e..9ad3e55704 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -539,7 +539,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bc40c454de..96eeb8b7d2 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -846,7 +846,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1459,7 +1459,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index c12784cbec..4149ff1ffe 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2971,6 +2971,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index baecde2841..2e1ad2eaca 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -335,6 +335,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7b937d1a0c..3f3bdf7441 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -230,6 +232,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -245,7 +248,8 @@ extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, bool failover);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_timeout_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(void);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 712924c2fa..0437ab5c46 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_wal_replay_wait.pl',
+ 't/044_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_slots.pl b/src/test/recovery/t/044_invalidate_slots.pl
new file mode 100644
index 0000000000..7796d87963
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_slots.pl
@@ -0,0 +1,283 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot as well as logical
+# failover slot on primary due to replication_slot_inactive_timeout. Also,
+# check the logical failover slot synced on to the standby doesn't invalidate
+# the slot on its own, but gets the invalidated state from the remote slot on
+# the primary.
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoints during the test; otherwise, the test can become unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr_1 = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb1_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('lsub1_sync_slot', 'test_decoding', false, false, true);}
+);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot');
+]);
+
+$standby1->start;
+
+my $standby1_logstart = -s $standby1->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Synchronize the primary server slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced'.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot has synced as true on standby');
+
+my $logstart = -s $primary->logfile;
+my $inactive_timeout = 2;
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$primary->reload;
+
+# Wait for the logical failover slot to become inactive on the primary. Note
+# that nobody has acquired that slot yet, so due to
+# replication_slot_inactive_timeout setting above it must get invalidated.
+wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart,
+ $inactive_timeout);
+
+# Set timeout on the standby also to check the synced slots don't get
+# invalidated due to timeout on the standby.
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$standby1->reload;
+
+# Now, sync the logical failover slot from the remote slot on the primary.
+# Note that the remote slot has already been invalidated due to inactive
+# timeout. Now, the standby must also see it as invalidated.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Wait for the inactive replication slot to be invalidated.
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for replication slot lsub1_sync_slot invalidation to be synced on standby";
+
+# Synced slot mustn't get invalidated on the standby even after a checkpoint,
+# it must sync invalidation from the primary. So, we must not see the slot's
+# invalidation message in server log.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"lsub1_sync_slot\"",
+ $standby1_logstart),
+ 'check that synced slot has not been invalidated on the standby');
+
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive and then invalidated
+wait_for_slot_invalidation($primary, 'sb1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end: Invalidate streaming standby's slot as well as logical failover
+# slot on primary due to replication_slot_inactive_timeout. Also, check the
+# logical failover slot synced on to the standby doesn't invalidate the slot on
+# its own, but gets the invalidated state from the remote slot on the primary.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+
+my $publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$subscriber->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+# =============================================================================
+
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset, $inactive_timeout) = @_;
+ my $name = $node->name;
+
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for replication slot to become inactive";
+
+ # Wait for the replication slot info to be updated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for info of replication slot $slot_name to be updated on node $name";
+
+ # Sleep for at least $inactive_timeout so that the slot becomes eligible for
+ # invalidation, avoiding the need for multiple checkpoints.
+ sleep($inactive_timeout);
+
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+
+ # Wait for the inactive replication slot to be invalidated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for inactive replication slot $slot_name to be invalidated on node $name";
+
+ # Check that the invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot_name', '0/1');
+ ]);
+
+ ok( $stderr =~
+ /can no longer get changes from replication slot "$slot_name"/,
+ "detected error upon trying to acquire invalidated slot $slot_name on node $name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot_name";
+}
+
+# Check for invalidation of slot in server log
+sub check_for_slot_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"",
+ $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot_name invalidation has been logged");
+}
+
+done_testing();
--
2.34.1
On Sat, Apr 6, 2024 at 5:10 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> Please see the attached v38 patch.
Hi, thanks everyone for reviewing the design and patches so far. Here
are the v39 patches, implementing the inactive timeout based (0001)
and XID age based (0002) invalidation mechanisms.
I'm quoting the hackers who are okay with inactive timeout based
invalidation mechanism:
Bertrand Drouvot -
/messages/by-id/ZgL0N+xVJNkyqsKL@ip-10-97-1-34.eu-west-3.compute.internal
and /messages/by-id/ZgPHDAlM79iLtGIH@ip-10-97-1-34.eu-west-3.compute.internal
Amit Kapila - /messages/by-id/CAA4eK1L3awyzWMuymLJUm8SoFEQe=Da9KUwCcAfC31RNJ1xdJA@mail.gmail.com
Nathan Bossart -
/messages/by-id/20240325195443.GA2923888@nathanxps13
Robert Haas - /messages/by-id/CA+TgmoZTbaaEjSZUG1FL0mzxAdN3qmXksO3O9_PZhEuXTkVnRQ@mail.gmail.com
I'm quoting the hackers who are okay with XID age based invalidation mechanism:
Nathan Bossart -
/messages/by-id/20240326150918.GB3181099@nathanxps13
and /messages/by-id/20240327150557.GA3994937@nathanxps13
Alvaro Herrera -
/messages/by-id/202403261539.xcjfle7sksz7@alvherre.pgsql
Bertrand Drouvot -
/messages/by-id/ZgPHDAlM79iLtGIH@ip-10-97-1-34.eu-west-3.compute.internal
Amit Kapila - /messages/by-id/CAA4eK1L3awyzWMuymLJUm8SoFEQe=Da9KUwCcAfC31RNJ1xdJA@mail.gmail.com
There was a point raised by Robert
/messages/by-id/CA+TgmoaRECcnyqxAxUhP5dk2S4HX=pGh-p-PkA3uc+jG_9hiMw@mail.gmail.com
for XID age based invalidation. An issue related to
vacuum_defer_cleanup_age
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=be504a3e974d75be6f95c8f9b7367126034f2d12
led to the removal of the GUC
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1118cd37eb61e6a2428f457a8b2026a7bb3f801a.
The same issue should not happen for the XID age based invalidation,
because the XID age is calculated using TransactionId rather than
FullTransactionId, as the slot's xmin and catalog_xmin are tracked as
TransactionId.
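As a quick illustration of the quantity involved, the XID age that
replication_slot_xid_age would act on can already be eyeballed from SQL
today. This is only a monitoring sketch using the existing
pg_replication_slots columns and the age() function, nothing the patches
add:

    -- Show how old each slot's xmin/catalog_xmin currently are; these are
    -- the ages that replication_slot_xid_age compares against its setting.
    SELECT slot_name,
           xmin,
           catalog_xmin,
           age(xmin)         AS xmin_age,
           age(catalog_xmin) AS catalog_xmin_age
    FROM pg_replication_slots;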
There was a point raised by Amit
/messages/by-id/CAA4eK1K8wqLsMw6j0hE_SFoWAeo3Kw8UNnMfhsWaYDF1GWYQ+g@mail.gmail.com
on when to do the XID age based invalidation - whether in the
checkpointer, while vacuum is being run, whenever ComputeXIDHorizons
gets called, or in the autovacuum process. For now, the design performs
these new invalidation checks in two places - 1) whenever the slot is
acquired, with slot acquisition erroring out if the slot is invalidated,
and 2) during checkpoint. However, I'm open to suggestions on this.
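To make the expected usage concrete, here is a rough sketch against the
0001 patch (the GUC name and the inactive_since/invalidation_reason
columns are the ones these patches introduce; the '1d' value is just an
example):

    -- Invalidate slots that have been inactive for more than a day.
    ALTER SYSTEM SET replication_slot_inactive_timeout TO '1d';
    SELECT pg_reload_conf();

    -- After the next checkpoint, or the next attempt to acquire the slot,
    -- invalidated slots can be spotted like this:
    SELECT slot_name, active, inactive_since, invalidation_reason
    FROM pg_replication_slots
    WHERE invalidation_reason = 'inactive_timeout';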
I've also verified that the replication_slot_xid_age setting can help
when the server is inching towards XID wraparound. I've created a
primary and streaming standby setup with hot_standby_feedback set to
on (so that the slot gets an xmin). Then, I set
replication_slot_xid_age to 2 billion on the primary, and used the
xid_wraparound extension to reach XID wraparound on the primary. Once
I started receiving the WARNINGs about VACUUM, I issued a checkpoint,
after which the slot got invalidated, enabling VACUUM to freeze XIDs
and saving the database from the XID wraparound problem.
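Roughly, the verification looked like the following sketch (not the
exact commands I ran; it assumes the xid_wraparound test module and its
consume_xids() helper to burn through XIDs quickly):

    -- On the primary, with a standby connected using hot_standby_feedback = on
    -- so that its physical slot holds an xmin:
    ALTER SYSTEM SET replication_slot_xid_age TO 2000000000;
    SELECT pg_reload_conf();

    CREATE EXTENSION xid_wraparound;   -- assumed test module
    SELECT consume_xids(1000000000);   -- repeat until wraparound WARNINGs appear

    CHECKPOINT;      -- invalidates the slot once its xmin age exceeds the setting
    VACUUM FREEZE;   -- can now advance the frozen XID horizon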
Thanks a lot Masahiko Sawada for an offlist chat about the XID age
calculation logic.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v39-0001-Add-inactive_timeout-based-replication-slot-inva.patchapplication/x-patch; name=v39-0001-Add-inactive_timeout-based-replication-slot-inva.patchDownload
From f3ba2562ba7d9c4f13e283740260025b8d1c9b0f Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 12 Apr 2024 14:52:35 +0000
Subject: [PATCH v39 1/2] Add inactive_timeout based replication slot
invalidation.
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage can vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set a timeout of, say, 1
or 2 or 3 days, after which the inactive slots get invalidated.
To achieve the above, this commit introduces a GUC allowing users
to set an inactive timeout. Replication slots that are inactive
for longer than the specified amount of time get invalidated.
The invalidation check happens in multiple places so that the slot
is invalidated as promptly as possible; these places include:
- Whenever the slot is acquired (slot acquisition errors out if the
slot is invalidated).
- During checkpoint.
Note that this new invalidation mechanism won't kick in for the
slots that are currently being synced from the primary to the
standby, because such synced slots are typically considered not
active (and hence never inactive in this sense), as they don't
perform logical decoding to produce changes.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CA%2BTgmoZTbaaEjSZUG1FL0mzxAdN3qmXksO3O9_PZhEuXTkVnRQ%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/202403260841.5jcv7ihniccy%40alvherre.pgsql
---
doc/src/sgml/config.sgml | 33 ++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 11 +-
src/backend/replication/slot.c | 188 +++++++++++-
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 6 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 286 ++++++++++++++++++
13 files changed, 535 insertions(+), 20 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d8e1282e12..a73677b98b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4547,6 +4547,39 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidates replication slots that are inactive for longer than the
+ specified amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is the default) disables
+ the timeout mechanism. This parameter can only be set in
+ the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+
+ <para>
+ This invalidation check happens either when the slot is acquired
+ for use or during a checkpoint. The timeout is measured from the
+ slot's <structfield>inactive_since</structfield> value, which
+ records the time at which the slot became inactive.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>).
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 7ed617170f..063638beda 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2580,6 +2580,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for longer than the duration specified by the
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index bda0de52db..c47e56f78f 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -450,7 +450,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -653,6 +653,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
" name slot \"%s\" already exists on the standby",
remote_slot->name));
+ /*
+ * Skip the sync if the local slot is already invalidated. We do this
+ * beforehand to avoid an unnecessary slot acquire and release.
+ */
+ if (slot->data.invalidated != RS_INVAL_NONE)
+ return false;
+
/*
* The slot has been synchronized before.
*
@@ -669,7 +676,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index cebf44bb0f..7cfbc2dfff 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -159,6 +161,13 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
+ ReplicationSlot *s,
+ XLogRecPtr oldestLSN,
+ Oid dboid,
+ TransactionId snapshotConflictHorizon,
+ bool *invalidated);
+
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
@@ -535,12 +544,17 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if check_for_invalidation is true and the slot gets
+ * invalidated now or has been invalidated previously.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation)
{
ReplicationSlot *s;
int active_pid;
+ bool released_lock = false;
Assert(name != NULL);
@@ -615,6 +629,57 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ /*
+ * Check if the acquired slot needs to be invalidated, and error out if
+ * it gets invalidated now or has been invalidated previously, because
+ * there's no point in acquiring an invalidated slot.
+ *
+ * XXX: Currently we check for inactive_timeout invalidation here. We
+ * might need to check for other invalidations too.
+ */
+ if (check_for_invalidation)
+ {
+ bool invalidated = false;
+
+ released_lock = InvalidatePossiblyObsoleteSlot(RS_INVAL_INACTIVE_TIMEOUT,
+ s, 0, InvalidOid,
+ InvalidTransactionId,
+ &invalidated);
+
+ /*
+ * If the slot has been invalidated, recalculate the resource limits.
+ */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
+
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ /*
+ * Release the lock if it hasn't been released yet, to keep the
+ * cleanup path on error happy.
+ */
+ if (!released_lock)
+ LWLockRelease(ReplicationSlotControlLock);
+
+ Assert(s->inactive_since > 0);
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive since %s for more than replication_slot_inactive_timeout = %d seconds.",
+ timestamptz_to_str(s->inactive_since),
+ replication_slot_inactive_timeout)));
+ }
+ }
+
+ if (!released_lock)
+ LWLockRelease(ReplicationSlotControlLock);
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -781,7 +846,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -804,7 +869,7 @@ ReplicationSlotAlter(const char *name, bool failover)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1476,7 +1541,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1506,6 +1572,13 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires wal_level >= logical on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s for more than replication_slot_inactive_timeout = %d seconds."),
+ timestamptz_to_str(inactive_since),
+ replication_slot_inactive_timeout);
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1549,6 +1622,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1556,6 +1630,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1566,6 +1641,18 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT &&
+ (replication_slot_inactive_timeout > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced)))
+ {
+ /*
+ * We get the current time beforehand to avoid a system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1619,6 +1706,39 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ /*
+ * Quick exit if inactive timeout invalidation mechanism
+ * is disabled or slot is currently being used or the slot
+ * on standby is currently being synced from the primary.
+ *
+ * Note that we don't invalidate synced slots because
+ * they are typically considered not active, as they don't
+ * perform logical decoding to produce the changes.
+ */
+ if (replication_slot_inactive_timeout == 0 ||
+ s->inactive_since == 0 ||
+ (RecoveryInProgress() && s->data.synced))
+ break;
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+
+ /*
+ * Invalidation due to inactive timeout implies that
+ * no one is using the slot.
+ */
+ Assert(s->active_pid == 0);
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1644,11 +1764,14 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so, or if the slot is already ours,
+ * mark it invalidated. Otherwise we'll signal the owning
+ * process, below, and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot != NULL &&
+ MyReplicationSlot == s &&
+ active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1703,7 +1826,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1749,7 +1873,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1772,6 +1897,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1824,7 +1950,7 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk and invalidate slots.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1835,6 +1961,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated = false;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -1884,6 +2011,43 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ elog(DEBUG1, "performing replication slot invalidation");
+
+ /*
+ * Note that we will make another pass over replication slots for
+ * invalidations to keep the code simple. The assumption here is that the
+ * traversal over replication slots isn't that costly even with hundreds
+ * of replication slots. If it ever turns out that this assumption is
+ * wrong, we might have to put the invalidation check logic in the above
+ * loop, for that we might need to do the following:
+ *
+ * - Acquire the ControlLock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated slot.
+ *
+ * XXX: Should we move the inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ */
+ invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+
+ if (invalidated)
+ {
+ /*
+ * If any slots have been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index dd6c1d5a7e..9ad3e55704 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -539,7 +539,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bc40c454de..96eeb8b7d2 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -846,7 +846,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1459,7 +1459,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index c68fdc008b..79e7637ec9 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2982,6 +2982,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 2166ea4a87..819310b0a7 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -335,6 +335,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 7b937d1a0c..8727b7b58b 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -230,6 +232,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -245,7 +248,8 @@ extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, bool failover);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(void);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..b4c5ce2875 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -10,6 +10,7 @@ tests += {
'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
},
'tests': [
+ 't/050_invalidate_slots.pl',
't/001_stream_rep.pl',
't/002_archiving.pl',
't/003_recovery_targets.pl',
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..4663019c16
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,286 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot as well as logical
+# failover slot on primary due to replication_slot_inactive_timeout. Also,
+# check the logical failover slot synced on to the standby doesn't invalidate
+# the slot on its own, but gets the invalidated state from the remote slot on
+# the primary.
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoints during the test; otherwise, the test can become unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr_1 = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb1_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('lsub1_sync_slot', 'test_decoding', false, false, true);}
+);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+my $standby1_logstart = -s $standby1->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Synchronize the primary server slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced'.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot lsub1_sync_slot has synced as true on standby');
+
+my $logstart = -s $primary->logfile;
+my $inactive_timeout = 2;
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$primary->reload;
+
+# Wait for the logical failover slot to become inactive on the primary. Note
+# that nobody has acquired that slot yet, so due to
+# replication_slot_inactive_timeout setting above it must get invalidated.
+wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart,
+ $inactive_timeout);
+
+# Set timeout on the standby also to check the synced slots don't get
+# invalidated due to timeout on the standby.
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$standby1->reload;
+
+# Now, sync the logical failover slot from the remote slot on the primary.
+# Note that the remote slot has already been invalidated due to inactive
+# timeout. Now, the standby must also see it as invalidated.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Wait for the inactive replication slot to be invalidated.
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for replication slot lsub1_sync_slot invalidation to be synced on standby";
+
+# Synced slot mustn't get invalidated on the standby even after a checkpoint,
+# it must sync invalidation from the primary. So, we must not see the slot's
+# invalidation message in server log.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"lsub1_sync_slot\"",
+ $standby1_logstart),
+ 'check that synced slot lsub1_sync_slot has not been invalidated on the standby'
+);
+
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive and then invalidated
+wait_for_slot_invalidation($primary, 'sb1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end: Invalidate streaming standby's slot as well as logical failover
+# slot on primary due to replication_slot_inactive_timeout. Also, check the
+# logical failover slot synced on to the standby doesn't invalidate the slot on
+# its own, but gets the invalidated state from the remote slot on the primary.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+
+my $publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+# =============================================================================
+
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset, $inactive_timeout) = @_;
+ my $name = $node->name;
+
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for slot $slot_name to become inactive on node $name";
+
+ # Wait for the replication slot info to be updated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for info of slot $slot_name to be updated on node $name";
+
+ # Sleep for at least $inactive_timeout so that the slot becomes eligible for
+ # invalidation, avoiding the need for multiple checkpoints.
+ sleep($inactive_timeout);
+
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+
+ # Wait for the inactive replication slot to be invalidated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for inactive slot $slot_name to be invalidated on node $name";
+
+ # Check that the invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot_name', '0/1');
+ ]);
+
+ ok( $stderr =~
+ /can no longer get changes from replication slot "$slot_name"/,
+ "detected error upon trying to acquire invalidated slot $slot_name on node $name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot_name on node $name";
+}
+
+# Check for invalidation of slot in server log
+sub check_for_slot_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $name = $node->name;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"",
+ $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot_name invalidation has been logged on node $name"
+ );
+}
+
+done_testing();
--
2.34.1
v39-0002-Add-XID-age-based-replication-slot-invalidation.patchapplication/x-patch; name=v39-0002-Add-XID-age-based-replication-slot-invalidation.patchDownload
From c6cee7b246583c05e55b1ed5b14d4d786c2d8ddd Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 12 Apr 2024 15:00:05 +0000
Subject: [PATCH v39 2/2] Add XID age based replication slot invalidation.
Until now, postgres has had the ability to invalidate inactive
replication slots based on the amount of WAL (set via the
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set an XID age (age of
a slot's xmin or catalog_xmin) of, say, 1 or 1.5 billion, after
which the slots get invalidated.
To achieve the above, postgres introduces a GUC allowing users to
set a slot XID age. Replication slots whose xmin or catalog_xmin
has reached the age specified by this setting get invalidated.
The invalidation check happens at multiple locations so that
invalidation is detected as early as possible; these locations
include the following:
- Whenever the slot is acquired; the slot acquisition errors out
if the slot is invalidated.
- During checkpoint
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/20240327150557.GA3994937%40nathanxps13
Discussion: https://www.postgresql.org/message-id/CA%2BTgmoaRECcnyqxAxUhP5dk2S4HX%3DpGh-p-PkA3uc%2BjG_9hiMw%40mail.gmail.com
---
doc/src/sgml/config.sgml | 26 ++
doc/src/sgml/system-views.sgml | 8 +
src/backend/replication/slot.c | 160 +++++++++-
src/backend/utils/misc/guc_tables.c | 10 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 296 +++++++++++++++++-
7 files changed, 490 insertions(+), 14 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a73677b98b..f7aee4663f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4580,6 +4580,32 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-xid-age" xreflabel="replication_slot_xid_age">
+ <term><varname>replication_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (the default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ This invalidation check happens either when the slot is acquired
+ for use or during a checkpoint.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 063638beda..05a11a0fe3 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2587,6 +2587,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>xid_aged</literal> means that the slot's
+ <literal>xmin</literal> or <literal>catalog_xmin</literal>
+ has reached the age specified by
+ <xref linkend="guc-replication-slot-xid-age"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 7cfbc2dfff..2029efe5a6 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -108,10 +108,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
[RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
+ [RS_INVAL_XID_AGE] = "xid_aged",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
+#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -142,6 +143,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
int replication_slot_inactive_timeout = 0;
+int replication_slot_xid_age = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -160,6 +162,9 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool ReplicationSlotIsXIDAged(ReplicationSlot *slot,
+ TransactionId *xmin,
+ TransactionId *catalog_xmin);
static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReplicationSlot *s,
@@ -636,8 +641,8 @@ retry:
* it gets invalidated now or has been invalidated previously, because
* there's no use in acquiring the invalidated slot.
*
- * XXX: Currently we check for inactive_timeout invalidation here. We
- * might need to check for other invalidations too.
+ * XXX: Currently we check for inactive_timeout and xid_aged invalidations
+ * here. We might need to check for other invalidations too.
*/
if (check_for_invalidation)
{
@@ -648,6 +653,22 @@ retry:
InvalidTransactionId,
&invalidated);
+ if (!invalidated && released_lock)
+ {
+ /* The slot is still ours */
+ Assert(s->active_pid == MyProcPid);
+
+ /* Reacquire the ControlLock */
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+ released_lock = false;
+ }
+
+ if (!invalidated)
+ released_lock = InvalidatePossiblyObsoleteSlot(RS_INVAL_XID_AGE,
+ s, 0, InvalidOid,
+ InvalidTransactionId,
+ &invalidated);
+
/*
* If the slot has been invalidated, recalculate the resource limits.
*/
@@ -657,7 +678,8 @@ retry:
ReplicationSlotsComputeRequiredLSN();
}
- if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT ||
+ s->data.invalidated == RS_INVAL_XID_AGE)
{
/*
* Release the lock if it's not yet to keep the cleanup path on
@@ -665,7 +687,10 @@ retry:
*/
if (!released_lock)
LWLockRelease(ReplicationSlotControlLock);
+ }
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
Assert(s->inactive_since > 0);
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -675,6 +700,20 @@ retry:
timestamptz_to_str(s->inactive_since),
replication_slot_inactive_timeout)));
}
+
+ if (s->data.invalidated == RS_INVAL_XID_AGE)
+ {
+ Assert(TransactionIdIsValid(s->data.xmin) ||
+ TransactionIdIsValid(s->data.catalog_xmin));
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("The slot's xmin %u or catalog_xmin %u has reached the age %d specified by replication_slot_xid_age.",
+ s->data.xmin,
+ s->data.catalog_xmin,
+ replication_slot_xid_age)));
+ }
}
if (!released_lock)
@@ -1542,7 +1581,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
TransactionId snapshotConflictHorizon,
- TimestampTz inactive_since)
+ TimestampTz inactive_since,
+ TransactionId xmin,
+ TransactionId catalog_xmin)
{
StringInfoData err_detail;
bool hint = false;
@@ -1579,6 +1620,27 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
timestamptz_to_str(inactive_since),
replication_slot_inactive_timeout);
break;
+ case RS_INVAL_XID_AGE:
+ Assert(TransactionIdIsValid(xmin) ||
+ TransactionIdIsValid(catalog_xmin));
+
+ if (TransactionIdIsValid(xmin))
+ {
+ appendStringInfo(&err_detail, _("The slot's xmin %u has reached the age %d specified by replication_slot_xid_age."),
+ xmin,
+ replication_slot_xid_age);
+ break;
+ }
+
+ if (TransactionIdIsValid(catalog_xmin))
+ {
+ appendStringInfo(&err_detail, _("The slot's catalog_xmin %u has reached the age %d specified by replication_slot_xid_age."),
+ catalog_xmin,
+ replication_slot_xid_age);
+ break;
+ }
+
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1623,6 +1685,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
TimestampTz inactive_since = 0;
+ TransactionId aged_xmin = InvalidTransactionId;
+ TransactionId aged_catalog_xmin = InvalidTransactionId;
for (;;)
{
@@ -1739,6 +1803,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
Assert(s->active_pid == 0);
}
break;
+ case RS_INVAL_XID_AGE:
+ if (ReplicationSlotIsXIDAged(s, &aged_xmin, &aged_catalog_xmin))
+ {
+ Assert(TransactionIdIsValid(aged_xmin) ||
+ TransactionIdIsValid(aged_catalog_xmin));
+
+ invalidation_cause = cause;
+ break;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1827,7 +1901,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon,
- inactive_since);
+ inactive_since, aged_xmin,
+ aged_catalog_xmin);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1874,7 +1949,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon,
- inactive_since);
+ inactive_since, aged_xmin,
+ aged_catalog_xmin);
/* done with this slot for now */
break;
@@ -1898,6 +1974,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -2031,14 +2108,20 @@ CheckPointReplicationSlots(bool is_shutdown)
*
* - Avoid saving slot info to disk two times for each invalidated slot.
*
- * XXX: Should we move inactive_timeout invalidation check closer to
- * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ * XXX: Should we move inactive_timeout and xid_aged invalidation checks
+ * closer to wal_removed in CreateCheckPoint and CreateRestartPoint?
*/
invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
0,
InvalidOid,
InvalidTransactionId);
+ if (!invalidated)
+ invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+
if (invalidated)
{
/*
@@ -2050,6 +2133,65 @@ CheckPointReplicationSlots(bool is_shutdown)
}
}
+/*
+ * Returns true if the given replication slot's xmin or catalog_xmin age is
+ * more than replication_slot_xid_age.
+ *
+ * Note that the caller must hold the replication slot's spinlock to avoid
+ * race conditions while this function reads xmin and catalog_xmin.
+ */
+static bool
+ReplicationSlotIsXIDAged(ReplicationSlot *slot, TransactionId *xmin,
+ TransactionId *catalog_xmin)
+{
+ TransactionId cutoff;
+ TransactionId curr;
+
+ if (replication_slot_xid_age == 0)
+ return false;
+
+ curr = ReadNextTransactionId();
+
+ /*
+ * Replication slot's xmin and catalog_xmin can never be larger than the
+ * current transaction id even in the case of transaction ID wraparound.
+ */
+ Assert(slot->data.xmin <= curr);
+ Assert(slot->data.catalog_xmin <= curr);
+
+ /*
+ * The cutoff tells how far back from the current transaction ID the
+ * configured age reaches. We then check whether the xmin or
+ * catalog_xmin precedes (or equals) the cutoff; if so, return true,
+ * otherwise false.
+ */
+ cutoff = curr - replication_slot_xid_age;
+
+ if (!TransactionIdIsNormal(cutoff))
+ {
+ cutoff = FirstNormalTransactionId;
+ }
+
+ *xmin = InvalidTransactionId;
+ *catalog_xmin = InvalidTransactionId;
+
+ if (TransactionIdIsNormal(slot->data.xmin) &&
+ TransactionIdPrecedesOrEquals(slot->data.xmin, cutoff))
+ {
+ *xmin = slot->data.xmin;
+ return true;
+ }
+
+ if (TransactionIdIsNormal(slot->data.catalog_xmin) &&
+ TransactionIdPrecedesOrEquals(slot->data.catalog_xmin, cutoff))
+ {
+ *catalog_xmin = slot->data.catalog_xmin;
+ return true;
+ }
+
+ return false;
+}
+
/*
* Load all replication slots from disk into memory at server startup. This
* needs to be run before we start crash recovery.
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 79e7637ec9..ea70e83350 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2994,6 +2994,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &replication_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 819310b0a7..a2387ebd33 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -336,6 +336,7 @@
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
+#replication_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 8727b7b58b..19e5dbfb36 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -55,6 +55,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* inactive slot timeout has occurred */
RS_INVAL_INACTIVE_TIMEOUT,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -233,6 +235,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
extern PGDLLIMPORT int replication_slot_inactive_timeout;
+extern PGDLLIMPORT int replication_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index 4663019c16..da05350df4 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -89,7 +89,7 @@ $primary->reload;
# that nobody has acquired that slot yet, so due to
# replication_slot_inactive_timeout setting above it must get invalidated.
wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart,
- $inactive_timeout);
+ $inactive_timeout, 'inactive_timeout');
# Set timeout on the standby also to check the synced slots don't get
# invalidated due to timeout on the standby.
@@ -129,7 +129,7 @@ $standby1->stop;
# Wait for the standby's replication slot to become inactive
wait_for_slot_invalidation($primary, 'sb1_slot', $logstart,
- $inactive_timeout);
+ $inactive_timeout, 'inactive_timeout');
# Testcase end: Invalidate streaming standby's slot as well as logical failover
# slot on primary due to replication_slot_inactive_timeout. Also, check the
@@ -197,15 +197,280 @@ $subscriber->stop;
# Wait for the replication slot to become inactive and then invalidated due to
# timeout.
wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
- $inactive_timeout);
+ $inactive_timeout, 'inactive_timeout');
# Testcase end: Invalidate logical subscriber's slot due to
# replication_slot_inactive_timeout.
# =============================================================================
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot due to replication_slot_xid_age
+# GUC.
+
+# Prepare for the next test
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby2->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb2_slot', immediately_reserve := true);
+]);
+
+$standby2->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NOT NULL AND catalog_xmin IS NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb2_slot';
+]) or die "Timed out waiting for slot sb2_slot xmin to advance";
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop standby so that the replication slot's xmin on the primary ages
+$standby2->stop;
+
+$logstart = -s $primary->logfile;
+
+# Do some work to advance xids on primary
+advance_xids($primary, 'tab_int');
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+wait_for_slot_invalidation($primary, 'sb2_slot', $logstart, 0, 'xid_aged');
+
+# Testcase end: Invalidate streaming standby's slot due to replication_slot_xid_age
+# GUC.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to
+# replication_slot_xid_age GUC.
+
+$publisher = $primary;
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 500;
+]);
+$publisher->reload;
+
+$subscriber->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+));
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl2 (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl2 (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl2 VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+$publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres',
+ "CREATE PUBLICATION pub2 FOR TABLE test_tbl2");
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub2 WITH (slot_name = 'lsub2_slot')"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub2');
+
+$result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl2");
+
+is($result, qq(5), "check initial copy was done");
+
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NULL AND catalog_xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'lsub2_slot';
+]) or die "Timed out waiting for slot lsub2_slot catalog_xmin to advance";
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Do some work to advance xids on publisher
+advance_xids($publisher, 'test_tbl2');
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+wait_for_slot_invalidation($publisher, 'lsub2_slot', $logstart, 0,
+ 'xid_aged');
+
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_xid_age GUC.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical slot on standby that's being synced from
+# the primary due to replication_slot_xid_age GUC.
+
+$publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 0;
+]);
+$publisher->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby3 = PostgreSQL::Test::Cluster->new('standby3');
+$standby3->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+$standby3->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb3_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb3_slot', immediately_reserve := true);
+]);
+
+$standby3->start;
+
+my $standby3_logstart = -s $standby3->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby3);
+
+$subscriber->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+));
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl3 (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl3 (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl3 VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+$publisher->safe_psql('postgres',
+ "CREATE PUBLICATION pub3 FOR TABLE test_tbl3");
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub3 CONNECTION '$publisher_connstr' PUBLICATION pub3 WITH (slot_name = 'lsub3_sync_slot', failover = true)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub3');
+
+$result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl3");
+
+is($result, qq(5), "check initial copy was done");
+
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NULL AND catalog_xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'lsub3_sync_slot';
+])
+ or die "Timed out waiting for slot lsub3_sync_slot catalog_xmin to advance";
+
+# Synchronize the primary server slots to the standby
+$standby3->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced' and has got catalog_xmin from the primary.
+is( $standby3->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub3_sync_slot' AND synced AND NOT temporary AND
+ xmin IS NULL AND catalog_xmin IS NOT NULL;}
+ ),
+ "t",
+ 'logical slot has synced as true on standby');
+
+my $primary_catalog_xmin = $primary->safe_psql('postgres',
+ "SELECT catalog_xmin FROM pg_replication_slots WHERE slot_name = 'lsub3_sync_slot' AND catalog_xmin IS NOT NULL;"
+);
+
+my $standby3_catalog_xmin = $standby3->safe_psql('postgres',
+ "SELECT catalog_xmin FROM pg_replication_slots WHERE slot_name = 'lsub3_sync_slot' AND catalog_xmin IS NOT NULL;"
+);
+
+is($primary_catalog_xmin, $standby3_catalog_xmin,
+ "check catalog_xmin is the same for the primary slot and the synced slot");
+
+# Enable XID age based invalidation on the standby. Note that we disabled it
+# on the primary to check whether the invalidation occurs for the synced slot
+# on the standby.
+$standby3->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 500;
+]);
+$standby3->reload;
+
+$logstart = -s $standby3->logfile;
+
+# Do some work to advance xids on primary
+advance_xids($primary, 'test_tbl3');
+
+# Wait for standby to catch up with the above work
+$primary->wait_for_catchup($standby3);
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+wait_for_slot_invalidation($standby3, 'lsub3_sync_slot', $logstart, 0,
+ 'xid_aged');
+
+# Note that the replication slot on the primary is still active
+$result = $primary->safe_psql('postgres',
+ "SELECT COUNT(slot_name) = 1 FROM pg_replication_slots WHERE slot_name = 'lsub3_sync_slot' AND invalidation_reason IS NULL;"
+);
+
+is($result, 't', "check lsub3_sync_slot is still active on primary");
+
+# Testcase end: Invalidate logical slot on standby that's being synced from
+# the primary due to replication_slot_xid_age GUC.
+# =============================================================================
+
sub wait_for_slot_invalidation
{
- my ($node, $slot_name, $offset, $inactive_timeout) = @_;
+ my ($node, $slot_name, $offset, $inactive_timeout, $reason) = @_;
my $name = $node->name;
# Wait for the replication slot to become inactive
@@ -238,7 +503,7 @@ sub wait_for_slot_invalidation
'postgres', qq[
SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
WHERE slot_name = '$slot_name' AND
- invalidation_reason = 'inactive_timeout';
+ invalidation_reason = '$reason';
])
or die
"Timed out while waiting for inactive slot $slot_name to be invalidated on node $name";
@@ -283,4 +548,25 @@ sub check_for_slot_invalidation_in_server_log
);
}
+# Do some work for advancing xids on a given node
+sub advance_xids
+{
+ my ($node, $table_name) = @_;
+
+ $node->safe_psql(
+ 'postgres', qq[
+ do \$\$
+ begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into $table_name values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+ end\$\$;
+ ]);
+}
+
done_testing();
--
2.34.1
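
To illustrate the behaviour the above patch aims for, here is a minimal
SQL sketch from an operator's point of view (assuming the patch set is
applied; 'my_slot' is only an example slot name):

    -- Enable XID age based invalidation (0, the default, disables it).
    ALTER SYSTEM SET replication_slot_xid_age = 1500000000;
    SELECT pg_reload_conf();

    -- Monitor how far each slot is holding back the XID horizon.
    SELECT slot_name, age(xmin) AS xmin_age,
           age(catalog_xmin) AS catalog_xmin_age, invalidation_reason
    FROM pg_replication_slots;

    -- Once the configured age is exceeded and a checkpoint (or an attempt
    -- to acquire the slot) has run, the slot reports
    -- invalidation_reason = 'xid_aged' and can no longer be used:
    --   ERROR:  can no longer get changes from replication slot "my_slot"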
Hi,
On Thu, Apr 4, 2024 at 9:23 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Thu, Apr 4, 2024 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Thanks for the changes. v34-0001 LGTM.
I was doing a final review before pushing 0001 and found that
'inactive_since' could be set twice during startup after promotion,
once while restoring slots and then via ShutDownSlotSync(). The reason
is that ShutDownSlotSync() will be invoked in normal startup on
primary though it won't do anything apart from setting inactive_since
if we have synced slots. I think you need to check 'StandbyMode' in
update_synced_slots_inactive_since() and return if the same is not
set. We can't use 'InRecovery' flag as that will be set even during
crash recovery. Can you please test this once unless you don't agree with the above theory?
Nice catch. I've verified that update_synced_slots_inactive_since is
called even for normal server startups/crash recovery. I've added a
check to exit if the StandbyMode isn't set.
Please find the attached v35 patch.
The documentation about both the 'active' and 'inactive_since'
columns of pg_replication_slots says:
---
active bool
True if this slot is currently actively being used
inactive_since timestamptz
The time since the slot has become inactive. NULL if the slot is
currently being used. Note that for slots on the standby that are
being synced from a primary server (whose synced field is true), the
inactive_since indicates the last synchronization (see Section 47.2.3)
time.
---
When reading the description I thought if 'active' is true,
'inactive_since' is NULL, but it doesn't seem to apply for temporary
slots. Since we don't reset the active_pid field of temporary slots
when they are released, the 'active' is still true in the view but
'inactive_since' is not NULL. Do you think we need to mention it in
the documentation?
As for the timeout-based slot invalidation feature, we could end up
invalidating the temporary slots even if they are shown as active,
which could confuse users. Do we want to somehow deal with it?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Mon, Apr 22, 2024 at 7:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Please find the attached v35 patch.
The documentation about both the 'active' and 'inactive_since'
columns of pg_replication_slots says:
---
active bool
True if this slot is currently actively being used
inactive_since timestamptz
The time since the slot has become inactive. NULL if the slot is
currently being used. Note that for slots on the standby that are
being synced from a primary server (whose synced field is true), the
inactive_since indicates the last synchronization (see Section 47.2.3)
time.
---
When reading the description I thought if 'active' is true,
'inactive_since' is NULL, but it doesn't seem to apply for temporary
slots.
Right.
Since we don't reset the active_pid field of temporary slots
when they are released, the 'active' is still true in the view but
'inactive_since' is not NULL.
Right. inactive_since is reset whenever the temporary slot is acquired
again within the same backend that created the temporary slot.
Do you think we need to mention it in
the documentation?
I think that's the reason we dropped "active" from the statement. It
was earlier "NULL if the slot is currently actively being used.". But,
per Bertrand's comment
/messages/by-id/ZehE2IJcsetSJMHC@ip-10-97-1-34.eu-west-3.compute.internal
changed it to ""NULL if the slot is currently being used.".
Temporary slots retain active = true and active_pid = <pid of the
backend that created them> even when the slot is not being used, for
the lifetime of that backend process. We haven't tied the active or
active_pid flags to inactive_since, and doing so now just to represent
the temporary slot behaviour for active and active_pid would confuse
users more. As far as a slot's inactive_since is concerned, it is set
to 0 when the slot is being used (acquired) and set to the current
timestamp when the slot is not being used (released).
As for the timeout-based slot invalidation feature, we could end up
invalidating the temporary slots even if they are shown as active,
which could confuse users. Do we want to somehow deal with it?
Yes. As long as the temporary slot is lying unused, holding up
resources for more than the specified
replication_slot_inactive_timeout, it is bound to get invalidated.
This keeps the behaviour consistent and less confusing to users.
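
A minimal sketch of the temporary-slot behaviour described above
(assuming the patch set is applied; 'tmp_slot' and the test_decoding
plugin are only an example):

    -- Create a temporary logical slot; it is released as soon as the
    -- function returns, but active/active_pid stay set for this backend.
    SELECT pg_create_logical_replication_slot('tmp_slot', 'test_decoding', true);

    SELECT slot_name, temporary, active, active_pid, inactive_since
    FROM pg_replication_slots WHERE slot_name = 'tmp_slot';
    -- Per the above, this shows active = t with a non-NULL inactive_since,
    -- and with replication_slot_inactive_timeout set, such an unused
    -- temporary slot is still subject to timeout invalidation.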
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Apr 25, 2024 at 11:11 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Mon, Apr 22, 2024 at 7:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Please find the attached v35 patch.
The documentation about both the 'active' and 'inactive_since'
columns of pg_replication_slots says:
---
active bool
True if this slot is currently actively being used
inactive_since timestamptz
The time since the slot has become inactive. NULL if the slot is
currently being used. Note that for slots on the standby that are
being synced from a primary server (whose synced field is true), the
inactive_since indicates the last synchronization (see Section 47.2.3)
time.
---
When reading the description I thought if 'active' is true,
'inactive_since' is NULL, but it doesn't seem to apply for temporary
slots.
Right.
Since we don't reset the active_pid field of temporary slots
when they are released, the 'active' is still true in the view but
'inactive_since' is not NULL.
Right. inactive_since is reset whenever the temporary slot is acquired
again within the same backend that created the temporary slot.
Do you think we need to mention it in
the documentation?
I think that's the reason we dropped "active" from the statement. It
was earlier "NULL if the slot is currently actively being used.". But,
per Bertrand's comment
/messages/by-id/ZehE2IJcsetSJMHC@ip-10-97-1-34.eu-west-3.compute.internal
changed it to ""NULL if the slot is currently being used.".Temporary slots retain the active = true and active_pid = <pid of the
backend that created it> even when the slot is not being used until
the lifetime of the backend process. We haven't tied active or
active_pid flags to inactive_since, doing so now to represent the
temporary slot behaviour for active and active_pid will confuse users
more.
This is true, and it's probably easy for us to understand as we
developed this feature, but the same may not be true for others. I
wonder if we can be explicit about the difference between active and
inactive_since by adding something like the following for
inactive_since: "Note that this field is not related to the active
flag, as temporary slots can remain active till the session ends even
when they are not being used."
Sawada-San, do you have any suggestions on the wording?
As far as the inactive_since of a slot is concerned, it is set
to 0 when the slot is being used (acquired) and set to current
timestamp when the slot is not being used (released).
As for the timeout-based slot invalidation feature, we could end up
invalidating the temporary slots even if they are shown as active,
which could confuse users. Do we want to somehow deal with it?
Yes. As long as the temporary slot is lying unused, holding up
resources for more than the specified
replication_slot_inactive_timeout, it is bound to get invalidated.
This keeps the behaviour consistent and less confusing to users.
Agreed. We may want to add something in the docs for this to avoid
confusion with the active flag.
--
With Regards,
Amit Kapila.
Hi,
On Sat, Apr 13, 2024 at 9:36 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
There was a point raised by Amit
/messages/by-id/CAA4eK1K8wqLsMw6j0hE_SFoWAeo3Kw8UNnMfhsWaYDF1GWYQ+g@mail.gmail.com
on when to do the XID age based invalidation - whether in the
checkpointer, when vacuum is being run, whenever ComputeXIDHorizons
gets called, or in the autovacuum process. For now, I've chosen the design to do these
new invalidation checks in two places - 1) whenever the slot is
acquired and the slot acquisition errors out if invalidated, 2) during
checkpoint. However, I'm open to suggestions on this.
Here are my thoughts on when to do the XID age invalidation. In all
the patches sent so far, the XID age invalidation happens in two
places - one during slot acquisition, and another during checkpoint.
The suggestion is to also do it during vacuum (manual and autovacuum),
so that even if checkpoints aren't happening in the database for
whatever reason, a vacuum command or autovacuum can invalidate the
slots whose XID has aged.
An idea is to check for XID age based invalidation for all the slots
in ComputeXidHorizons() before it reads replication_slot_xmin and
replication_slot_catalog_xmin, and obviously before the proc array
lock is acquired. A potential problem with this approach is that the
invalidation check can become too aggressive as XID horizons are
computed from many places.
Another idea is to check for XID age based invalidation for all the
slots in higher levels than ComputeXidHorizons(), for example in
vacuum() which is an entry point for both vacuum command and
autovacuum. This approach seems similar to vacuum_failsafe_age GUC
which checks each relation for the failsafe age before vacuum gets
triggered on it.
Does anyone see any issues or risks with the above two approaches or
have any other ideas? Thoughts?
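
To make the trade-off concrete, a small sketch (assuming the v40 patches
plus the proposed replication_slot_xid_age GUC; the vacuum-time check is
only the idea discussed above, not something the posted patches
implement):

    -- What the invalidation check compares, as a monitoring query:
    SELECT slot_name, age(xmin) AS xmin_age, age(catalog_xmin) AS catalog_xmin_age
    FROM pg_replication_slots;

    -- Trigger points in the patches as posted: slot acquisition and checkpoint.
    CHECKPOINT;

    -- Under the vacuum-based idea, a plain VACUUM (or autovacuum) would also
    -- run the check, so aged slots get invalidated even if checkpoints are rare.
    VACUUM;  -- hypothetical trigger point, not implemented in the posted patches

    SELECT slot_name, invalidation_reason
    FROM pg_replication_slots WHERE invalidation_reason = 'xid_aged';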
I attached v40 patches here. I reworded some of the ERROR messages,
and did some code clean-up. Note that I haven't implemented any of the
above approaches yet.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v40-0001-Add-inactive_timeout-based-replication-slot-inva.patchapplication/x-patch; name=v40-0001-Add-inactive_timeout-based-replication-slot-inva.patchDownload
From 7a920d4f1a4d6a10776ff597d6d931b10340417c Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 17 Jun 2024 06:55:27 +0000
Subject: [PATCH v40 1/2] Add inactive_timeout based replication slot
invalidation
Until now, postgres has had the ability to invalidate inactive
replication slots based on the amount of WAL (set via the
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easier for developers to set a timeout of, say,
1, 2 or 3 days, after which the inactive slots get invalidated.
To achieve the above, postgres introduces a GUC allowing users to
set an inactive timeout. Replication slots that are inactive for
longer than the specified amount of time get invalidated.
The invalidation check happens at multiple locations so that
invalidation is detected as early as possible; these locations
include the following:
- Whenever the slot is acquired; the slot acquisition errors out
if the slot is invalidated.
- During checkpoint
Note that this new invalidation mechanism won't kick in for slots
that are currently being synced from the primary to the standby,
because such synced slots are typically not considered active (and
hence never become inactive) as they don't perform logical decoding
to produce changes.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CA%2BTgmoZTbaaEjSZUG1FL0mzxAdN3qmXksO3O9_PZhEuXTkVnRQ%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/202403260841.5jcv7ihniccy%40alvherre.pgsql
---
doc/src/sgml/config.sgml | 33 ++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 11 +-
src/backend/replication/slot.c | 188 +++++++++++-
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 6 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 286 ++++++++++++++++++
13 files changed, 535 insertions(+), 20 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 698169afdb..a01f6d2d14 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4554,6 +4554,39 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidates replication slots that are inactive for longer than the
+ specified amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (the default) disables
+ the timeout mechanism. This parameter can only be set in
+ the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+
+ <para>
+ This invalidation check happens either when the slot is acquired
+ for use or during a checkpoint. The time since the slot has become
+ inactive is known from its
+ <structfield>inactive_since</structfield> value using which the
+ timeout is measured.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>).
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 8c18bea902..4867af1b61 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2580,6 +2580,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for the duration specified by
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 56d3fb5d0e..a5967d400f 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -448,7 +448,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -651,6 +651,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
" name slot \"%s\" already exists on the standby",
remote_slot->name));
+ /*
+ * Skip the sync if the local slot is already invalidated. We do this
+ * beforehand to avoid slot acquire and release.
+ */
+ if (slot->data.invalidated != RS_INVAL_NONE)
+ return false;
+
/*
* The slot has been synchronized before.
*
@@ -667,7 +674,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 564cfee127..e80872f27b 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -159,6 +161,13 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
+ ReplicationSlot *s,
+ XLogRecPtr oldestLSN,
+ Oid dboid,
+ TransactionId snapshotConflictHorizon,
+ bool *invalidated);
+
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
@@ -535,12 +544,17 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if check_for_invalidation is true and the slot gets
+ * invalidated now or has been invalidated previously.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation)
{
ReplicationSlot *s;
int active_pid;
+ bool released_lock = false;
Assert(name != NULL);
@@ -615,6 +629,57 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ /*
+ * Check if the acquired slot needs to be invalidated. And, error out if
+ * it gets invalidated now or has been invalidated previously, because
+ * there's no use in acquiring the invalidated slot.
+ *
+ * XXX: Currently we check for inactive_timeout invalidation here. We
+ * might need to check for other invalidations too.
+ */
+ if (check_for_invalidation)
+ {
+ bool invalidated = false;
+
+ released_lock = InvalidatePossiblyObsoleteSlot(RS_INVAL_INACTIVE_TIMEOUT,
+ s, 0, InvalidOid,
+ InvalidTransactionId,
+ &invalidated);
+
+ /*
+ * If the slot has been invalidated, recalculate the resource limits.
+ */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
+
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ /*
+ * Release the lock if it's not yet released, to keep the cleanup
+ * path on error happy.
+ */
+ if (!released_lock)
+ LWLockRelease(ReplicationSlotControlLock);
+
+ Assert(s->inactive_since > 0);
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive since %s for more than %d seconds specified by \"replication_slot_inactive_timeout\".",
+ timestamptz_to_str(s->inactive_since),
+ replication_slot_inactive_timeout)));
+ }
+ }
+
+ if (!released_lock)
+ LWLockRelease(ReplicationSlotControlLock);
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +850,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -808,7 +873,7 @@ ReplicationSlotAlter(const char *name, bool failover)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1480,7 +1545,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1510,6 +1576,13 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s for more than %d seconds specified by \"replication_slot_inactive_timeout\"."),
+ timestamptz_to_str(inactive_since),
+ replication_slot_inactive_timeout);
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1553,6 +1626,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1560,6 +1634,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1570,6 +1645,18 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT &&
+ (replication_slot_inactive_timeout > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced)))
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1623,6 +1710,39 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ /*
+ * Quick exit if the inactive timeout invalidation mechanism
+ * is disabled, the slot is currently being used, or the slot
+ * on the standby is currently being synced from the primary.
+ *
+ * Note that we don't invalidate synced slots because
+ * they are typically not considered active, as they don't
+ * perform logical decoding to produce changes.
+ */
+ if (replication_slot_inactive_timeout == 0 ||
+ s->inactive_since == 0 ||
+ (RecoveryInProgress() && s->data.synced))
+ break;
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+
+ /*
+ * Invalidation due to inactive timeout implies that
+ * no one is using the slot.
+ */
+ Assert(s->active_pid == 0);
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1648,11 +1768,14 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it invalidated; if it
+ * is already ours, just mark it invalidated. Otherwise we'll signal
+ * the owning process, below, and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot != NULL &&
+ MyReplicationSlot == s &&
+ active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1707,7 +1830,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1753,7 +1877,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1776,6 +1901,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1828,7 +1954,7 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk and invalidate slots.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1839,6 +1965,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated = false;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -1886,6 +2013,43 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ elog(DEBUG1, "performing replication slot invalidation");
+
+ /*
+ * Note that we will make another pass over replication slots for
+ * invalidations to keep the code simple. The assumption here is that the
+ * traversal over replication slots isn't that costly even with hundreds
+ * of replication slots. If it ever turns out that this assumption is
+ * wrong, we might have to put the invalidation check logic in the above
+ * loop; for that, we might have to do the following:
+ *
+ * - Acquire ControlLock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated slot.
+ *
+ * XXX: Should we move inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ */
+ invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+
+ if (invalidated)
+ {
+ /*
+ * If any slots have been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index dd6c1d5a7e..9ad3e55704 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -539,7 +539,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index c623b07cf0..1741e09259 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -846,7 +846,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1459,7 +1459,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 46c258be28..4990e73c97 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2961,6 +2961,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e0567de219..535fb07385 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -335,6 +335,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 1bc80960ef..56d20e1a78 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -230,6 +232,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -245,7 +248,8 @@ extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, bool failover);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..b4c5ce2875 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -10,6 +10,7 @@ tests += {
'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
},
'tests': [
+ 't/050_invalidate_slots.pl',
't/001_stream_rep.pl',
't/002_archiving.pl',
't/003_recovery_targets.pl',
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..4663019c16
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,286 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot as well as logical
+# failover slot on primary due to replication_slot_inactive_timeout. Also,
+# check that the standby doesn't invalidate the synced logical failover slot
+# on its own, but gets the invalidated state from the remote slot on
+# the primary.
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoints during the test; otherwise, the test can become unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr_1 = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb1_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('lsub1_sync_slot', 'test_decoding', false, false, true);}
+);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+my $standby1_logstart = -s $standby1->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Synchronize the primary server slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced'.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot lsub1_sync_slot has synced as true on standby');
+
+my $logstart = -s $primary->logfile;
+my $inactive_timeout = 2;
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$primary->reload;
+
+# Wait for the logical failover slot to become inactive on the primary. Note
+# that nobody has acquired that slot yet, so due to
+# replication_slot_inactive_timeout setting above it must get invalidated.
+wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart,
+ $inactive_timeout);
+
+# Set timeout on the standby also to check the synced slots don't get
+# invalidated due to timeout on the standby.
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$standby1->reload;
+
+# Now, sync the logical failover slot from the remote slot on the primary.
+# Note that the remote slot has already been invalidated due to inactive
+# timeout. Now, the standby must also see it as invalidated.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Wait for the inactive replication slot to be invalidated.
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for replication slot lsub1_sync_slot invalidation to be synced on standby";
+
+# The synced slot mustn't get invalidated on the standby even after a checkpoint;
+# it must sync the invalidation from the primary. So, we must not see the slot's
+# invalidation message in the server log.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"lsub1_sync_slot\"",
+ $standby1_logstart),
+ 'check that synced slot lsub1_sync_slot has not been invalidated on the standby'
+);
+
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive
+wait_for_slot_invalidation($primary, 'sb1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end: Invalidate streaming standby's slot as well as logical failover
+# slot on primary due to replication_slot_inactive_timeout. Also, check that
+# the standby doesn't invalidate the synced logical failover slot on its own,
+# but gets the invalidated state from the remote slot on the primary.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+
+my $publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+# =============================================================================
+
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset, $inactive_timeout) = @_;
+ my $name = $node->name;
+
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for slot $slot_name to become inactive on node $name";
+
+ # Wait for the replication slot info to be updated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for info of slot $slot_name to be updated on node $name";
+
+ # Sleep for at least the $inactive_timeout duration so that the slot can be
+ # invalidated without needing multiple checkpoints.
+ sleep($inactive_timeout);
+
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+
+ # Wait for the inactive replication slot to be invalidated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for inactive slot $slot_name to be invalidated on node $name";
+
+ # Check that the invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot_name', '0/1');
+ ]);
+
+ ok( $stderr =~
+ /can no longer get changes from replication slot "$slot_name"/,
+ "detected error upon trying to acquire invalidated slot $slot_name on node $name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot_name on node $name";
+}
+
+# Check for invalidation of slot in server log
+sub check_for_slot_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $name = $node->name;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"",
+ $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot_name invalidation has been logged on node $name"
+ );
+}
+
+done_testing();
--
2.34.1
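As a quick usage illustration of the replication_slot_inactive_timeout GUC added by the patch above (this is not part of the patch; the timeout value here is made up, the columns are the ones the patch exposes in pg_replication_slots):

    ALTER SYSTEM SET replication_slot_inactive_timeout = '1d';
    SELECT pg_reload_conf();

    -- Once a slot has been inactive for more than a day, the next checkpoint
    -- (or an attempt to acquire the slot) is expected to invalidate it.
    SELECT slot_name, inactive_since, invalidation_reason
    FROM pg_replication_slots
    WHERE invalidation_reason = 'inactive_timeout';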
v40-0002-Add-XID-age-based-replication-slot-invalidation.patchapplication/x-patch; name=v40-0002-Add-XID-age-based-replication-slot-invalidation.patchDownload
From 18185885fb3132187a2552116ee143c4518c1c4a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 17 Jun 2024 09:54:43 +0000
Subject: [PATCH v40 2/2] Add XID age based replication slot invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky. Because the amount of WAL a
customer generates, and their allocated storage will vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set an XID age (age of
slot's xmin or catalog_xmin) of say 1 or 1.5 billion, after which
the slots get invalidated.
To achieve the above, postgres introduces a GUC allowing users to
set the slot XID age. The replication slots whose xmin or catalog_xmin
has reached the age specified by this setting get invalidated.
The invalidation check happens at various locations so that it is
performed as promptly as possible; these locations include the following:
- Whenever the slot is acquired; the slot acquisition errors
out if the slot is invalidated.
- During checkpoint
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/20240327150557.GA3994937%40nathanxps13
Discussion: https://www.postgresql.org/message-id/CA%2BTgmoaRECcnyqxAxUhP5dk2S4HX%3DpGh-p-PkA3uc%2BjG_9hiMw%40mail.gmail.com
---
doc/src/sgml/config.sgml | 26 ++
doc/src/sgml/system-views.sgml | 8 +
src/backend/replication/slot.c | 151 ++++++++-
src/backend/utils/misc/guc_tables.c | 10 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 296 +++++++++++++++++-
7 files changed, 481 insertions(+), 14 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a01f6d2d14..114b48e41e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4587,6 +4587,32 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-xid-age" xreflabel="replication_slot_xid_age">
+ <term><varname>replication_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (which is the default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ This invalidation check happens either when the slot is acquired
+ for use or during a checkpoint.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 4867af1b61..0490f9f156 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2587,6 +2587,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>xid_aged</literal> means that the slot's
+ <literal>xmin</literal> or <literal>catalog_xmin</literal>
+ has reached the age specified by
+ <xref linkend="guc-replication-slot-xid-age"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index e80872f27b..79ac412d8e 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -108,10 +108,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
[RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
+ [RS_INVAL_XID_AGE] = "xid_aged",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
+#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -142,6 +143,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
int replication_slot_inactive_timeout = 0;
+int replication_slot_xid_age = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -160,6 +162,9 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool ReplicationSlotIsXIDAged(ReplicationSlot *slot,
+ TransactionId *xmin,
+ TransactionId *catalog_xmin);
static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReplicationSlot *s,
@@ -636,8 +641,8 @@ retry:
* it gets invalidated now or has been invalidated previously, because
* there's no use in acquiring the invalidated slot.
*
- * XXX: Currently we check for inactive_timeout invalidation here. We
- * might need to check for other invalidations too.
+ * XXX: Currently we check for inactive_timeout and xid_aged invalidations
+ * here. We might need to check for other invalidations too.
*/
if (check_for_invalidation)
{
@@ -648,6 +653,22 @@ retry:
InvalidTransactionId,
&invalidated);
+ if (!invalidated && released_lock)
+ {
+ /* The slot is still ours */
+ Assert(s->active_pid == MyProcPid);
+
+ /* Reacquire the ControlLock */
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+ released_lock = false;
+ }
+
+ if (!invalidated)
+ released_lock = InvalidatePossiblyObsoleteSlot(RS_INVAL_XID_AGE,
+ s, 0, InvalidOid,
+ InvalidTransactionId,
+ &invalidated);
+
/*
* If the slot has been invalidated, recalculate the resource limits.
*/
@@ -657,7 +678,8 @@ retry:
ReplicationSlotsComputeRequiredLSN();
}
- if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT ||
+ s->data.invalidated == RS_INVAL_XID_AGE)
{
/*
* Release the lock if it has not been released yet, to keep the
* cleanup path on error happy.
*/
if (!released_lock)
LWLockRelease(ReplicationSlotControlLock);
+ }
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
Assert(s->inactive_since > 0);
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -675,6 +700,20 @@ retry:
timestamptz_to_str(s->inactive_since),
replication_slot_inactive_timeout)));
}
+
+ if (s->data.invalidated == RS_INVAL_XID_AGE)
+ {
+ Assert(TransactionIdIsValid(s->data.xmin) ||
+ TransactionIdIsValid(s->data.catalog_xmin));
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("The slot's xmin %u or catalog_xmin %u has reached the age %d specified by \"replication_slot_xid_age\".",
+ s->data.xmin,
+ s->data.catalog_xmin,
+ replication_slot_xid_age)));
+ }
}
if (!released_lock)
@@ -1546,7 +1585,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
TransactionId snapshotConflictHorizon,
- TimestampTz inactive_since)
+ TimestampTz inactive_since,
+ TransactionId xmin,
+ TransactionId catalog_xmin)
{
StringInfoData err_detail;
bool hint = false;
@@ -1583,6 +1624,20 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
timestamptz_to_str(inactive_since),
replication_slot_inactive_timeout);
break;
+ case RS_INVAL_XID_AGE:
+ Assert(TransactionIdIsValid(xmin) ||
+ TransactionIdIsValid(catalog_xmin));
+
+ if (TransactionIdIsValid(xmin))
+ appendStringInfo(&err_detail, _("The slot's xmin %u has reached the age %d specified by \"replication_slot_xid_age\"."),
+ xmin,
+ replication_slot_xid_age);
+ else if (TransactionIdIsValid(catalog_xmin))
+ appendStringInfo(&err_detail, _("The slot's catalog_xmin %u has reached the age %d specified by \"replication_slot_xid_age\"."),
+ catalog_xmin,
+ replication_slot_xid_age);
+
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1627,6 +1682,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
TimestampTz inactive_since = 0;
+ TransactionId aged_xmin = InvalidTransactionId;
+ TransactionId aged_catalog_xmin = InvalidTransactionId;
for (;;)
{
@@ -1743,6 +1800,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
Assert(s->active_pid == 0);
}
break;
+ case RS_INVAL_XID_AGE:
+ if (ReplicationSlotIsXIDAged(s, &aged_xmin, &aged_catalog_xmin))
+ {
+ Assert(TransactionIdIsValid(aged_xmin) ||
+ TransactionIdIsValid(aged_catalog_xmin));
+
+ invalidation_cause = cause;
+ break;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1831,7 +1898,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon,
- inactive_since);
+ inactive_since, aged_xmin,
+ aged_catalog_xmin);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1878,7 +1946,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon,
- inactive_since);
+ inactive_since, aged_xmin,
+ aged_catalog_xmin);
/* done with this slot for now */
break;
@@ -1902,6 +1971,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -2033,14 +2103,20 @@ CheckPointReplicationSlots(bool is_shutdown)
*
* - Avoid saving slot info to disk two times for each invalidated slot.
*
- * XXX: Should we move inactive_timeout invalidation check closer to
- * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ * XXX: Should we move inactive_timeout and xid_aged invalidation checks
+ * closer to wal_removed in CreateCheckPoint and CreateRestartPoint?
*/
invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
0,
InvalidOid,
InvalidTransactionId);
+ if (!invalidated)
+ invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+
if (invalidated)
{
/*
@@ -2052,6 +2128,63 @@ CheckPointReplicationSlots(bool is_shutdown)
}
}
+/*
+ * Returns true if the given replication slot's xmin or catalog_xmin age is
+ * more than replication_slot_xid_age.
+ *
+ * Note that the caller must hold the replication slot's spinlock to avoid
+ * race conditions while this function reads xmin and catalog_xmin.
+ */
+static bool
+ReplicationSlotIsXIDAged(ReplicationSlot *slot, TransactionId *xmin,
+ TransactionId *catalog_xmin)
+{
+ TransactionId cutoff;
+ TransactionId curr;
+
+ if (replication_slot_xid_age == 0)
+ return false;
+
+ curr = ReadNextTransactionId();
+
+ /*
+ * Replication slot's xmin and catalog_xmin can never be larger than the
+ * current transaction id even in the case of transaction ID wraparound.
+ */
+ Assert(slot->data.xmin <= curr);
+ Assert(slot->data.catalog_xmin <= curr);
+
+ /*
+ * The cutoff is how far back from the current transaction ID a slot's
+ * horizon may be before it exceeds the configured age. We then check
+ * whether the xmin or catalog_xmin precedes the cutoff; if so, return
+ * true, otherwise false.
+ */
+ cutoff = curr - replication_slot_xid_age;
+
+ if (!TransactionIdIsNormal(cutoff))
+ cutoff = FirstNormalTransactionId;
+
+ *xmin = InvalidTransactionId;
+ *catalog_xmin = InvalidTransactionId;
+
+ if (TransactionIdIsNormal(slot->data.xmin) &&
+ TransactionIdPrecedesOrEquals(slot->data.xmin, cutoff))
+ {
+ *xmin = slot->data.xmin;
+ return true;
+ }
+
+ if (TransactionIdIsNormal(slot->data.catalog_xmin) &&
+ TransactionIdPrecedesOrEquals(slot->data.catalog_xmin, cutoff))
+ {
+ *catalog_xmin = slot->data.catalog_xmin;
+ return true;
+ }
+
+ return false;
+}
+
/*
* Load all replication slots from disk into memory at server startup. This
* needs to be run before we start crash recovery.
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 4990e73c97..ca210c6bf9 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2973,6 +2973,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &replication_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 535fb07385..f04771d65c 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -336,6 +336,7 @@
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
+#replication_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 56d20e1a78..e757b836c5 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -55,6 +55,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* inactive slot timeout has occurred */
RS_INVAL_INACTIVE_TIMEOUT,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -233,6 +235,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
extern PGDLLIMPORT int replication_slot_inactive_timeout;
+extern PGDLLIMPORT int replication_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index 4663019c16..da05350df4 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -89,7 +89,7 @@ $primary->reload;
# that nobody has acquired that slot yet, so due to
# replication_slot_inactive_timeout setting above it must get invalidated.
wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart,
- $inactive_timeout);
+ $inactive_timeout, 'inactive_timeout');
# Set timeout on the standby also to check the synced slots don't get
# invalidated due to timeout on the standby.
@@ -129,7 +129,7 @@ $standby1->stop;
# Wait for the standby's replication slot to become inactive
wait_for_slot_invalidation($primary, 'sb1_slot', $logstart,
- $inactive_timeout);
+ $inactive_timeout, 'inactive_timeout');
# Testcase end: Invalidate streaming standby's slot as well as logical failover
# slot on primary due to replication_slot_inactive_timeout. Also, check the
@@ -197,15 +197,280 @@ $subscriber->stop;
# Wait for the replication slot to become inactive and then invalidated due to
# timeout.
wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
- $inactive_timeout);
+ $inactive_timeout, 'inactive_timeout');
# Testcase end: Invalidate logical subscriber's slot due to
# replication_slot_inactive_timeout.
# =============================================================================
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot due to replication_slot_xid_age
+# GUC.
+
+# Prepare for the next test
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby2->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb2_slot', immediately_reserve := true);
+]);
+
+$standby2->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NOT NULL AND catalog_xmin IS NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb2_slot';
+]) or die "Timed out waiting for slot sb2_slot xmin to advance";
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop the standby so that the replication slot's xmin on the primary ages
+$standby2->stop;
+
+$logstart = -s $primary->logfile;
+
+# Do some work to advance xids on primary
+advance_xids($primary, 'tab_int');
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+wait_for_slot_invalidation($primary, 'sb2_slot', $logstart, 0, 'xid_aged');
+
+# Testcase end: Invalidate streaming standby's slot due to replication_slot_xid_age
+# GUC.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to
+# replication_slot_xid_age GUC.
+
+$publisher = $primary;
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 500;
+]);
+$publisher->reload;
+
+$subscriber->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+));
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl2 (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl2 (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl2 VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+$publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres',
+ "CREATE PUBLICATION pub2 FOR TABLE test_tbl2");
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub2 WITH (slot_name = 'lsub2_slot')"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub2');
+
+$result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl2");
+
+is($result, qq(5), "check initial copy was done");
+
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NULL AND catalog_xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'lsub2_slot';
+]) or die "Timed out waiting for slot lsub2_slot catalog_xmin to advance";
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Do some work to advance xids on publisher
+advance_xids($publisher, 'test_tbl2');
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+wait_for_slot_invalidation($publisher, 'lsub2_slot', $logstart, 0,
+ 'xid_aged');
+
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_xid_age GUC.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical slot on standby that's being synced from
+# the primary due to replication_slot_xid_age GUC.
+
+$publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 0;
+]);
+$publisher->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby3 = PostgreSQL::Test::Cluster->new('standby3');
+$standby3->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+$standby3->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb3_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb3_slot', immediately_reserve := true);
+]);
+
+$standby3->start;
+
+my $standby3_logstart = -s $standby3->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby3);
+
+$subscriber->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+));
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl3 (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl3 (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl3 VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+$publisher->safe_psql('postgres',
+ "CREATE PUBLICATION pub3 FOR TABLE test_tbl3");
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub3 CONNECTION '$publisher_connstr' PUBLICATION pub3 WITH (slot_name = 'lsub3_sync_slot', failover = true)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub3');
+
+$result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl3");
+
+is($result, qq(5), "check initial copy was done");
+
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NULL AND catalog_xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'lsub3_sync_slot';
+])
+ or die "Timed out waiting for slot lsub3_sync_slot catalog_xmin to advance";
+
+# Synchronize the primary server slots to the standby
+$standby3->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced' and has got catalog_xmin from the primary.
+is( $standby3->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub3_sync_slot' AND synced AND NOT temporary AND
+ xmin IS NULL AND catalog_xmin IS NOT NULL;}
+ ),
+ "t",
+ 'logical slot has synced as true on standby');
+
+my $primary_catalog_xmin = $primary->safe_psql('postgres',
+ "SELECT catalog_xmin FROM pg_replication_slots WHERE slot_name = 'lsub3_sync_slot' AND catalog_xmin IS NOT NULL;"
+);
+
+my $standby3_catalog_xmin = $standby3->safe_psql('postgres',
+ "SELECT catalog_xmin FROM pg_replication_slots WHERE slot_name = 'lsub3_sync_slot' AND catalog_xmin IS NOT NULL;"
+);
+
+is($primary_catalog_xmin, $standby3_catalog_xmin,
+ "check catalog_xmin is the same for the primary slot and the synced slot");
+
+# Enable XID age based invalidation on the standby. Note that we disabled it
+# on the primary to check whether the invalidation occurs for the synced slot
+# on the standby.
+$standby3->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 500;
+]);
+$standby3->reload;
+
+$logstart = -s $standby3->logfile;
+
+# Do some work to advance xids on primary
+advance_xids($primary, 'test_tbl3');
+
+# Wait for standby to catch up with the above work
+$primary->wait_for_catchup($standby3);
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+wait_for_slot_invalidation($standby3, 'lsub3_sync_slot', $logstart, 0,
+ 'xid_aged');
+
+# Note that the replication slot on the primary is still active
+$result = $primary->safe_psql('postgres',
+ "SELECT COUNT(slot_name) = 1 FROM pg_replication_slots WHERE slot_name = 'lsub3_sync_slot' AND invalidation_reason IS NULL;"
+);
+
+is($result, 't', "check lsub3_sync_slot is still active on primary");
+
+# Testcase end: Invalidate logical slot on standby that's being synced from
+# the primary due to replication_slot_xid_age GUC.
+# =============================================================================
+
sub wait_for_slot_invalidation
{
- my ($node, $slot_name, $offset, $inactive_timeout) = @_;
+ my ($node, $slot_name, $offset, $inactive_timeout, $reason) = @_;
my $name = $node->name;
# Wait for the replication slot to become inactive
@@ -238,7 +503,7 @@ sub wait_for_slot_invalidation
'postgres', qq[
SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
WHERE slot_name = '$slot_name' AND
- invalidation_reason = 'inactive_timeout';
+ invalidation_reason = '$reason';
])
or die
"Timed out while waiting for inactive slot $slot_name to be invalidated on node $name";
@@ -283,4 +548,25 @@ sub check_for_slot_invalidation_in_server_log
);
}
+# Do some work for advancing xids on a given node
+sub advance_xids
+{
+ my ($node, $table_name) = @_;
+
+ $node->safe_psql(
+ 'postgres', qq[
+ do \$\$
+ begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into $table_name values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+ end\$\$;
+ ]);
+}
+
done_testing();
--
2.34.1
On Mon, Jun 17, 2024 at 05:55:04PM +0530, Bharath Rupireddy wrote:
Here are my thoughts on when to do the XID age invalidation. In all
the patches sent so far, the XID age invalidation happens in two
places - one during the slot acquisition, and another during the
checkpoint. The suggestion is to do it during vacuum (manual
and auto) so that, even if a checkpoint isn't happening in the
database for whatever reason, a vacuum command or autovacuum can
invalidate the slots whose XID is aged.
+1. IMHO this is a principled choice. The similar max_slot_wal_keep_size
parameter is considered where it arguably matters most: when we are trying
to remove/recycle WAL segments. Since this parameter is intended to
prevent the server from running out of space, it makes sense that we'd
apply it at the point where we are trying to free up space. The proposed
max_slot_xid_age parameter is intended to prevent the server from running
out of transaction IDs, so it follows that we'd apply it at the point where
we reclaim them, which happens to be vacuum.
An idea is to check for XID age based invalidation for all the slots
in ComputeXidHorizons() before it reads replication_slot_xmin and
replication_slot_catalog_xmin, and obviously before the proc array
lock is acquired. A potential problem with this approach is that the
invalidation check can become too aggressive as XID horizons are
computed from many places.

Another idea is to check for XID age based invalidation for all the
slots in higher levels than ComputeXidHorizons(), for example in
vacuum() which is an entry point for both vacuum command and
autovacuum. This approach seems similar to vacuum_failsafe_age GUC
which checks each relation for the failsafe age before vacuum gets
triggered on it.
I don't presently have any strong opinion on where this logic should go,
but in general, I think we should only invalidate slots if invalidating
them would allow us to advance the vacuum cutoff. If the cutoff is held
back by something else, I don't see a point in invalidating slots because
we'll just be breaking replication in return for no additional reclaimed
transaction IDs.
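For illustration only (nothing in the patch set does this), a rough query
along these lines can show whether the slots' horizons are in fact the
oldest ones around, i.e. whether invalidating them would let the cutoff
advance at all; it uses only existing views (pg_replication_slots,
pg_stat_activity):

    -- Ages of each slot's horizons vs. the oldest snapshot held by any backend.
    SELECT slot_name,
           age(xmin)         AS xmin_age,
           age(catalog_xmin) AS catalog_xmin_age,
           (SELECT max(age(backend_xmin)) FROM pg_stat_activity) AS oldest_backend_xmin_age
    FROM pg_replication_slots
    ORDER BY greatest(age(xmin), age(catalog_xmin)) DESC NULLS LAST;

If something else (say a long-running transaction) is older than every slot,
invalidating the slots buys nothing, which matches the point above.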
--
nathan
Hi,
On Mon, Jun 17, 2024 at 5:55 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Here are my thoughts on when to do the XID age invalidation. In all
the patches sent so far, the XID age invalidation happens in two
places - one during the slot acquisition, and another during the
checkpoint. The suggestion is to do it during vacuum (manual
and auto) so that, even if a checkpoint isn't happening in the
database for whatever reason, a vacuum command or autovacuum can
invalidate the slots whose XID is aged.

An idea is to check for XID age based invalidation for all the
in ComputeXidHorizons() before it reads replication_slot_xmin and
replication_slot_catalog_xmin, and obviously before the proc array
lock is acquired. A potential problem with this approach is that the
invalidation check can become too aggressive as XID horizons are
computed from many places.

Another idea is to check for XID age based invalidation for all the
slots in higher levels than ComputeXidHorizons(), for example in
vacuum() which is an entry point for both vacuum command and
autovacuum. This approach seems similar to vacuum_failsafe_age GUC
which checks each relation for the failsafe age before vacuum gets
triggered on it.
I am attaching the patches implementing the idea of invalidating
replication slots during vacuum when the current slot xmin limits
(procArray->replication_slot_xmin and
procArray->replication_slot_catalog_xmin) are aged as per the new XID
age GUC. When either of these limits is aged, there must be at least
one replication slot that is aged, because the xmin limits, after all,
are the minimum of the xmin or catalog_xmin of all replication slots.
In this approach, the new XID age GUC will help vacuum when needed,
because the current slot xmin limits are recalculated after
invalidating replication slots that have been holding back xmins for
longer than the age. The code is placed in vacuum(), which is common
to both the vacuum command and autovacuum, and it runs only once per
vacuum cycle so as not to be too aggressive in invalidating.
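To make the intended behavior concrete, here is a rough usage sketch (the
threshold is deliberately small and made up; the GUC name and the
'xid_aged' invalidation_reason come from the attached patches):

    ALTER SYSTEM SET replication_slot_xid_age = 1000000;
    SELECT pg_reload_conf();

    -- After enough XIDs have been consumed, a manual VACUUM (or the next
    -- autovacuum) is expected to invalidate the aged slots and then
    -- recompute the slot xmin limits before computing its cutoffs.
    VACUUM;

    SELECT slot_name, age(xmin) AS xmin_age,
           age(catalog_xmin) AS catalog_xmin_age, invalidation_reason
    FROM pg_replication_slots
    WHERE invalidation_reason = 'xid_aged';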
However, there might be some concerns with this approach, such as the following:
1) Adding more code to vacuum might not be acceptable
2) What if invalidation of replication slots emits an error? Will it
block vacuum forever? Currently, InvalidateObsoleteReplicationSlots()
is also called as part of the checkpoint, and emitting ERRORs from
within it is already avoided. Therefore, there is no concern here for
now.
3) What if there are many replication slots to be invalidated? Will it
delay the vacuum? If yes, by how much? <<TODO>>
4) Will the invalidation based on just the current replication slot xmin
limits suffice irrespective of vacuum cutoffs? IOW, what if the replication
slots are invalidated but vacuum isn't going to do any work because the
vacuum cutoffs are not yet met? Is the invalidation work wasted
here?
5) Is it okay to take the proc array lock one more time, once every
vacuum cycle, to get the current replication slot xmin limits via
ProcArrayGetReplicationSlotXmin()? <<TODO>>
6) The VACUUM command can't be run on the standby while it is in recovery.
So, to help invalidate replication slots on the standby, I have for now
let the checkpointer also do the XID age based invalidation (see the
sketch after this list). I know invalidating in both the checkpointer
and vacuum may not be a great idea, but I'm open to thoughts.
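A rough sketch of what that means operationally on a standby (illustration
only; the threshold is made up, the GUC and invalidation_reason value are
from the patches):

    -- VACUUM is unavailable in recovery, so under this proposal the
    -- checkpointer is what applies replication_slot_xid_age on a standby.
    ALTER SYSTEM SET replication_slot_xid_age = 1000000;
    SELECT pg_reload_conf();
    CHECKPOINT;

    SELECT slot_name, invalidation_reason
    FROM pg_replication_slots
    WHERE invalidation_reason = 'xid_aged';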
Following are some of the alternative approaches which IMHO don't help
vacuum when needed:
a) Let the checkpointer do the XID age based invalidation, and call out
in the documentation that if a checkpoint doesn't happen, the
new GUC doesn't help even if vacuum is run. This has been the
approach up to the v40 patch.
b) The checkpointer and/or other backends add an autovacuum work item via
AutoVacuumRequestWork(), and autovacuum, when it gets to it, will
invalidate the replication slots. But what to do for the vacuum
command here?
Please find the attached v41 patches implementing the idea of vacuum
doing the invalidation.
Thoughts?
Thanks to Sawada-san for a detailed off-list discussion.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v41-0001-Add-inactive_timeout-based-replication-slot-inva.patchapplication/x-patch; name=v41-0001-Add-inactive_timeout-based-replication-slot-inva.patchDownload
From dc1eeed06377c11b139724702370ba47cd5d5be3 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sun, 23 Jun 2024 14:07:25 +0000
Subject: [PATCH v41 1/2] Add inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky. Because the amount of WAL a
customer generates, and their allocated storage will vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set a timeout of say 1
or 2 or 3 days, after which the inactive slots get invalidated.
To achieve the above, postgres introduces a GUC allowing users to
set an inactive timeout. The replication slots that are inactive
for longer than the specified amount of time get invalidated.
The invalidation check happens at various locations so that it is
performed as promptly as possible; these locations include the following:
- Whenever the slot is acquired; the slot acquisition errors
out if the slot is invalidated.
- During checkpoint
Note that this new invalidation mechanism won't kick in for the
slots that are currently being synced from the primary to the
standby, because such synced slots are typically considered not
active (for them to be later considered as inactive) as they don't
perform logical decoding to produce the changes.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CA%2BTgmoZTbaaEjSZUG1FL0mzxAdN3qmXksO3O9_PZhEuXTkVnRQ%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/202403260841.5jcv7ihniccy%40alvherre.pgsql
---
doc/src/sgml/config.sgml | 33 ++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 11 +-
src/backend/replication/slot.c | 188 +++++++++++-
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 6 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 286 ++++++++++++++++++
13 files changed, 535 insertions(+), 20 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0c7a9082c5..5e7a81a1fd 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4554,6 +4554,39 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidates replication slots that are inactive for longer than the
+ specified amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is the default) disables
+ the timeout mechanism. This parameter can only be set in
+ the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+
+ <para>
+ This invalidation check happens either when the slot is acquired
+ for use or during a checkpoint. The time since the slot became
+ inactive is known from its
+ <structfield>inactive_since</structfield> value, which is used to
+ measure the timeout.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>).
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 8c18bea902..4867af1b61 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2580,6 +2580,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for the duration specified by
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 56d3fb5d0e..a5967d400f 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -448,7 +448,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -651,6 +651,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
" name slot \"%s\" already exists on the standby",
remote_slot->name));
+ /*
+ * Skip the sync if the local slot is already invalidated. We do this
+ * beforehand to avoid slot acquire and release.
+ */
+ if (slot->data.invalidated != RS_INVAL_NONE)
+ return false;
+
/*
* The slot has been synchronized before.
*
@@ -667,7 +674,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 564cfee127..e80872f27b 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -159,6 +161,13 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
+ ReplicationSlot *s,
+ XLogRecPtr oldestLSN,
+ Oid dboid,
+ TransactionId snapshotConflictHorizon,
+ bool *invalidated);
+
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
@@ -535,12 +544,17 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if check_for_invalidation is true and the slot gets
+ * invalidated now or has been invalidated previously.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation)
{
ReplicationSlot *s;
int active_pid;
+ bool released_lock = false;
Assert(name != NULL);
@@ -615,6 +629,57 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ /*
+ * Check if the acquired slot needs to be invalidated. And, error out if
+ * it gets invalidated now or has been invalidated previously, because
+ * there's no use in acquiring the invalidated slot.
+ *
+ * XXX: Currently we check for inactive_timeout invalidation here. We
+ * might need to check for other invalidations too.
+ */
+ if (check_for_invalidation)
+ {
+ bool invalidated = false;
+
+ released_lock = InvalidatePossiblyObsoleteSlot(RS_INVAL_INACTIVE_TIMEOUT,
+ s, 0, InvalidOid,
+ InvalidTransactionId,
+ &invalidated);
+
+ /*
+ * If the slot has been invalidated, recalculate the resource limits.
+ */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
+
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ /*
+ * Release the lock if it has not been released yet, to keep the
+ * cleanup path on error happy.
+ */
+ if (!released_lock)
+ LWLockRelease(ReplicationSlotControlLock);
+
+ Assert(s->inactive_since > 0);
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive since %s for more than %d seconds specified by \"replication_slot_inactive_timeout\".",
+ timestamptz_to_str(s->inactive_since),
+ replication_slot_inactive_timeout)));
+ }
+ }
+
+ if (!released_lock)
+ LWLockRelease(ReplicationSlotControlLock);
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +850,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -808,7 +873,7 @@ ReplicationSlotAlter(const char *name, bool failover)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1480,7 +1545,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1510,6 +1576,13 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s for more than %d seconds specified by \"replication_slot_inactive_timeout\"."),
+ timestamptz_to_str(inactive_since),
+ replication_slot_inactive_timeout);
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1553,6 +1626,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1560,6 +1634,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1570,6 +1645,18 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT &&
+ (replication_slot_inactive_timeout > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced)))
+ {
+ /*
+ * We get the current time beforehand to avoid a system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1623,6 +1710,39 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ /*
+ * Quick exit if inactive timeout invalidation mechanism
+ * is disabled or slot is currently being used or the slot
+ * on standby is currently being synced from the primary.
+ *
+ * Note that we don't invalidate synced slots because they are
+ * typically considered not active, as they don't perform logical
+ * decoding to produce the changes.
+ */
+ if (replication_slot_inactive_timeout == 0 ||
+ s->inactive_since == 0 ||
+ (RecoveryInProgress() && s->data.synced))
+ break;
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+
+ /*
+ * Invalidation due to inactive timeout implies that
+ * no one is using the slot.
+ */
+ Assert(s->active_pid == 0);
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1648,11 +1768,14 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it invalidated
+ * immediately; if it is already ours, just mark it invalidated.
+ * Otherwise we'll signal the owning process, below, and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot != NULL &&
+ MyReplicationSlot == s &&
+ active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1707,7 +1830,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1753,7 +1877,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1776,6 +1901,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1828,7 +1954,7 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk and invalidate slots.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1839,6 +1965,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated = false;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -1886,6 +2013,43 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ elog(DEBUG1, "performing replication slot invalidation");
+
+ /*
+ * Note that we will make another pass over replication slots for
+ * invalidations to keep the code simple. The assumption here is that the
+ * traversal over replication slots isn't that costly even with hundreds
+ * of replication slots. If it ever turns out that this assumption is
+ * wrong, we might have to put the invalidation check logic in the above
+ * loop; for that, we might have to do the following:
+ *
+ * - Acquire ControlLock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated slot.
+ *
+ * XXX: Should we move the inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ */
+ invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+
+ if (invalidated)
+ {
+ /*
+ * If any slots have been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index dd6c1d5a7e..9ad3e55704 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -539,7 +539,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index c623b07cf0..1741e09259 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -846,7 +846,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1459,7 +1459,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 46c258be28..4990e73c97 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2961,6 +2961,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e0567de219..535fb07385 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -335,6 +335,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 1bc80960ef..56d20e1a78 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -230,6 +232,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -245,7 +248,8 @@ extern void ReplicationSlotDrop(const char *name, bool nowait);
extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, bool failover);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..b4c5ce2875 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -10,6 +10,7 @@ tests += {
'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
},
'tests': [
+ 't/050_invalidate_slots.pl',
't/001_stream_rep.pl',
't/002_archiving.pl',
't/003_recovery_targets.pl',
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..4663019c16
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,286 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot as well as logical
+# failover slot on primary due to replication_slot_inactive_timeout. Also,
+# check the logical failover slot synced on to the standby doesn't invalidate
+# the slot on its own, but gets the invalidated state from the remote slot on
+# the primary.
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoint during the test, otherwise, the test can get unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr_1 = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb1_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('lsub1_sync_slot', 'test_decoding', false, false, true);}
+);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+my $standby1_logstart = -s $standby1->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Synchronize the primary server slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced'.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot lsub1_sync_slot has synced as true on standby');
+
+my $logstart = -s $primary->logfile;
+my $inactive_timeout = 2;
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$primary->reload;
+
+# Wait for the logical failover slot to become inactive on the primary. Note
+# that nobody has acquired that slot yet, so due to
+# replication_slot_inactive_timeout setting above it must get invalidated.
+wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart,
+ $inactive_timeout);
+
+# Set timeout on the standby also to check the synced slots don't get
+# invalidated due to timeout on the standby.
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$standby1->reload;
+
+# Now, sync the logical failover slot from the remote slot on the primary.
+# Note that the remote slot has already been invalidated due to inactive
+# timeout. Now, the standby must also see it as invalidated.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Wait for the inactive replication slot to be invalidated.
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for replication slot lsub1_sync_slot invalidation to be synced on standby";
+
+# A synced slot mustn't get invalidated on the standby even after a checkpoint;
+# it must sync the invalidation from the primary. So, we must not see the slot's
+# invalidation message in the server log.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"lsub1_sync_slot\"",
+ $standby1_logstart),
+ 'check that synced slot lsub1_sync_slot has not been invalidated on the standby'
+);
+
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive
+wait_for_slot_invalidation($primary, 'sb1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end: Invalidate streaming standby's slot as well as logical failover
+# slot on primary due to replication_slot_inactive_timeout. Also, check the
+# logical failover slot synced on to the standby doesn't invalidate the slot on
+# its own, but gets the invalidated state from the remote slot on the primary.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+
+my $publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+# =============================================================================
+
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset, $inactive_timeout) = @_;
+ my $name = $node->name;
+
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for slot $slot_name to become inactive on node $name";
+
+ # Wait for the replication slot info to be updated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for info of slot $slot_name to be updated on node $name";
+
+ # Sleep at least $inactive_timeout duration to avoid multiple checkpoints
+ # for the slot to get invalidated.
+ sleep($inactive_timeout);
+
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+
+ # Wait for the inactive replication slot to be invalidated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for inactive slot $slot_name to be invalidated on node $name";
+
+ # Check that the invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot_name', '0/1');
+ ]);
+
+ ok( $stderr =~
+ /can no longer get changes from replication slot "$slot_name"/,
+ "detected error upon trying to acquire invalidated slot $slot_name on node $name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot_name on node $name";
+}
+
+# Check for invalidation of slot in server log
+sub check_for_slot_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $name = $node->name;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"",
+ $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot_name invalidation has been logged on node $name"
+ );
+}
+
+done_testing();
--
2.34.1
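For orientation, the flow that the TAP test above exercises boils down to roughly the following psql session on a server with this patch applied (the slot name is only illustrative; the GUC, the test_decoding plugin, the 2s value, and the 'inactive_timeout' reason all come from the patch and its test):
-- Enable timeout-based invalidation; 2s mirrors the value used in the test.
ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
SELECT pg_reload_conf();
-- Create a logical slot and leave it unused so it is tracked as inactive.
SELECT pg_create_logical_replication_slot('demo_slot', 'test_decoding');
-- Once the timeout has elapsed, the next checkpoint runs the invalidation check.
SELECT pg_sleep(3);
CHECKPOINT;
-- With the patch, the slot now reports the new invalidation reason.
SELECT slot_name, inactive_since, invalidation_reason
FROM pg_replication_slots
WHERE slot_name = 'demo_slot';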
Attachment: v41-0002-Add-XID-age-based-replication-slot-invalidation.patch (application/x-patch)
From 3e4e9c8965d6109f71318a337c1f1e10f2ab67b6 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sun, 23 Jun 2024 16:18:21 +0000
Subject: [PATCH v41 2/2] Add XID age based replication slot invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set an XID age (age of
slot's xmin or catalog_xmin) of say 1 or 1.5 billion, after which
the slots get invalidated.
To achieve the above, postgres introduces a GUC allowing users to
set a slot XID age. Replication slots whose xmin or catalog_xmin
has reached the age specified by this setting get invalidated.
The invalidation check happens at various locations so that it is
performed as promptly as possible; these locations include the following:
- Whenever the slot is acquired; slot acquisition errors out if the
slot has been invalidated.
- During checkpoint
- During vacuum (both command-based and autovacuum)
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/20240327150557.GA3994937%40nathanxps13
Discussion: https://www.postgresql.org/message-id/CA%2BTgmoaRECcnyqxAxUhP5dk2S4HX%3DpGh-p-PkA3uc%2BjG_9hiMw%40mail.gmail.com
---
doc/src/sgml/config.sgml | 26 ++
doc/src/sgml/system-views.sgml | 8 +
src/backend/commands/vacuum.c | 80 +++++
src/backend/replication/slot.c | 151 +++++++-
src/backend/utils/misc/guc_tables.c | 10 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 321 +++++++++++++++++-
8 files changed, 583 insertions(+), 17 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5e7a81a1fd..20d800ce0c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4587,6 +4587,32 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-xid-age" xreflabel="replication_slot_xid_age">
+ <term><varname>replication_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (which is the default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ This invalidation check happens when the slot is acquired for use,
+ during vacuum, or during checkpoint.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 4867af1b61..0490f9f156 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2587,6 +2587,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>xid_aged</literal> means that the slot's
+ <literal>xmin</literal> or <literal>catalog_xmin</literal>
+ has reached the age specified by
+ <xref linkend="guc-replication-slot-xid-age"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 48f8eab202..9eeb42ac27 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -47,6 +47,7 @@
#include "postmaster/autovacuum.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/interrupt.h"
+#include "replication/slot.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
#include "storage/pmsignal.h"
@@ -116,6 +117,7 @@ static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
static double compute_parallel_delay(void);
static VacOptValue get_vacoptval_from_boolean(DefElem *def);
static bool vac_tid_reaped(ItemPointer itemptr, void *state);
+static void try_replication_slot_invalidation(void);
/*
* GUC check function to ensure GUC value specified is within the allowable
@@ -452,6 +454,75 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
MemoryContextDelete(vac_context);
}
+/*
+ * Try invalidating replication slots based on current replication slot xmin
+ * limits once every vacuum cycle.
+ */
+static void
+try_replication_slot_invalidation(void)
+{
+ TransactionId min_slot_xmin;
+ TransactionId min_slot_catalog_xmin;
+ bool can_invalidate = false;
+ TransactionId cutoff;
+ TransactionId curr;
+
+ curr = ReadNextTransactionId();
+
+ /*
+ * The cutoff is the transaction ID that lies replication_slot_xid_age
+ * transactions behind the current one. If the slots' oldest xmin or
+ * catalog_xmin precedes or equals the cutoff, at least one slot has
+ * reached the configured age and is a candidate for invalidation.
+ */
+ cutoff = curr - replication_slot_xid_age;
+
+ if (!TransactionIdIsNormal(cutoff))
+ cutoff = FirstNormalTransactionId;
+
+ ProcArrayGetReplicationSlotXmin(&min_slot_xmin, &min_slot_catalog_xmin);
+
+ /*
+ * Current replication slot xmin limits can never be larger than the
+ * current transaction id even in the case of transaction ID wraparound.
+ */
+ Assert(min_slot_xmin <= curr);
+ Assert(min_slot_catalog_xmin <= curr);
+
+ if (TransactionIdIsNormal(min_slot_xmin) &&
+ TransactionIdPrecedesOrEquals(min_slot_xmin, cutoff))
+ can_invalidate = true;
+ else if (TransactionIdIsNormal(min_slot_catalog_xmin) &&
+ TransactionIdPrecedesOrEquals(min_slot_catalog_xmin, cutoff))
+ can_invalidate = true;
+
+ if (can_invalidate)
+ {
+ bool invalidated = false;
+
+ /*
+ * Note that InvalidateObsoleteReplicationSlots is also called as part
+ * of CHECKPOINT, and emitting ERRORs from within is avoided already.
+ * Therefore, there is no concern here that any ERROR from
+ * invalidating replication slots blocks VACUUM.
+ */
+ invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+
+ if (invalidated)
+ {
+ /*
+ * If any slots have been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
+ }
+}
+
/*
* Internal entry point for autovacuum and the VACUUM / ANALYZE commands.
*
@@ -483,6 +554,7 @@ vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
const char *stmttype;
volatile bool in_outer_xact,
use_own_xacts;
+ static bool first_time = true;
Assert(params != NULL);
@@ -594,6 +666,14 @@ vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
CommitTransactionCommand();
}
+ if (params->options & VACOPT_VACUUM &&
+ first_time &&
+ replication_slot_xid_age > 0)
+ {
+ try_replication_slot_invalidation();
+ first_time = false;
+ }
+
/* Turn vacuum cost accounting on or off, and set/clear in_vacuum */
PG_TRY();
{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index e80872f27b..79ac412d8e 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -108,10 +108,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
[RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
+ [RS_INVAL_XID_AGE] = "xid_aged",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
+#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -142,6 +143,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
int replication_slot_inactive_timeout = 0;
+int replication_slot_xid_age = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -160,6 +162,9 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool ReplicationSlotIsXIDAged(ReplicationSlot *slot,
+ TransactionId *xmin,
+ TransactionId *catalog_xmin);
static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReplicationSlot *s,
@@ -636,8 +641,8 @@ retry:
* it gets invalidated now or has been invalidated previously, because
* there's no use in acquiring the invalidated slot.
*
- * XXX: Currently we check for inactive_timeout invalidation here. We
- * might need to check for other invalidations too.
+ * XXX: Currently we check for inactive_timeout and xid_aged invalidations
+ * here. We might need to check for other invalidations too.
*/
if (check_for_invalidation)
{
@@ -648,6 +653,22 @@ retry:
InvalidTransactionId,
&invalidated);
+ if (!invalidated && released_lock)
+ {
+ /* The slot is still ours */
+ Assert(s->active_pid == MyProcPid);
+
+ /* Reacquire the ControlLock */
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+ released_lock = false;
+ }
+
+ if (!invalidated)
+ released_lock = InvalidatePossiblyObsoleteSlot(RS_INVAL_XID_AGE,
+ s, 0, InvalidOid,
+ InvalidTransactionId,
+ &invalidated);
+
/*
* If the slot has been invalidated, recalculate the resource limits.
*/
@@ -657,7 +678,8 @@ retry:
ReplicationSlotsComputeRequiredLSN();
}
- if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT ||
+ s->data.invalidated == RS_INVAL_XID_AGE)
{
/*
* Release the lock if it's not yet to keep the cleanup path on
@@ -665,7 +687,10 @@ retry:
*/
if (!released_lock)
LWLockRelease(ReplicationSlotControlLock);
+ }
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
Assert(s->inactive_since > 0);
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -675,6 +700,20 @@ retry:
timestamptz_to_str(s->inactive_since),
replication_slot_inactive_timeout)));
}
+
+ if (s->data.invalidated == RS_INVAL_XID_AGE)
+ {
+ Assert(TransactionIdIsValid(s->data.xmin) ||
+ TransactionIdIsValid(s->data.catalog_xmin));
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("The slot's xmin %u or catalog_xmin %u has reached the age %d specified by \"replication_slot_xid_age\".",
+ s->data.xmin,
+ s->data.catalog_xmin,
+ replication_slot_xid_age)));
+ }
}
if (!released_lock)
@@ -1546,7 +1585,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
TransactionId snapshotConflictHorizon,
- TimestampTz inactive_since)
+ TimestampTz inactive_since,
+ TransactionId xmin,
+ TransactionId catalog_xmin)
{
StringInfoData err_detail;
bool hint = false;
@@ -1583,6 +1624,20 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
timestamptz_to_str(inactive_since),
replication_slot_inactive_timeout);
break;
+ case RS_INVAL_XID_AGE:
+ Assert(TransactionIdIsValid(xmin) ||
+ TransactionIdIsValid(catalog_xmin));
+
+ if (TransactionIdIsValid(xmin))
+ appendStringInfo(&err_detail, _("The slot's xmin %u has reached the age %d specified by \"replication_slot_xid_age\"."),
+ xmin,
+ replication_slot_xid_age);
+ else if (TransactionIdIsValid(catalog_xmin))
+ appendStringInfo(&err_detail, _("The slot's catalog_xmin %u has reached the age %d specified by \"replication_slot_xid_age\"."),
+ catalog_xmin,
+ replication_slot_xid_age);
+
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1627,6 +1682,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
TimestampTz inactive_since = 0;
+ TransactionId aged_xmin = InvalidTransactionId;
+ TransactionId aged_catalog_xmin = InvalidTransactionId;
for (;;)
{
@@ -1743,6 +1800,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
Assert(s->active_pid == 0);
}
break;
+ case RS_INVAL_XID_AGE:
+ if (ReplicationSlotIsXIDAged(s, &aged_xmin, &aged_catalog_xmin))
+ {
+ Assert(TransactionIdIsValid(aged_xmin) ||
+ TransactionIdIsValid(aged_catalog_xmin));
+
+ invalidation_cause = cause;
+ break;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1831,7 +1898,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon,
- inactive_since);
+ inactive_since, aged_xmin,
+ aged_catalog_xmin);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1878,7 +1946,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon,
- inactive_since);
+ inactive_since, aged_xmin,
+ aged_catalog_xmin);
/* done with this slot for now */
break;
@@ -1902,6 +1971,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -2033,14 +2103,20 @@ CheckPointReplicationSlots(bool is_shutdown)
*
* - Avoid saving slot info to disk two times for each invalidated slot.
*
- * XXX: Should we move the inactive_timeout invalidation check closer to
- * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ * XXX: Should we move the inactive_timeout and xid_aged invalidation checks
+ * closer to wal_removed in CreateCheckPoint and CreateRestartPoint?
*/
invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
0,
InvalidOid,
InvalidTransactionId);
+ if (!invalidated)
+ invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+
if (invalidated)
{
/*
@@ -2052,6 +2128,63 @@ CheckPointReplicationSlots(bool is_shutdown)
}
}
+/*
+ * Returns true if the given replication slot's xmin or catalog_xmin age is
+ * more than replication_slot_xid_age.
+ *
+ * Note that the caller must hold the replication slot's spinlock to avoid
+ * race conditions while this function reads xmin and catalog_xmin.
+ */
+static bool
+ReplicationSlotIsXIDAged(ReplicationSlot *slot, TransactionId *xmin,
+ TransactionId *catalog_xmin)
+{
+ TransactionId cutoff;
+ TransactionId curr;
+
+ if (replication_slot_xid_age == 0)
+ return false;
+
+ curr = ReadNextTransactionId();
+
+ /*
+ * Replication slot's xmin and catalog_xmin can never be larger than the
+ * current transaction id even in the case of transaction ID wraparound.
+ */
+ Assert(slot->data.xmin <= curr);
+ Assert(slot->data.catalog_xmin <= curr);
+
+ /*
+ * The cutoff is the transaction ID that lies replication_slot_xid_age
+ * transactions behind the current one. If the slot's xmin or
+ * catalog_xmin precedes or equals the cutoff, the slot has reached the
+ * configured age and we return true; otherwise we return false.
+ */
+ cutoff = curr - replication_slot_xid_age;
+
+ if (!TransactionIdIsNormal(cutoff))
+ cutoff = FirstNormalTransactionId;
+
+ *xmin = InvalidTransactionId;
+ *catalog_xmin = InvalidTransactionId;
+
+ if (TransactionIdIsNormal(slot->data.xmin) &&
+ TransactionIdPrecedesOrEquals(slot->data.xmin, cutoff))
+ {
+ *xmin = slot->data.xmin;
+ return true;
+ }
+
+ if (TransactionIdIsNormal(slot->data.catalog_xmin) &&
+ TransactionIdPrecedesOrEquals(slot->data.catalog_xmin, cutoff))
+ {
+ *catalog_xmin = slot->data.catalog_xmin;
+ return true;
+ }
+
+ return false;
+}
+
/*
* Load all replication slots from disk into memory at server startup. This
* needs to be run before we start crash recovery.
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 4990e73c97..ca210c6bf9 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2973,6 +2973,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &replication_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 535fb07385..f04771d65c 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -336,6 +336,7 @@
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
+#replication_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 56d20e1a78..e757b836c5 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -55,6 +55,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* inactive slot timeout has occurred */
RS_INVAL_INACTIVE_TIMEOUT,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -233,6 +235,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *standby_slot_names;
extern PGDLLIMPORT int replication_slot_inactive_timeout;
+extern PGDLLIMPORT int replication_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index 4663019c16..18300cfeca 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -89,7 +89,7 @@ $primary->reload;
# that nobody has acquired that slot yet, so due to
# replication_slot_inactive_timeout setting above it must get invalidated.
wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart,
- $inactive_timeout);
+ $inactive_timeout, 'inactive_timeout');
# Set timeout on the standby also to check the synced slots don't get
# invalidated due to timeout on the standby.
@@ -129,7 +129,7 @@ $standby1->stop;
# Wait for the standby's replication slot to become inactive
wait_for_slot_invalidation($primary, 'sb1_slot', $logstart,
- $inactive_timeout);
+ $inactive_timeout, 'inactive_timeout');
# Testcase end: Invalidate streaming standby's slot as well as logical failover
# slot on primary due to replication_slot_inactive_timeout. Also, check the
@@ -197,15 +197,280 @@ $subscriber->stop;
# Wait for the replication slot to become inactive and then invalidated due to
# timeout.
wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
- $inactive_timeout);
+ $inactive_timeout, 'inactive_timeout');
# Testcase end: Invalidate logical subscriber's slot due to
# replication_slot_inactive_timeout.
# =============================================================================
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot due to replication_slot_xid_age
+# GUC.
+
+# Prepare for the next test
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby2->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb2_slot', immediately_reserve := true);
+]);
+
+$standby2->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NOT NULL AND catalog_xmin IS NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb2_slot';
+]) or die "Timed out waiting for slot sb2_slot xmin to advance";
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop standby to make the replication slot's xmin on primary to age
+$standby2->stop;
+
+$logstart = -s $primary->logfile;
+
+# Do some work to advance xids on primary
+advance_xids($primary, 'tab_int');
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+wait_for_slot_invalidation($primary, 'sb2_slot', $logstart, 0, 'xid_aged');
+
+# Testcase end: Invalidate streaming standby's slot due to replication_slot_xid_age
+# GUC.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to
+# replication_slot_xid_age GUC.
+
+$publisher = $primary;
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 500;
+]);
+$publisher->reload;
+
+$subscriber->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+));
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl2 (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl2 (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl2 VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+$publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres',
+ "CREATE PUBLICATION pub2 FOR TABLE test_tbl2");
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub2 WITH (slot_name = 'lsub2_slot')"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub2');
+
+$result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl2");
+
+is($result, qq(5), "check initial copy was done");
+
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NULL AND catalog_xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'lsub2_slot';
+]) or die "Timed out waiting for slot lsub2_slot catalog_xmin to advance";
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Do some work to advance xids on publisher
+advance_xids($publisher, 'test_tbl2');
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+wait_for_slot_invalidation($publisher, 'lsub2_slot', $logstart, 0,
+ 'xid_aged');
+
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_xid_age GUC.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical slot on standby that's being synced from
+# the primary due to replication_slot_xid_age GUC.
+
+$publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 0;
+]);
+$publisher->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby3 = PostgreSQL::Test::Cluster->new('standby3');
+$standby3->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+$standby3->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb3_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb3_slot', immediately_reserve := true);
+]);
+
+$standby3->start;
+
+my $standby3_logstart = -s $standby3->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby3);
+
+$subscriber->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+));
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl3 (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl3 (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl3 VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+$publisher->safe_psql('postgres',
+ "CREATE PUBLICATION pub3 FOR TABLE test_tbl3");
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub3 CONNECTION '$publisher_connstr' PUBLICATION pub3 WITH (slot_name = 'lsub3_sync_slot', failover = true)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub3');
+
+$result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl3");
+
+is($result, qq(5), "check initial copy was done");
+
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NULL AND catalog_xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'lsub3_sync_slot';
+])
+ or die "Timed out waiting for slot lsub3_sync_slot catalog_xmin to advance";
+
+# Synchronize the primary server slots to the standby
+$standby3->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced' and has got catalog_xmin from the primary.
+is( $standby3->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub3_sync_slot' AND synced AND NOT temporary AND
+ xmin IS NULL AND catalog_xmin IS NOT NULL;}
+ ),
+ "t",
+ 'logical slot has synced as true on standby');
+
+my $primary_catalog_xmin = $primary->safe_psql('postgres',
+ "SELECT catalog_xmin FROM pg_replication_slots WHERE slot_name = 'lsub3_sync_slot' AND catalog_xmin IS NOT NULL;"
+);
+
+my $standby3_catalog_xmin = $standby3->safe_psql('postgres',
+ "SELECT catalog_xmin FROM pg_replication_slots WHERE slot_name = 'lsub3_sync_slot' AND catalog_xmin IS NOT NULL;"
+);
+
+is($primary_catalog_xmin, $standby3_catalog_xmin,
+ "check catalog_xmin is the same for the primary slot and the synced slot");
+
+# Enable XID age based invalidation on the standby. Note that we disabled the
+# same on the primary to check if the invalidation occurs for synced slot on
+# the standby.
+$standby3->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 500;
+]);
+$standby3->reload;
+
+$logstart = -s $standby3->logfile;
+
+# Do some work to advance xids on primary
+advance_xids($primary, 'test_tbl3');
+
+# Wait for standby to catch up with the above work
+$primary->wait_for_catchup($standby3);
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+wait_for_slot_invalidation($standby3, 'lsub3_sync_slot', $logstart, 0,
+ 'xid_aged');
+
+# Note that the replication slot on the primary is still active
+$result = $primary->safe_psql('postgres',
+ "SELECT COUNT(slot_name) = 1 FROM pg_replication_slots WHERE slot_name = 'lsub3_sync_slot' AND invalidation_reason IS NULL;"
+);
+
+is($result, 't', "check lsub3_sync_slot is still active on primary");
+
+# Testcase end: Invalidate logical slot on standby that's being synced from
+# the primary due to replication_slot_xid_age GUC.
+# =============================================================================
+
sub wait_for_slot_invalidation
{
- my ($node, $slot_name, $offset, $inactive_timeout) = @_;
+ my ($node, $slot_name, $offset, $inactive_timeout, $reason) = @_;
my $name = $node->name;
# Wait for the replication slot to become inactive
@@ -231,14 +496,15 @@ sub wait_for_slot_invalidation
# for the slot to get invalidated.
sleep($inactive_timeout);
- check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset,
+ $reason);
# Wait for the inactive replication slot to be invalidated
$node->poll_query_until(
'postgres', qq[
SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
WHERE slot_name = '$slot_name' AND
- invalidation_reason = 'inactive_timeout';
+ invalidation_reason = '$reason';
])
or die
"Timed out while waiting for inactive slot $slot_name to be invalidated on node $name";
@@ -262,15 +528,33 @@ sub wait_for_slot_invalidation
# Check for invalidation of slot in server log
sub check_for_slot_invalidation_in_server_log
{
- my ($node, $slot_name, $offset) = @_;
+ my ($node, $slot_name, $offset, $reason) = @_;
my $name = $node->name;
my $invalidated = 0;
+ my $isrecovery =
+ $node->safe_psql('postgres', "SELECT pg_is_in_recovery()");
+
+ chomp($isrecovery);
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
- $node->safe_psql('postgres', "CHECKPOINT");
+ if ($reason eq 'xid_aged' && $isrecovery eq 'f')
+ {
+ $node->safe_psql('postgres', "VACUUM");
+ }
+ else
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ }
+
if ($node->log_contains(
"invalidating obsolete replication slot \"$slot_name\"",
+ $offset)
+ || $node->log_contains(
+ "The slot's xmin .* has reached the age .* specified by \"replication_slot_xid_age\".",
+ $offset)
+ || $node->log_contains(
+ "The slot's catalog_xmin .* has reached the age .* specified by \"replication_slot_xid_age\".",
$offset))
{
$invalidated = 1;
@@ -283,4 +567,25 @@ sub check_for_slot_invalidation_in_server_log
);
}
+# Do some work for advancing xids on a given node
+sub advance_xids
+{
+ my ($node, $table_name) = @_;
+
+ $node->safe_psql(
+ 'postgres', qq[
+ do \$\$
+ begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into $table_name values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+ end\$\$;
+ ]);
+}
+
done_testing();
--
2.34.1
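For orientation, the XID-age path that this patch and its TAP test exercise corresponds roughly to the following psql session on a patched server; age() and pg_replication_slots are stock PostgreSQL, while replication_slot_xid_age and the 'xid_aged' reason exist only with the patch, and 500 simply mirrors the value used in the test:
-- Enable XID-age-based invalidation.
ALTER SYSTEM SET replication_slot_xid_age = 500;
SELECT pg_reload_conf();
-- Watch how far each slot's xmin/catalog_xmin lags behind the next transaction ID.
SELECT slot_name, age(xmin) AS xmin_age, age(catalog_xmin) AS catalog_xmin_age
FROM pg_replication_slots;
-- Once a slot has aged past the limit, a VACUUM (or a checkpoint, e.g. on a
-- standby where VACUUM cannot run) performs the invalidation check.
VACUUM;
SELECT slot_name, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason = 'xid_aged';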
On Mon, Jun 24, 2024 at 11:30:00AM +0530, Bharath Rupireddy wrote:
6) Vacuum command can't be run on the standby in recovery. So, to help
invalidate replication slots on the standby, I have for now let the
checkpointer also do the XID age based invalidation. I know
invalidating both in checkpointer and vacuum may not be a great idea,
but I'm open to thoughts.
Hm. I hadn't considered this angle.
a) Let the checkpointer do the XID age based invalidation, and call it
out in the documentation that if the checkpoint doesn't happen, the
new GUC doesn't help even if the vacuum is run. This has been the
approach until v40 patch.
My first reaction is that this is probably okay. I guess you might run
into problems if you set max_slot_xid_age to 2B and checkpoint_timeout to 1
day, but even in that case your transaction ID usage rate would need to be
pretty high for wraparound to occur.
--
nathan
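To put a rough number on that scenario (an age limit of 2 billion and a checkpoint only once per day), the sustained XID consumption needed for a checkpoint-only check to come too late works out to about 23,000 XIDs per second for a full day; this is only a back-of-the-envelope figure, not anything measured:
-- 2 billion XIDs spread over the 86400 seconds between daily checkpoints.
SELECT round(2000000000.0 / (24 * 60 * 60)) AS xids_per_second;  -- ~23148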
On Mon, Jun 24, 2024 at 4:01 PM Bharath Rupireddy <
bharath.rupireddyforpostgres@gmail.com> wrote:
Hi,
On Mon, Jun 17, 2024 at 5:55 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Here are my thoughts on when to do the XID age invalidation. In all
the patches sent so far, the XID age invalidation happens in two
places - one during the slot acquisition, and another during the
checkpoint. As the suggestion is to do it during the vacuum (manual
and auto), so that even if the checkpoint isn't happening in the
database for whatever reasons, a vacuum command or autovacuum can
invalidate the slots whose XID is aged.
An idea is to check for XID age based invalidation for all the slots
in ComputeXidHorizons() before it reads replication_slot_xmin and
replication_slot_catalog_xmin, and obviously before the proc array
lock is acquired. A potential problem with this approach is that the
invalidation check can become too aggressive as XID horizons are
computed from many places.
Another idea is to check for XID age based invalidation for all the
slots in higher levels than ComputeXidHorizons(), for example in
vacuum() which is an entry point for both vacuum command and
autovacuum. This approach seems similar to vacuum_failsafe_age GUC
which checks each relation for the failsafe age before vacuum gets
triggered on it.
I am attaching the patches implementing the idea of invalidating
replication slots during vacuum when current slot xmin limits
(procArray->replication_slot_xmin and
procArray->replication_slot_catalog_xmin) are aged as per the new XID
age GUC. When either of these limits are aged, there must be at least
one replication slot that is aged, because the xmin limits, after all,
are the minimum of xmin or catalog_xmin of all replication slots. In
this approach, the new XID age GUC will help vacuum when needed,
because the current slot xmin limits are recalculated after
invalidating replication slots that are holding xmins for longer than
the age. The code is placed in vacuum() which is common for both
vacuum command and autovacuum, and gets executed only once every
vacuum cycle to not be too aggressive in invalidating.
However, there might be some concerns with this approach like the
following:
1) Adding more code to vacuum might not be acceptable
2) What if invalidation of replication slots emits an error, will it
block vacuum forever? Currently, InvalidateObsoleteReplicationSlots()
is also called as part of the checkpoint, and emitting ERRORs from
within is avoided already. Therefore, there is no concern here for
now.
3) What if there are more replication slots to be invalidated, will it
delay the vacuum? If yes, by how much? <<TODO>>
4) Will the invalidation based on just current replication slot xmin
limits suffice irrespective of vacuum cutoffs? IOW, if the replication
slots are invalidated but vacuum isn't going to do any work because
vacuum cutoffs are not yet met? Is the invalidation work wasteful
here?
5) Is it okay to take just one more time the proc array lock to get
current replication slot xmin limits via
ProcArrayGetReplicationSlotXmin() once every vacuum cycle? <<TODO>>
6) Vacuum command can't be run on the standby in recovery. So, to help
invalidate replication slots on the standby, I have for now let the
checkpointer also do the XID age based invalidation. I know
invalidating both in checkpointer and vacuum may not be a great idea,
but I'm open to thoughts.
Following are some of the alternative approaches which IMHO don't help
vacuum when needed:
a) Let the checkpointer do the XID age based invalidation, and call it
out in the documentation that if the checkpoint doesn't happen, the
new GUC doesn't help even if vacuum is run. This has been the
approach up to the v40 patch.
b) Checkpointer and/or other backends add an autovacuum work item via
AutoVacuumRequestWork(), and autovacuum, when it gets to it, will
invalidate the replication slots. But what to do for the vacuum
command here?

Please find the attached v41 patches implementing the idea of vacuum
doing the invalidation.

Thoughts?
Thanks to Sawada-san for a detailed off-list discussion.
The patch no longer applies on HEAD, please rebase.
regards,
Ajin Cherian
Fujitsu Australia
On Tue, Jul 9, 2024 at 3:01 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Jun 24, 2024 at 11:30:00AM +0530, Bharath Rupireddy wrote:
6) Vacuum command can't be run on the standby in recovery. So, to help
invalidate replication slots on the standby, I have for now let the
checkpointer also do the XID age based invalidation. I know
invalidating both in checkpointer and vacuum may not be a great idea,
but I'm open to thoughts.

Hm. I hadn't considered this angle.
Another idea would be to let the startup process do slot invalidation
when replaying a RUNNING_XACTS record. Since a RUNNING_XACTS record
has the latest XID on the primary, I think the startup process can
compare it to the slot-xmin, and invalidate slots which are older than
the age limit.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Mon, Jun 24, 2024 at 4:01 PM Bharath Rupireddy <
bharath.rupireddyforpostgres@gmail.com> wrote:
Hi,
On Mon, Jun 17, 2024 at 5:55 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Please find the attached v41 patches implementing the idea of vacuum
doing the invalidation.

Thoughts?
Some minor comments on the patch:
1.
+ /*
+ * Release the lock if it's not yet to keep the cleanup path on
+ * error happy.
+ */
I suggest rephrasing to: "Release the lock if it hasn't been already to
ensure smooth cleanup on error."
2.
elog(DEBUG1, "performing replication slot invalidation");
Probably change it to "performing replication slot invalidation checks" as
we might not actually invalidate any slot here.
3.
In CheckPointReplicationSlots()
+ invalidated =
InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+
+ if (invalidated)
+ {
+ /*
+ * If any slots have been invalidated, recalculate the resource
+ * limits.
+ */
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
Is this calculation of resource limits really required here when the same
is already done inside InvalidateObsoleteReplicationSlots()?
regards,
Ajin Cherian
Fujitsu Australia
On Wed, Aug 14, 2024 at 9:20 AM Ajin Cherian <itsajin@gmail.com> wrote:
Some minor comments on the patch:
Thanks for reviewing.
1.
+ /*
+ * Release the lock if it's not yet to keep the cleanup path on
+ * error happy.
+ */
I suggest rephrasing to: "Release the lock if it hasn't been already to
ensure smooth cleanup on error."
Changed.
2.
elog(DEBUG1, "performing replication slot invalidation");
Probably change it to "performing replication slot invalidation checks" as we might not actually invalidate any slot here.
Changed.
3.
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
Is this calculation of resource limits really required here when the same
is already done inside InvalidateObsoleteReplicationSlots()?
Nice catch. Removed.
Please find the attached v42 patches.
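
For anyone trying these out, a minimal way to exercise the inactive
timeout invalidation with the patches applied could look like the
following (a sketch; the slot name, plugin, and timeout value are just
illustrative):

    ALTER SYSTEM SET replication_slot_inactive_timeout TO '60s';
    SELECT pg_reload_conf();
    SELECT pg_create_logical_replication_slot('test_slot', 'test_decoding');
    -- leave the slot unused for longer than the timeout, then:
    CHECKPOINT;
    SELECT slot_name, inactive_since, invalidation_reason
    FROM pg_replication_slots WHERE slot_name = 'test_slot';
    -- invalidation_reason should report 'inactive_timeout'
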
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v42-0001-Add-inactive_timeout-based-replication-slot-inva.patchapplication/x-patch; name=v42-0001-Add-inactive_timeout-based-replication-slot-inva.patchDownload
From 1a96a7decd928b4c36429a9a6d97467d61f87896 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 26 Aug 2024 05:22:15 +0000
Subject: [PATCH v42 1/2] Add inactive_timeout based replication slot
invalidation
Up until now, postgres has had the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set a timeout of say 1
or 2 or 3 days, after which the inactive slots get invalidated.
To achieve the above, postgres introduces a GUC allowing users to
set an inactive timeout. Replication slots that are inactive for
longer than the specified amount of time get invalidated.
The invalidation check happens at multiple locations so that it is
as timely as possible:
- Whenever the slot is acquired; the acquisition errors out if the
slot is invalidated.
- During checkpoint
Note that this new invalidation mechanism won't kick in for the
slots that are currently being synced from the primary to the
standby, because such synced slots are typically considered not
active (for them to be later considered as inactive) as they don't
perform logical decoding to produce the changes.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Reviewed-by: Ajin Cherian, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CA%2BTgmoZTbaaEjSZUG1FL0mzxAdN3qmXksO3O9_PZhEuXTkVnRQ%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/202403260841.5jcv7ihniccy%40alvherre.pgsql
---
doc/src/sgml/config.sgml | 33 ++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 11 +-
src/backend/replication/slot.c | 175 ++++++++++-
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 6 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 286 ++++++++++++++++++
13 files changed, 522 insertions(+), 20 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 2937384b00..fbbacbee5b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4560,6 +4560,39 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidates replication slots that are inactive for longer than the
+ specified amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is the default) disables
+ the timeout mechanism. This parameter can only be set in
+ the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+
+ <para>
+ This invalidation check happens either when the slot is acquired
+ for use or during a checkpoint. The time since the slot became
+ inactive is taken from its
+ <structfield>inactive_since</structfield> value, against which the
+ timeout is measured.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>).
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 634a4c0fab..9e00f7d184 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2618,6 +2618,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for the duration specified by
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 2d1914ce08..abc2c06feb 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -448,7 +448,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -651,6 +651,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
" name slot \"%s\" already exists on the standby",
remote_slot->name));
+ /*
+ * Skip the sync if the local slot is already invalidated. We do this
+ * beforehand to avoid slot acquire and release.
+ */
+ if (slot->data.invalidated != RS_INVAL_NONE)
+ return false;
+
/*
* The slot has been synchronized before.
*
@@ -667,7 +674,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index c290339af5..5f504f956f 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -159,6 +161,13 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
+ ReplicationSlot *s,
+ XLogRecPtr oldestLSN,
+ Oid dboid,
+ TransactionId snapshotConflictHorizon,
+ bool *invalidated);
+
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
@@ -535,12 +544,17 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if check_for_invalidation is true and the slot gets
+ * invalidated now or has been invalidated previously.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation)
{
ReplicationSlot *s;
int active_pid;
+ bool released_lock = false;
Assert(name != NULL);
@@ -615,6 +629,57 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ /*
+ * Check if the acquired slot needs to be invalidated. And, error out if
+ * it gets invalidated now or has been invalidated previously, because
+ * there's no use in acquiring the invalidated slot.
+ *
+ * XXX: Currently we check for inactive_timeout invalidation here. We
+ * might need to check for other invalidations too.
+ */
+ if (check_for_invalidation)
+ {
+ bool invalidated = false;
+
+ released_lock = InvalidatePossiblyObsoleteSlot(RS_INVAL_INACTIVE_TIMEOUT,
+ s, 0, InvalidOid,
+ InvalidTransactionId,
+ &invalidated);
+
+ /*
+ * If the slot has been invalidated, recalculate the resource limits.
+ */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
+
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ /*
+ * Release the lock if it hasn't been already, to ensure smooth
+ * cleanup on error.
+ */
+ if (!released_lock)
+ LWLockRelease(ReplicationSlotControlLock);
+
+ Assert(s->inactive_since > 0);
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive since %s for more than %d seconds specified by \"replication_slot_inactive_timeout\".",
+ timestamptz_to_str(s->inactive_since),
+ replication_slot_inactive_timeout)));
+ }
+ }
+
+ if (!released_lock)
+ LWLockRelease(ReplicationSlotControlLock);
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +850,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +877,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1501,7 +1566,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1531,6 +1597,13 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s for more than %d seconds specified by \"replication_slot_inactive_timeout\"."),
+ timestamptz_to_str(inactive_since),
+ replication_slot_inactive_timeout);
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1574,6 +1647,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1581,6 +1655,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1591,6 +1666,18 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT &&
+ (replication_slot_inactive_timeout > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced)))
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1644,6 +1731,39 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ /*
+ * Quick exit if inactive timeout invalidation mechanism
+ * is disabled or slot is currently being used or the slot
+ * on standby is currently being synced from the primary.
+ *
+ * Note that we don't invalidate synced slots because,
+ * they are typically considered not active as they don't
+ * perform logical decoding to produce the changes.
+ */
+ if (replication_slot_inactive_timeout == 0 ||
+ s->inactive_since == 0 ||
+ (RecoveryInProgress() && s->data.synced))
+ break;
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+
+ /*
+ * Invalidation due to inactive timeout implies that
+ * no one is using the slot.
+ */
+ Assert(s->active_pid == 0);
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1669,11 +1789,14 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so or if the slot is already ours,
+ * then mark it invalidated. Otherwise we'll signal the owning
+ * process, below, and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot != NULL &&
+ MyReplicationSlot == s &&
+ active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1728,7 +1851,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1774,7 +1898,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1797,6 +1922,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1849,7 +1975,7 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk and invalidate slots.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1907,6 +2033,31 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * Note that we will make another pass over replication slots for
+ * invalidations to keep the code simple. The assumption here is that the
+ * traversal over replication slots isn't that costly even with hundreds
+ * of replication slots. If it ever turns out that this assumption is
+ * wrong, we might have to put the invalidation check logic in the above
+ * loop, for that we might have to do the following:
+ *
+ * - Acquire ControlLock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated slot.
+ *
+ * XXX: Should we move inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0, InvalidOid, InvalidTransactionId);
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index c7bfbb15e0..b1b7b075bd 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -540,7 +540,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index c5f1009f37..61a0e38715 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -844,7 +844,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1462,7 +1462,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index af227b1f24..861692c683 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 667e0dc40a..deca3a4aeb 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -335,6 +335,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index c2ee149fd6..dd56a77547 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -230,6 +232,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -246,7 +249,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 712924c2fa..301be0f6c1 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -10,6 +10,7 @@ tests += {
'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
},
'tests': [
+ 't/050_invalidate_slots.pl',
't/001_stream_rep.pl',
't/002_archiving.pl',
't/003_recovery_targets.pl',
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..4663019c16
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,286 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot as well as logical
+# failover slot on primary due to replication_slot_inactive_timeout. Also,
+# check the logical failover slot synced on to the standby doesn't invalidate
+# the slot on its own, but gets the invalidated state from the remote slot on
+# the primary.
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoint during the test, otherwise, the test can get unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr_1 = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb1_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('lsub1_sync_slot', 'test_decoding', false, false, true);}
+);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+my $standby1_logstart = -s $standby1->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Synchronize the primary server slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced'.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot lsub1_sync_slot has synced as true on standby');
+
+my $logstart = -s $primary->logfile;
+my $inactive_timeout = 2;
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$primary->reload;
+
+# Wait for the logical failover slot to become inactive on the primary. Note
+# that nobody has acquired that slot yet, so due to
+# replication_slot_inactive_timeout setting above it must get invalidated.
+wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart,
+ $inactive_timeout);
+
+# Set timeout on the standby also to check the synced slots don't get
+# invalidated due to timeout on the standby.
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$standby1->reload;
+
+# Now, sync the logical failover slot from the remote slot on the primary.
+# Note that the remote slot has already been invalidated due to inactive
+# timeout. Now, the standby must also see it as invalidated.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Wait for the inactive replication slot to be invalidated.
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for replication slot lsub1_sync_slot invalidation to be synced on standby";
+
+# Synced slot mustn't get invalidated on the standby even after a checkpoint,
+# it must sync invalidation from the primary. So, we must not see the slot's
+# invalidation message in server log.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"lsub1_sync_slot\"",
+ $standby1_logstart),
'check that synced slot lsub1_sync_slot has not been invalidated on the standby'
+);
+
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive
+wait_for_slot_invalidation($primary, 'sb1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end: Invalidate streaming standby's slot as well as logical failover
+# slot on primary due to replication_slot_inactive_timeout. Also, check the
+# logical failover slot synced on to the standby doesn't invalidate the slot on
+# its own, but gets the invalidated state from the remote slot on the primary.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+
+my $publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+# =============================================================================
+
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset, $inactive_timeout) = @_;
+ my $name = $node->name;
+
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for slot $slot_name to become inactive on node $name";
+
+ # Wait for the replication slot info to be updated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for info of slot $slot_name to be updated on node $name";
+
+ # Sleep at least $inactive_timeout duration to avoid multiple checkpoints
+ # for the slot to get invalidated.
+ sleep($inactive_timeout);
+
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+
+ # Wait for the inactive replication slot to be invalidated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for inactive slot $slot_name to be invalidated on node $name";
+
+ # Check that the invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot_name', '0/1');
+ ]);
+
+ ok( $stderr =~
+ /can no longer get changes from replication slot "$slot_name"/,
+ "detected error upon trying to acquire invalidated slot $slot_name on node $name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot_name on node $name";
+}
+
+# Check for invalidation of slot in server log
+sub check_for_slot_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $name = $node->name;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"",
+ $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot_name invalidation has been logged on node $name"
+ );
+}
+
+done_testing();
--
2.43.0
v42-0002-Add-XID-age-based-replication-slot-invalidation.patchapplication/x-patch; name=v42-0002-Add-XID-age-based-replication-slot-invalidation.patchDownload
From 4106d3edcb3b5b37eb198bfb076863cb99300813 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 26 Aug 2024 05:36:34 +0000
Subject: [PATCH v42 2/2] Add XID age based replication slot invalidation
Up until now, postgres has had the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly in
production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set an XID age (age of
a slot's xmin or catalog_xmin) of say 1 or 1.5 billion, after which
the slots get invalidated.
To achieve the above, postgres introduces a GUC allowing users to
set a slot XID age. Replication slots whose xmin or catalog_xmin
has reached the age specified by this setting get invalidated.
The invalidation check happens at multiple locations so that it is
as timely as possible:
- Whenever the slot is acquired; the acquisition errors out if the
slot is invalidated.
- During checkpoint
- During vacuum (both command-based and autovacuum)
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/20240327150557.GA3994937%40nathanxps13
Discussion: https://www.postgresql.org/message-id/CA%2BTgmoaRECcnyqxAxUhP5dk2S4HX%3DpGh-p-PkA3uc%2BjG_9hiMw%40mail.gmail.com
---
doc/src/sgml/config.sgml | 26 ++
doc/src/sgml/system-views.sgml | 8 +
src/backend/commands/vacuum.c | 66 ++++
src/backend/replication/slot.c | 158 ++++++++-
src/backend/utils/misc/guc_tables.c | 10 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 321 +++++++++++++++++-
8 files changed, 574 insertions(+), 19 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index fbbacbee5b..61505ed6aa 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4593,6 +4593,32 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-xid-age" xreflabel="replication_slot_xid_age">
+ <term><varname>replication_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (which is default)
+ disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ This invalidation check happens when the slot is acquired
+ for use, during vacuum, or during a checkpoint.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9e00f7d184..a4f1ab5275 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2625,6 +2625,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>xid_aged</literal> means that the slot's
+ <literal>xmin</literal> or <literal>catalog_xmin</literal>
+ has reached the age specified by
+ <xref linkend="guc-replication-slot-xid-age"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7d8e9d2045..c909c0d001 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -47,6 +47,7 @@
#include "postmaster/autovacuum.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/interrupt.h"
+#include "replication/slot.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
#include "storage/pmsignal.h"
@@ -116,6 +117,7 @@ static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
static double compute_parallel_delay(void);
static VacOptValue get_vacoptval_from_boolean(DefElem *def);
static bool vac_tid_reaped(ItemPointer itemptr, void *state);
+static void try_replication_slot_invalidation(void);
/*
* GUC check function to ensure GUC value specified is within the allowable
@@ -452,6 +454,61 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
MemoryContextDelete(vac_context);
}
+/*
+ * Try invalidating replication slots based on current replication slot xmin
+ * limits once every vacuum cycle.
+ */
+static void
+try_replication_slot_invalidation(void)
+{
+ TransactionId min_slot_xmin;
+ TransactionId min_slot_catalog_xmin;
+ bool can_invalidate = false;
+ TransactionId cutoff;
+ TransactionId curr;
+
+ curr = ReadNextTransactionId();
+
+ /*
+ * The cutoff is how far back from the current transaction ID the
+ * configured age allows. We then check whether the slot xmin or
+ * catalog_xmin precedes (or equals) the cutoff; if so, at least one
+ * slot can be invalidated.
+ */
+ cutoff = curr - replication_slot_xid_age;
+
+ if (!TransactionIdIsNormal(cutoff))
+ cutoff = FirstNormalTransactionId;
+
+ ProcArrayGetReplicationSlotXmin(&min_slot_xmin, &min_slot_catalog_xmin);
+
+ /*
+ * Current replication slot xmin limits can never be larger than the
+ * current transaction id even in the case of transaction ID wraparound.
+ */
+ Assert(min_slot_xmin <= curr);
+ Assert(min_slot_catalog_xmin <= curr);
+
+ if (TransactionIdIsNormal(min_slot_xmin) &&
+ TransactionIdPrecedesOrEquals(min_slot_xmin, cutoff))
+ can_invalidate = true;
+ else if (TransactionIdIsNormal(min_slot_catalog_xmin) &&
+ TransactionIdPrecedesOrEquals(min_slot_catalog_xmin, cutoff))
+ can_invalidate = true;
+
+ if (can_invalidate)
+ {
+ /*
+ * Note that InvalidateObsoleteReplicationSlots is also called as part
+ * of CHECKPOINT, and emitting ERRORs from within is avoided already.
+ * Therefore, there is no concern here that any ERROR from
+ * invalidating replication slots blocks VACUUM.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+ }
+}
+
/*
* Internal entry point for autovacuum and the VACUUM / ANALYZE commands.
*
@@ -483,6 +540,7 @@ vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
const char *stmttype;
volatile bool in_outer_xact,
use_own_xacts;
+ static bool first_time = true;
Assert(params != NULL);
@@ -594,6 +652,14 @@ vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
CommitTransactionCommand();
}
+ if (params->options & VACOPT_VACUUM &&
+ first_time &&
+ replication_slot_xid_age > 0)
+ {
+ try_replication_slot_invalidation();
+ first_time = false;
+ }
+
/* Turn vacuum cost accounting on or off, and set/clear in_vacuum */
PG_TRY();
{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 5f504f956f..0654a81add 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -108,10 +108,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
[RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
+ [RS_INVAL_XID_AGE] = "xid_aged",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
+#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -142,6 +143,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
int replication_slot_inactive_timeout = 0;
+int replication_slot_xid_age = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -160,6 +162,9 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool ReplicationSlotIsXIDAged(ReplicationSlot *slot,
+ TransactionId *xmin,
+ TransactionId *catalog_xmin);
static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReplicationSlot *s,
@@ -636,8 +641,8 @@ retry:
* it gets invalidated now or has been invalidated previously, because
* there's no use in acquiring the invalidated slot.
*
- * XXX: Currently we check for inactive_timeout invalidation here. We
- * might need to check for other invalidations too.
+ * XXX: Currently we check for inactive_timeout and xid_aged invalidations
+ * here. We might need to check for other invalidations too.
*/
if (check_for_invalidation)
{
@@ -648,6 +653,22 @@ retry:
InvalidTransactionId,
&invalidated);
+ if (!invalidated && released_lock)
+ {
+ /* The slot is still ours */
+ Assert(s->active_pid == MyProcPid);
+
+ /* Reacquire the ControlLock */
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+ released_lock = false;
+ }
+
+ if (!invalidated)
+ released_lock = InvalidatePossiblyObsoleteSlot(RS_INVAL_XID_AGE,
+ s, 0, InvalidOid,
+ InvalidTransactionId,
+ &invalidated);
+
/*
* If the slot has been invalidated, recalculate the resource limits.
*/
@@ -657,7 +678,8 @@ retry:
ReplicationSlotsComputeRequiredLSN();
}
- if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT ||
+ s->data.invalidated == RS_INVAL_XID_AGE)
{
/*
* Release the lock if it hasn't been already, to ensure smooth
@@ -665,7 +687,10 @@ retry:
*/
if (!released_lock)
LWLockRelease(ReplicationSlotControlLock);
+ }
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
Assert(s->inactive_since > 0);
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -675,6 +700,20 @@ retry:
timestamptz_to_str(s->inactive_since),
replication_slot_inactive_timeout)));
}
+
+ if (s->data.invalidated == RS_INVAL_XID_AGE)
+ {
+ Assert(TransactionIdIsValid(s->data.xmin) ||
+ TransactionIdIsValid(s->data.catalog_xmin));
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("The slot's xmin %u or catalog_xmin %u has reached the age %d specified by \"replication_slot_xid_age\".",
+ s->data.xmin,
+ s->data.catalog_xmin,
+ replication_slot_xid_age)));
+ }
}
if (!released_lock)
@@ -1567,7 +1606,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
TransactionId snapshotConflictHorizon,
- TimestampTz inactive_since)
+ TimestampTz inactive_since,
+ TransactionId xmin,
+ TransactionId catalog_xmin)
{
StringInfoData err_detail;
bool hint = false;
@@ -1604,6 +1645,20 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
timestamptz_to_str(inactive_since),
replication_slot_inactive_timeout);
break;
+ case RS_INVAL_XID_AGE:
+ Assert(TransactionIdIsValid(xmin) ||
+ TransactionIdIsValid(catalog_xmin));
+
+ if (TransactionIdIsValid(xmin))
+ appendStringInfo(&err_detail, _("The slot's xmin %u has reached the age %d specified by \"replication_slot_xid_age\"."),
+ xmin,
+ replication_slot_xid_age);
+ else if (TransactionIdIsValid(catalog_xmin))
+ appendStringInfo(&err_detail, _("The slot's catalog_xmin %u has reached the age %d specified by \"replication_slot_xid_age\"."),
+ catalog_xmin,
+ replication_slot_xid_age);
+
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1648,6 +1703,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
TimestampTz inactive_since = 0;
+ TransactionId aged_xmin = InvalidTransactionId;
+ TransactionId aged_catalog_xmin = InvalidTransactionId;
for (;;)
{
@@ -1764,6 +1821,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
Assert(s->active_pid == 0);
}
break;
+ case RS_INVAL_XID_AGE:
+ if (ReplicationSlotIsXIDAged(s, &aged_xmin, &aged_catalog_xmin))
+ {
+ Assert(TransactionIdIsValid(aged_xmin) ||
+ TransactionIdIsValid(aged_catalog_xmin));
+
+ invalidation_cause = cause;
+ break;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1852,7 +1919,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon,
- inactive_since);
+ inactive_since, aged_xmin,
+ aged_catalog_xmin);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1899,7 +1967,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon,
- inactive_since);
+ inactive_since, aged_xmin,
+ aged_catalog_xmin);
/* done with this slot for now */
break;
@@ -1923,6 +1992,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1986,6 +2056,7 @@ void
CheckPointReplicationSlots(bool is_shutdown)
{
int i;
+ bool invalidated;
elog(DEBUG1, "performing replication slot checkpoint");
@@ -2053,11 +2124,76 @@ CheckPointReplicationSlots(bool is_shutdown)
*
* - Avoid saving slot info to disk two times for each invalidated slot.
*
- * XXX: Should we move inactive_timeout invalidation check closer to
- * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ * XXX: Should we move inactive_timeout and xid_aged invalidation checks
+ * closer to wal_removed in CreateCheckPoint and CreateRestartPoint?
*/
- InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
- 0, InvalidOid, InvalidTransactionId);
+ invalidated = InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+
+ if (!invalidated)
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+}
+
+/*
+ * Returns true if the given replication slot's xmin or catalog_xmin age is
+ * more than replication_slot_xid_age.
+ *
+ * Note that the caller must hold the replication slot's spinlock to avoid
+ * race conditions while this function reads xmin and catalog_xmin.
+ */
+static bool
+ReplicationSlotIsXIDAged(ReplicationSlot *slot, TransactionId *xmin,
+ TransactionId *catalog_xmin)
+{
+ TransactionId cutoff;
+ TransactionId curr;
+
+ if (replication_slot_xid_age == 0)
+ return false;
+
+ curr = ReadNextTransactionId();
+
+ /*
+ * Replication slot's xmin and catalog_xmin can never be larger than the
+ * current transaction id even in the case of transaction ID wraparound.
+ */
+ Assert(slot->data.xmin <= curr);
+ Assert(slot->data.catalog_xmin <= curr);
+
+ /*
+ * The cutoff can tell how far we can go back from the current transaction
+ * id till the age. And then, we check whether or not the xmin or
+ * catalog_xmin falls within the cutoff; if yes, return true, otherwise
+ * false.
+ */
+ cutoff = curr - replication_slot_xid_age;
+
+ if (!TransactionIdIsNormal(cutoff))
+ cutoff = FirstNormalTransactionId;
+
+ *xmin = InvalidTransactionId;
+ *catalog_xmin = InvalidTransactionId;
+
+ if (TransactionIdIsNormal(slot->data.xmin) &&
+ TransactionIdPrecedesOrEquals(slot->data.xmin, cutoff))
+ {
+ *xmin = slot->data.xmin;
+ return true;
+ }
+
+ if (TransactionIdIsNormal(slot->data.catalog_xmin) &&
+ TransactionIdPrecedesOrEquals(slot->data.catalog_xmin, cutoff))
+ {
+ *catalog_xmin = slot->data.catalog_xmin;
+ return true;
+ }
+
+ return false;
}
/*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 861692c683..5d10dd1c8a 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3040,6 +3040,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &replication_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index deca3a4aeb..3fb6813195 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -336,6 +336,7 @@
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
+#replication_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index dd56a77547..5ea73f9331 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -55,6 +55,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* inactive slot timeout has occurred */
RS_INVAL_INACTIVE_TIMEOUT,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -233,6 +235,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
extern PGDLLIMPORT int replication_slot_inactive_timeout;
+extern PGDLLIMPORT int replication_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index 4663019c16..18300cfeca 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -89,7 +89,7 @@ $primary->reload;
# that nobody has acquired that slot yet, so due to
# replication_slot_inactive_timeout setting above it must get invalidated.
wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart,
- $inactive_timeout);
+ $inactive_timeout, 'inactive_timeout');
# Set timeout on the standby also to check the synced slots don't get
# invalidated due to timeout on the standby.
@@ -129,7 +129,7 @@ $standby1->stop;
# Wait for the standby's replication slot to become inactive
wait_for_slot_invalidation($primary, 'sb1_slot', $logstart,
- $inactive_timeout);
+ $inactive_timeout, 'inactive_timeout');
# Testcase end: Invalidate streaming standby's slot as well as logical failover
# slot on primary due to replication_slot_inactive_timeout. Also, check the
@@ -197,15 +197,280 @@ $subscriber->stop;
# Wait for the replication slot to become inactive and then invalidated due to
# timeout.
wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
- $inactive_timeout);
+ $inactive_timeout, 'inactive_timeout');
# Testcase end: Invalidate logical subscriber's slot due to
# replication_slot_inactive_timeout.
# =============================================================================
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot due to replication_slot_xid_age
+# GUC.
+
+# Prepare for the next test
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby2->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb2_slot', immediately_reserve := true);
+]);
+
+$standby2->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NOT NULL AND catalog_xmin IS NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb2_slot';
+]) or die "Timed out waiting for slot sb2_slot xmin to advance";
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop the standby so that the replication slot's xmin on the primary ages
+$standby2->stop;
+
+$logstart = -s $primary->logfile;
+
+# Do some work to advance xids on primary
+advance_xids($primary, 'tab_int');
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+wait_for_slot_invalidation($primary, 'sb2_slot', $logstart, 0, 'xid_aged');
+
+# Testcase end: Invalidate streaming standby's slot due to replication_slot_xid_age
+# GUC.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to
+# replication_slot_xid_age GUC.
+
+$publisher = $primary;
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 500;
+]);
+$publisher->reload;
+
+$subscriber->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+));
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl2 (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl2 (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl2 VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+$publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres',
+ "CREATE PUBLICATION pub2 FOR TABLE test_tbl2");
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub2 WITH (slot_name = 'lsub2_slot')"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub2');
+
+$result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl2");
+
+is($result, qq(5), "check initial copy was done");
+
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NULL AND catalog_xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'lsub2_slot';
+]) or die "Timed out waiting for slot lsub2_slot catalog_xmin to advance";
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Do some work to advance xids on publisher
+advance_xids($publisher, 'test_tbl2');
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+wait_for_slot_invalidation($publisher, 'lsub2_slot', $logstart, 0,
+ 'xid_aged');
+
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_xid_age GUC.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical slot on standby that's being synced from
+# the primary due to replication_slot_xid_age GUC.
+
+$publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 0;
+]);
+$publisher->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby3 = PostgreSQL::Test::Cluster->new('standby3');
+$standby3->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+$standby3->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb3_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb3_slot', immediately_reserve := true);
+]);
+
+$standby3->start;
+
+my $standby3_logstart = -s $standby3->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby3);
+
+$subscriber->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+));
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl3 (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl3 (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl3 VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+$publisher->safe_psql('postgres',
+ "CREATE PUBLICATION pub3 FOR TABLE test_tbl3");
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub3 CONNECTION '$publisher_connstr' PUBLICATION pub3 WITH (slot_name = 'lsub3_sync_slot', failover = true)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub3');
+
+$result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl3");
+
+is($result, qq(5), "check initial copy was done");
+
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NULL AND catalog_xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'lsub3_sync_slot';
+])
+ or die "Timed out waiting for slot lsub3_sync_slot catalog_xmin to advance";
+
+# Synchronize the primary server slots to the standby
+$standby3->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced' and has got catalog_xmin from the primary.
+is( $standby3->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub3_sync_slot' AND synced AND NOT temporary AND
+ xmin IS NULL AND catalog_xmin IS NOT NULL;}
+ ),
+ "t",
+ 'logical slot has synced as true on standby');
+
+my $primary_catalog_xmin = $primary->safe_psql('postgres',
+ "SELECT catalog_xmin FROM pg_replication_slots WHERE slot_name = 'lsub3_sync_slot' AND catalog_xmin IS NOT NULL;"
+);
+
+my $standby3_catalog_xmin = $standby3->safe_psql('postgres',
+ "SELECT catalog_xmin FROM pg_replication_slots WHERE slot_name = 'lsub3_sync_slot' AND catalog_xmin IS NOT NULL;"
+);
+
+is($primary_catalog_xmin, $standby3_catalog_xmin,
+ "check catalog_xmin is the same for the primary slot and the synced slot");
+
+# Enable XID age based invalidation on the standby. Note that we disabled the
+# same on the primary to check if the invalidation occurs for synced slot on
+# the standby.
+$standby3->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 500;
+]);
+$standby3->reload;
+
+$logstart = -s $standby3->logfile;
+
+# Do some work to advance xids on primary
+advance_xids($primary, 'test_tbl3');
+
+# Wait for standby to catch up with the above work
+$primary->wait_for_catchup($standby3);
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+wait_for_slot_invalidation($standby3, 'lsub3_sync_slot', $logstart, 0,
+ 'xid_aged');
+
+# Note that the replication slot on the primary is still active
+$result = $primary->safe_psql('postgres',
+ "SELECT COUNT(slot_name) = 1 FROM pg_replication_slots WHERE slot_name = 'lsub3_sync_slot' AND invalidation_reason IS NULL;"
+);
+
+is($result, 't', "check lsub3_sync_slot is still active on primary");
+
+# Testcase end: Invalidate logical slot on standby that's being synced from
+# the primary due to replication_slot_xid_age GUC.
+# =============================================================================
+
sub wait_for_slot_invalidation
{
- my ($node, $slot_name, $offset, $inactive_timeout) = @_;
+ my ($node, $slot_name, $offset, $inactive_timeout, $reason) = @_;
my $name = $node->name;
# Wait for the replication slot to become inactive
@@ -231,14 +496,15 @@ sub wait_for_slot_invalidation
# for the slot to get invalidated.
sleep($inactive_timeout);
- check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset,
+ $reason);
# Wait for the inactive replication slot to be invalidated
$node->poll_query_until(
'postgres', qq[
SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
WHERE slot_name = '$slot_name' AND
- invalidation_reason = 'inactive_timeout';
+ invalidation_reason = '$reason';
])
or die
"Timed out while waiting for inactive slot $slot_name to be invalidated on node $name";
@@ -262,15 +528,33 @@ sub wait_for_slot_invalidation
# Check for invalidation of slot in server log
sub check_for_slot_invalidation_in_server_log
{
- my ($node, $slot_name, $offset) = @_;
+ my ($node, $slot_name, $offset, $reason) = @_;
my $name = $node->name;
my $invalidated = 0;
+ my $isrecovery =
+ $node->safe_psql('postgres', "SELECT pg_is_in_recovery()");
+
+ chomp($isrecovery);
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
- $node->safe_psql('postgres', "CHECKPOINT");
+ if ($reason eq 'xid_aged' && $isrecovery eq 'f')
+ {
+ $node->safe_psql('postgres', "VACUUM");
+ }
+ else
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ }
+
if ($node->log_contains(
"invalidating obsolete replication slot \"$slot_name\"",
+ $offset)
+ || $node->log_contains(
+ "The slot's xmin .* has reached the age .* specified by \"replication_slot_xid_age\".",
+ $offset)
+ || $node->log_contains(
+ "The slot's catalog_xmin .* has reached the age .* specified by \"replication_slot_xid_age\".",
$offset))
{
$invalidated = 1;
@@ -283,4 +567,25 @@ sub check_for_slot_invalidation_in_server_log
);
}
+# Do some work for advancing xids on a given node
+sub advance_xids
+{
+ my ($node, $table_name) = @_;
+
+ $node->safe_psql(
+ 'postgres', qq[
+ do \$\$
+ begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into $table_name values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+ end\$\$;
+ ]);
+}
+
done_testing();
--
2.43.0
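As a side note on the test helper above: advance_xids() burns XIDs by forcing
a subtransaction (the exception block) around each INSERT. A quick way to
confirm that such a loop really consumes roughly one XID per iteration is
sketched below; this is not part of the patch and assumes a scratch database
on PostgreSQL 13 or later (for pg_current_xact_id()):

  CREATE TEMP TABLE xid_burn_check (id int);

  SELECT pg_current_xact_id();   -- note the value

  DO $$
  BEGIN
      FOR i IN 1..1000 LOOP
          -- the exception block forces a subtransaction, and the INSERT
          -- makes that subtransaction assign its own XID
          BEGIN
              INSERT INTO xid_burn_check VALUES (i);
          EXCEPTION
              WHEN division_by_zero THEN NULL;
          END;
      END LOOP;
  END$$;

  SELECT pg_current_xact_id();   -- should have advanced by about 1000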
On Mon, Aug 26, 2024 at 11:44 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Few comments on 0001:
1.
@@ -651,6 +651,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid
remote_dbid)
" name slot \"%s\" already exists on the standby",
remote_slot->name));
+ /*
+ * Skip the sync if the local slot is already invalidated. We do this
+ * beforehand to avoid slot acquire and release.
+ */
+ if (slot->data.invalidated != RS_INVAL_NONE)
+ return false;
+
/*
* The slot has been synchronized before.
I was wondering why you have added this new check as part of this
patch. If you see the following comments in the related code, you will
know why we haven't done this previously.
/*
* The slot has been synchronized before.
*
* It is important to acquire the slot here before checking
* invalidation. If we don't acquire the slot first, there could be a
* race condition that the local slot could be invalidated just after
* checking the 'invalidated' flag here and we could end up
* overwriting 'invalidated' flag to remote_slot's value. See
* InvalidatePossiblyObsoleteSlot() where it invalidates slot directly
* if the slot is not acquired by other processes.
*
* XXX: If it ever turns out that slot acquire/release is costly for
* cases when none of the slot properties is changed then we can do a
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
ReplicationSlotAcquire(remote_slot->name, true);
We need some modifications in these comments if you want to add a
pre-check here.
2.
@@ -1907,6 +2033,31 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * Note that we will make another pass over replication slots for
+ * invalidations to keep the code simple. The assumption here is that the
+ * traversal over replication slots isn't that costly even with hundreds
+ * of replication slots. If it ever turns out that this assumption is
+ * wrong, we might have to put the invalidation check logic in the above
+ * loop, for that we might have to do the following:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated slot.
+ *
+ * XXX: Should we move inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0, InvalidOid, InvalidTransactionId);
Why do we want to call this for shutdown case (when is_shutdown is
true)? I understand trying to invalidate slots during regular
checkpoint but not sure if we need it at the time of shutdown. The
other point is can we try to check the performance impact with 100s of
slots as mentioned in the code comments?
--
With Regards,
Amit Kapila.
Hi,
Thanks for looking into this.
On Mon, Aug 26, 2024 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Few comments on 0001:
1.
@@ -651,6 +651,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid
+ /*
+ * Skip the sync if the local slot is already invalidated. We do this
+ * beforehand to avoid slot acquire and release.
+ */
I was wondering why you have added this new check as part of this
patch. If you see the following comments in the related code, you will
know why we haven't done this previously.
Removed. Can deal with optimization separately.
2.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0, InvalidOid, InvalidTransactionId);
Why do we want to call this for shutdown case (when is_shutdown is
true)? I understand trying to invalidate slots during regular
checkpoint but not sure if we need it at the time of shutdown.
Changed it to invalidate only for non-shutdown checkpoints.
inactive_timeout invalidation isn't critical for shutdown unlike
wal_removed which can help shutdown by freeing up some disk space.
The
other point is can we try to check the performance impact with 100s of
slots as mentioned in the code comments?
I first checked how much the wal_removed invalidation check adds to the
checkpoint (see the 2nd and 3rd columns). I then checked how much the
inactive_timeout invalidation check adds to the checkpoint (see the 4th
column); it is not more than the wal_removed invalidation check. Finally, I
checked how much the wal_removed invalidation check adds for replication
slots that have already been invalidated due to inactive_timeout (see the
5th column); that overhead looks negligible.
| # of slots | HEAD (no invalidation) ms | HEAD (wal_removed) ms | PATCHED (inactive_timeout) ms | PATCHED (inactive_timeout+wal_removed) ms |
|------------|---------------------------|-----------------------|-------------------------------|-------------------------------------------|
| 100        | 18.591                    | 370.586               | 359.299                       | 373.882                                   |
| 1000       | 15.722                    | 4834.901              | 5081.751                      | 5072.128                                  |
| 10000      | 19.261                    | 59801.062             | 61270.406                     | 60270.099                                 |
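For anyone wanting to redo this kind of measurement, a minimal setup (a
sketch, not the exact script behind the numbers above) is to mass-create
physical slots on a scratch primary and then time a manual CHECKPOINT with
psql's \timing:

  -- create N physical replication slots that reserve WAL immediately
  SELECT pg_create_physical_replication_slot('bench_slot_' || g, true)
  FROM generate_series(1, 100) AS g;

  \timing on
  CHECKPOINT;

  -- clean up afterwards
  SELECT pg_drop_replication_slot(slot_name)
  FROM pg_replication_slots
  WHERE slot_name LIKE 'bench_slot_%';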
Having said that, I'm okay to implement the optimization specified.
Thoughts?
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do inactive_timeout invalidation
+ * of thousands of replication slots here. If it is ever proven that
+ * this assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
Please see the attached v43 patches addressing the above review comments.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v43-0002-Add-XID-age-based-replication-slot-invalidation.patchapplication/octet-stream; name=v43-0002-Add-XID-age-based-replication-slot-invalidation.patchDownload
From bb28e7ba7ba783b0c908412774b1d6ea4cca6dc5 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 29 Aug 2024 05:11:31 +0000
Subject: [PATCH v43 2/2] Add XID age based replication slot invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set an XID age (age of
slot's xmin or catalog_xmin) of say 1 or 1.5 billion, after which
the slots get invalidated.
To achieve the above, postgres introduces a GUC allowing users to
set a slot XID age. The replication slots whose xmin or catalog_xmin
has reached the age specified by this setting get invalidated.
The invalidation check happens at various locations so that slots
are invalidated as early as possible; these locations include the following:
- Whenever the slot is acquired and the slot acquisition errors
out if invalidated.
- During checkpoint
- During vacuum (both command-based and autovacuum)
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/20240327150557.GA3994937%40nathanxps13
Discussion: https://www.postgresql.org/message-id/CA%2BTgmoaRECcnyqxAxUhP5dk2S4HX%3DpGh-p-PkA3uc%2BjG_9hiMw%40mail.gmail.com
---
doc/src/sgml/config.sgml | 26 ++
doc/src/sgml/system-views.sgml | 8 +
src/backend/commands/vacuum.c | 66 ++++
src/backend/replication/slot.c | 157 ++++++++-
src/backend/utils/misc/guc_tables.c | 10 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 3 +
src/test/recovery/t/050_invalidate_slots.pl | 321 +++++++++++++++++-
8 files changed, 572 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 113303a501..a65c49d9fa 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4589,6 +4589,32 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-xid-age" xreflabel="replication_slot_xid_age">
+ <term><varname>replication_slot_xid_age</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_xid_age</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots whose <literal>xmin</literal> (the oldest
+ transaction that this slot needs the database to retain) or
+ <literal>catalog_xmin</literal> (the oldest transaction affecting the
+ system catalogs that this slot needs the database to retain) has reached
+ the age specified by this setting. A value of zero (which is the
+ default) disables this feature. Users can set this value anywhere from zero to
+ two billion. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ This invalidation check happens when the slot is acquired
+ for use, during vacuum, or during a checkpoint.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9e00f7d184..a4f1ab5275 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2625,6 +2625,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>xid_aged</literal> means that the slot's
+ <literal>xmin</literal> or <literal>catalog_xmin</literal>
+ has reached the age specified by
+ <xref linkend="guc-replication-slot-xid-age"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7d8e9d2045..c909c0d001 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -47,6 +47,7 @@
#include "postmaster/autovacuum.h"
#include "postmaster/bgworker_internals.h"
#include "postmaster/interrupt.h"
+#include "replication/slot.h"
#include "storage/bufmgr.h"
#include "storage/lmgr.h"
#include "storage/pmsignal.h"
@@ -116,6 +117,7 @@ static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
static double compute_parallel_delay(void);
static VacOptValue get_vacoptval_from_boolean(DefElem *def);
static bool vac_tid_reaped(ItemPointer itemptr, void *state);
+static void try_replication_slot_invalidation(void);
/*
* GUC check function to ensure GUC value specified is within the allowable
@@ -452,6 +454,61 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
MemoryContextDelete(vac_context);
}
+/*
+ * Try invalidating replication slots based on current replication slot xmin
+ * limits once every vacuum cycle.
+ */
+static void
+try_replication_slot_invalidation(void)
+{
+ TransactionId min_slot_xmin;
+ TransactionId min_slot_catalog_xmin;
+ bool can_invalidate = false;
+ TransactionId cutoff;
+ TransactionId curr;
+
+ curr = ReadNextTransactionId();
+
+ /*
+ * The cutoff is the transaction ID that lies replication_slot_xid_age
+ * transactions behind the current one. If the oldest slot xmin or
+ * catalog_xmin precedes (or equals) the cutoff, the configured age has
+ * been reached and we attempt the invalidation.
+ */
+ cutoff = curr - replication_slot_xid_age;
+
+ if (!TransactionIdIsNormal(cutoff))
+ cutoff = FirstNormalTransactionId;
+
+ ProcArrayGetReplicationSlotXmin(&min_slot_xmin, &min_slot_catalog_xmin);
+
+ /*
+ * Current replication slot xmin limits can never be larger than the
+ * current transaction id even in the case of transaction ID wraparound.
+ */
+ Assert(min_slot_xmin <= curr);
+ Assert(min_slot_catalog_xmin <= curr);
+
+ if (TransactionIdIsNormal(min_slot_xmin) &&
+ TransactionIdPrecedesOrEquals(min_slot_xmin, cutoff))
+ can_invalidate = true;
+ else if (TransactionIdIsNormal(min_slot_catalog_xmin) &&
+ TransactionIdPrecedesOrEquals(min_slot_catalog_xmin, cutoff))
+ can_invalidate = true;
+
+ if (can_invalidate)
+ {
+ /*
+ * Note that InvalidateObsoleteReplicationSlots is also called as part
+ * of CHECKPOINT, and emitting ERRORs from within is avoided already.
+ * Therefore, there is no concern here that any ERROR from
+ * invalidating replication slots blocks VACUUM.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+ InvalidOid, InvalidTransactionId);
+ }
+}
+
/*
* Internal entry point for autovacuum and the VACUUM / ANALYZE commands.
*
@@ -483,6 +540,7 @@ vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
const char *stmttype;
volatile bool in_outer_xact,
use_own_xacts;
+ static bool first_time = true;
Assert(params != NULL);
@@ -594,6 +652,14 @@ vacuum(List *relations, VacuumParams *params, BufferAccessStrategy bstrategy,
CommitTransactionCommand();
}
+ if (params->options & VACOPT_VACUUM &&
+ first_time &&
+ replication_slot_xid_age > 0)
+ {
+ try_replication_slot_invalidation();
+ first_time = false;
+ }
+
/* Turn vacuum cost accounting on or off, and set/clear in_vacuum */
PG_TRY();
{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 70093500fa..530c121c2f 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -108,10 +108,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
[RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
+ [RS_INVAL_XID_AGE] = "xid_aged",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
+#define RS_INVAL_MAX_CAUSES RS_INVAL_XID_AGE
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -142,6 +143,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
int replication_slot_inactive_timeout = 0;
+int replication_slot_xid_age = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -160,6 +162,9 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool ReplicationSlotIsXIDAged(ReplicationSlot *slot,
+ TransactionId *xmin,
+ TransactionId *catalog_xmin);
static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReplicationSlot *s,
@@ -636,8 +641,8 @@ retry:
* gets invalidated now or has been invalidated previously, because
* there's no use in acquiring the invalidated slot.
*
- * XXX: Currently we check for inactive_timeout invalidation here. We
- * might need to check for other invalidations too.
+ * XXX: Currently we check for inactive_timeout and xid_aged invalidations
+ * here. We might need to check for other invalidations too.
*/
if (check_for_invalidation)
{
@@ -648,6 +653,22 @@ retry:
InvalidTransactionId,
&invalidated);
+ if (!invalidated && released_lock)
+ {
+ /* The slot is still ours */
+ Assert(s->active_pid == MyProcPid);
+
+ /* Reacquire the ControlLock */
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+ released_lock = false;
+ }
+
+ if (!invalidated)
+ released_lock = InvalidatePossiblyObsoleteSlot(RS_INVAL_XID_AGE,
+ s, 0, InvalidOid,
+ InvalidTransactionId,
+ &invalidated);
+
/*
* If the slot has been invalidated, recalculate the resource limits.
*/
@@ -657,7 +678,8 @@ retry:
ReplicationSlotsComputeRequiredLSN();
}
- if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT ||
+ s->data.invalidated == RS_INVAL_XID_AGE)
{
/*
* Release the lock if it hasn't been already, to ensure smooth
@@ -665,7 +687,10 @@ retry:
*/
if (!released_lock)
LWLockRelease(ReplicationSlotControlLock);
+ }
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
Assert(s->inactive_since > 0);
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -675,6 +700,20 @@ retry:
timestamptz_to_str(s->inactive_since),
replication_slot_inactive_timeout)));
}
+
+ if (s->data.invalidated == RS_INVAL_XID_AGE)
+ {
+ Assert(TransactionIdIsValid(s->data.xmin) ||
+ TransactionIdIsValid(s->data.catalog_xmin));
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("The slot's xmin %u or catalog_xmin %u has reached the age %d specified by \"replication_slot_xid_age\".",
+ s->data.xmin,
+ s->data.catalog_xmin,
+ replication_slot_xid_age)));
+ }
}
if (!released_lock)
@@ -1567,7 +1606,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
TransactionId snapshotConflictHorizon,
- TimestampTz inactive_since)
+ TimestampTz inactive_since,
+ TransactionId xmin,
+ TransactionId catalog_xmin)
{
StringInfoData err_detail;
bool hint = false;
@@ -1604,6 +1645,20 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
timestamptz_to_str(inactive_since),
replication_slot_inactive_timeout);
break;
+ case RS_INVAL_XID_AGE:
+ Assert(TransactionIdIsValid(xmin) ||
+ TransactionIdIsValid(catalog_xmin));
+
+ if (TransactionIdIsValid(xmin))
+ appendStringInfo(&err_detail, _("The slot's xmin %u has reached the age %d specified by \"replication_slot_xid_age\"."),
+ xmin,
+ replication_slot_xid_age);
+ else if (TransactionIdIsValid(catalog_xmin))
+ appendStringInfo(&err_detail, _("The slot's catalog_xmin %u has reached the age %d specified by \"replication_slot_xid_age\"."),
+ catalog_xmin,
+ replication_slot_xid_age);
+
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1648,6 +1703,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
TimestampTz inactive_since = 0;
+ TransactionId aged_xmin = InvalidTransactionId;
+ TransactionId aged_catalog_xmin = InvalidTransactionId;
for (;;)
{
@@ -1764,6 +1821,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
Assert(s->active_pid == 0);
}
break;
+ case RS_INVAL_XID_AGE:
+ if (ReplicationSlotIsXIDAged(s, &aged_xmin, &aged_catalog_xmin))
+ {
+ Assert(TransactionIdIsValid(aged_xmin) ||
+ TransactionIdIsValid(aged_catalog_xmin));
+
+ invalidation_cause = cause;
+ break;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1852,7 +1919,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon,
- inactive_since);
+ inactive_since, aged_xmin,
+ aged_catalog_xmin);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1899,7 +1967,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
oldestLSN, snapshotConflictHorizon,
- inactive_since);
+ inactive_since, aged_xmin,
+ aged_catalog_xmin);
/* done with this slot for now */
break;
@@ -1923,6 +1992,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin has reached the age
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -2043,10 +2113,11 @@ CheckPointReplicationSlots(bool is_shutdown)
* NB: We will make another pass over replication slots for
* invalidation checks to keep the code simple. Testing shows that
* there is no noticeable overhead (when compared with wal_removed
- * invalidation) even if we were to do inactive_timeout invalidation
- * of thousands of replication slots here. If it is ever proven that
- * this assumption is wrong, we will have to perform the invalidation
- * checks in the above for loop with the following changes:
+ * invalidation) even if we were to do inactive_timeout/xid_aged
+ * invalidation of thousands of replication slots here. If it is ever
+ * proven that this assumption is wrong, we will have to perform the
+ * invalidation checks in the above for loop with the following
+ * changes:
*
* - Acquire ControlLock lock once before the loop.
*
@@ -2058,16 +2129,78 @@ CheckPointReplicationSlots(bool is_shutdown)
* - Avoid saving slot info to disk two times for each invalidated
* slot.
*
- * XXX: Should we move inactive_timeout inavalidation check closer to
+ * XXX: Should we move these invalidation checks closer to
* wal_removed in CreateCheckPoint and CreateRestartPoint?
*/
InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
0,
InvalidOid,
InvalidTransactionId);
+
+ InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
}
}
+/*
+ * Returns true if the given replication slot's xmin or catalog_xmin age is
+ * more than replication_slot_xid_age.
+ *
+ * Note that the caller must hold the replication slot's spinlock to avoid
+ * race conditions while this function reads xmin and catalog_xmin.
+ */
+static bool
+ReplicationSlotIsXIDAged(ReplicationSlot *slot, TransactionId *xmin,
+ TransactionId *catalog_xmin)
+{
+ TransactionId cutoff;
+ TransactionId curr;
+
+ if (replication_slot_xid_age == 0)
+ return false;
+
+ curr = ReadNextTransactionId();
+
+ /*
+ * Replication slot's xmin and catalog_xmin can never be larger than the
+ * current transaction id even in the case of transaction ID wraparound.
+ */
+ Assert(slot->data.xmin <= curr);
+ Assert(slot->data.catalog_xmin <= curr);
+
+ /*
+ * The cutoff is the transaction ID that lies replication_slot_xid_age
+ * transactions behind the current one. If the slot's xmin or catalog_xmin
+ * precedes (or equals) the cutoff, the slot has reached the configured
+ * age and we return true; otherwise we return false.
+ */
+ cutoff = curr - replication_slot_xid_age;
+
+ if (!TransactionIdIsNormal(cutoff))
+ cutoff = FirstNormalTransactionId;
+
+ *xmin = InvalidTransactionId;
+ *catalog_xmin = InvalidTransactionId;
+
+ if (TransactionIdIsNormal(slot->data.xmin) &&
+ TransactionIdPrecedesOrEquals(slot->data.xmin, cutoff))
+ {
+ *xmin = slot->data.xmin;
+ return true;
+ }
+
+ if (TransactionIdIsNormal(slot->data.catalog_xmin) &&
+ TransactionIdPrecedesOrEquals(slot->data.catalog_xmin, cutoff))
+ {
+ *catalog_xmin = slot->data.catalog_xmin;
+ return true;
+ }
+
+ return false;
+}
+
/*
* Load all replication slots from disk into memory at server startup. This
* needs to be run before we start crash recovery.
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 861692c683..5d10dd1c8a 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3040,6 +3040,16 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_xid_age", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Age of the transaction ID at which a replication slot gets invalidated."),
+ gettext_noop("The transaction is the oldest transaction (including the one affecting the system catalogs) that a replication slot needs the database to retain.")
+ },
+ &replication_slot_xid_age,
+ 0, 0, 2000000000,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index deca3a4aeb..3fb6813195 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -336,6 +336,7 @@
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
+#replication_slot_xid_age = 0
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index dd56a77547..5ea73f9331 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -55,6 +55,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_WAL_LEVEL,
/* inactive slot timeout has occurred */
RS_INVAL_INACTIVE_TIMEOUT,
+ /* slot's xmin or catalog_xmin has reached the age */
+ RS_INVAL_XID_AGE,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -233,6 +235,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
extern PGDLLIMPORT int replication_slot_inactive_timeout;
+extern PGDLLIMPORT int replication_slot_xid_age;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index 4663019c16..18300cfeca 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -89,7 +89,7 @@ $primary->reload;
# that nobody has acquired that slot yet, so due to
# replication_slot_inactive_timeout setting above it must get invalidated.
wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart,
- $inactive_timeout);
+ $inactive_timeout, 'inactive_timeout');
# Set timeout on the standby also to check the synced slots don't get
# invalidated due to timeout on the standby.
@@ -129,7 +129,7 @@ $standby1->stop;
# Wait for the standby's replication slot to become inactive
wait_for_slot_invalidation($primary, 'sb1_slot', $logstart,
- $inactive_timeout);
+ $inactive_timeout, 'inactive_timeout');
# Testcase end: Invalidate streaming standby's slot as well as logical failover
# slot on primary due to replication_slot_inactive_timeout. Also, check the
@@ -197,15 +197,280 @@ $subscriber->stop;
# Wait for the replication slot to become inactive and then invalidated due to
# timeout.
wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
- $inactive_timeout);
+ $inactive_timeout, 'inactive_timeout');
# Testcase end: Invalidate logical subscriber's slot due to
# replication_slot_inactive_timeout.
# =============================================================================
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot due to replication_slot_xid_age
+# GUC.
+
+# Prepare for the next test
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$primary->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby2->append_conf(
+ 'postgresql.conf', q{
+primary_slot_name = 'sb2_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb2_slot', immediately_reserve := true);
+]);
+
+$standby2->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+ "CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+$primary->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NOT NULL AND catalog_xmin IS NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'sb2_slot';
+]) or die "Timed out waiting for slot sb2_slot xmin to advance";
+
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 500;
+]);
+$primary->reload;
+
+# Stop the standby so that the replication slot's xmin on the primary ages
+$standby2->stop;
+
+$logstart = -s $primary->logfile;
+
+# Do some work to advance xids on primary
+advance_xids($primary, 'tab_int');
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+wait_for_slot_invalidation($primary, 'sb2_slot', $logstart, 0, 'xid_aged');
+
+# Testcase end: Invalidate streaming standby's slot due to replication_slot_xid_age
+# GUC.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to
+# replication_slot_xid_age GUC.
+
+$publisher = $primary;
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 500;
+]);
+$publisher->reload;
+
+$subscriber->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+));
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl2 (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl2 (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl2 VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+$publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres',
+ "CREATE PUBLICATION pub2 FOR TABLE test_tbl2");
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub2 CONNECTION '$publisher_connstr' PUBLICATION pub2 WITH (slot_name = 'lsub2_slot')"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub2');
+
+$result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl2");
+
+is($result, qq(5), "check initial copy was done");
+
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NULL AND catalog_xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'lsub2_slot';
+]) or die "Timed out waiting for slot lsub2_slot catalog_xmin to advance";
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Do some work to advance xids on publisher
+advance_xids($publisher, 'test_tbl2');
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+wait_for_slot_invalidation($publisher, 'lsub2_slot', $logstart, 0,
+ 'xid_aged');
+
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_xid_age GUC.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical slot on standby that's being synced from
+# the primary due to replication_slot_xid_age GUC.
+
+$publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 0;
+]);
+$publisher->reload;
+
+# Create a standby linking to the primary using the replication slot
+my $standby3 = PostgreSQL::Test::Cluster->new('standby3');
+$standby3->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+$standby3->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb3_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb3_slot', immediately_reserve := true);
+]);
+
+$standby3->start;
+
+my $standby3_logstart = -s $standby3->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby3);
+
+$subscriber->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+));
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl3 (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl3 (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl3 VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+$publisher->safe_psql('postgres',
+ "CREATE PUBLICATION pub3 FOR TABLE test_tbl3");
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub3 CONNECTION '$publisher_connstr' PUBLICATION pub3 WITH (slot_name = 'lsub3_sync_slot', failover = true)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub3');
+
+$result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl3");
+
+is($result, qq(5), "check initial copy was done");
+
+$publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT xmin IS NULL AND catalog_xmin IS NOT NULL
+ FROM pg_catalog.pg_replication_slots
+ WHERE slot_name = 'lsub3_sync_slot';
+])
+ or die "Timed out waiting for slot lsub3_sync_slot catalog_xmin to advance";
+
+# Synchronize the primary server slots to the standby
+$standby3->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced' and has got catalog_xmin from the primary.
+is( $standby3->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub3_sync_slot' AND synced AND NOT temporary AND
+ xmin IS NULL AND catalog_xmin IS NOT NULL;}
+ ),
+ "t",
+ 'logical slot has synced as true on standby');
+
+my $primary_catalog_xmin = $primary->safe_psql('postgres',
+ "SELECT catalog_xmin FROM pg_replication_slots WHERE slot_name = 'lsub3_sync_slot' AND catalog_xmin IS NOT NULL;"
+);
+
+my $standby3_catalog_xmin = $standby3->safe_psql('postgres',
+ "SELECT catalog_xmin FROM pg_replication_slots WHERE slot_name = 'lsub3_sync_slot' AND catalog_xmin IS NOT NULL;"
+);
+
+is($primary_catalog_xmin, $standby3_catalog_xmin,
+ "check catalog_xmin is the same for the primary slot and the synced slot");
+
+# Enable XID age based invalidation on the standby. Note that we disabled the
+# same on the primary to check if the invalidation occurs for synced slot on
+# the standby.
+$standby3->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_xid_age = 500;
+]);
+$standby3->reload;
+
+$logstart = -s $standby3->logfile;
+
+# Do some work to advance xids on primary
+advance_xids($primary, 'test_tbl3');
+
+# Wait for standby to catch up with the above work
+$primary->wait_for_catchup($standby3);
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+wait_for_slot_invalidation($standby3, 'lsub3_sync_slot', $logstart, 0,
+ 'xid_aged');
+
+# Note that the replication slot on the primary is still active
+$result = $primary->safe_psql('postgres',
+ "SELECT COUNT(slot_name) = 1 FROM pg_replication_slots WHERE slot_name = 'lsub3_sync_slot' AND invalidation_reason IS NULL;"
+);
+
+is($result, 't', "check lsub3_sync_slot is still active on primary");
+
+# Testcase end: Invalidate logical slot on standby that's being synced from
+# the primary due to replication_slot_xid_age GUC.
+# =============================================================================
+
sub wait_for_slot_invalidation
{
- my ($node, $slot_name, $offset, $inactive_timeout) = @_;
+ my ($node, $slot_name, $offset, $inactive_timeout, $reason) = @_;
my $name = $node->name;
# Wait for the replication slot to become inactive
@@ -231,14 +496,15 @@ sub wait_for_slot_invalidation
# for the slot to get invalidated.
sleep($inactive_timeout);
- check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset,
+ $reason);
# Wait for the inactive replication slot to be invalidated
$node->poll_query_until(
'postgres', qq[
SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
WHERE slot_name = '$slot_name' AND
- invalidation_reason = 'inactive_timeout';
+ invalidation_reason = '$reason';
])
or die
"Timed out while waiting for inactive slot $slot_name to be invalidated on node $name";
@@ -262,15 +528,33 @@ sub wait_for_slot_invalidation
# Check for invalidation of slot in server log
sub check_for_slot_invalidation_in_server_log
{
- my ($node, $slot_name, $offset) = @_;
+ my ($node, $slot_name, $offset, $reason) = @_;
my $name = $node->name;
my $invalidated = 0;
+ my $isrecovery =
+ $node->safe_psql('postgres', "SELECT pg_is_in_recovery()");
+
+ chomp($isrecovery);
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
- $node->safe_psql('postgres', "CHECKPOINT");
+ if ($reason eq 'xid_aged' && $isrecovery eq 'f')
+ {
+ $node->safe_psql('postgres', "VACUUM");
+ }
+ else
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ }
+
if ($node->log_contains(
"invalidating obsolete replication slot \"$slot_name\"",
+ $offset)
+ || $node->log_contains(
+ "The slot's xmin .* has reached the age .* specified by \"replication_slot_xid_age\".",
+ $offset)
+ || $node->log_contains(
+ "The slot's catalog_xmin .* has reached the age .* specified by \"replication_slot_xid_age\".",
$offset))
{
$invalidated = 1;
@@ -283,4 +567,25 @@ sub check_for_slot_invalidation_in_server_log
);
}
+# Do some work for advancing xids on a given node
+sub advance_xids
+{
+ my ($node, $table_name) = @_;
+
+ $node->safe_psql(
+ 'postgres', qq[
+ do \$\$
+ begin
+ for i in 10000..11000 loop
+ -- use an exception block so that each iteration eats an XID
+ begin
+ insert into $table_name values (i);
+ exception
+ when division_by_zero then null;
+ end;
+ end loop;
+ end\$\$;
+ ]);
+}
+
done_testing();
--
2.43.0
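To make the user-facing flow of the 0002 patch above concrete, a rough usage
sketch might look like the following (the GUC name and the 'xid_aged' reason
come from the patch; the 1.5 billion threshold is only an example):

  ALTER SYSTEM SET replication_slot_xid_age = 1500000000;
  SELECT pg_reload_conf();

  -- after a checkpoint, a vacuum, or an attempt to acquire an aged slot,
  -- affected slots show up as invalidated
  SELECT slot_name, xmin, catalog_xmin, invalidation_reason
  FROM pg_replication_slots
  WHERE invalidation_reason = 'xid_aged';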
v43-0001-Add-inactive_timeout-based-replication-slot-inva.patchapplication/octet-stream; name=v43-0001-Add-inactive_timeout-based-replication-slot-inva.patchDownload
From 8c035bcf94db49d09bbfb275018de3286e8522d7 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 29 Aug 2024 05:01:18 +0000
Subject: [PATCH v43 1/2] Add inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
max_slot_wal_keep_size is tricky, because the amount of WAL a
customer generates and their allocated storage vary greatly
in production, making it difficult to pin down a one-size-fits-all
value. It is often easy for developers to set a timeout of say 1,
2, or 3 days, after which the inactive slots get invalidated.
To achieve the above, postgres introduces a GUC allowing users to
set an inactive timeout. Replication slots that are inactive
for longer than the specified amount of time get invalidated.
The invalidation check happens at various locations so that slots
are invalidated as early as possible; these locations include the following:
- Whenever the slot is acquired and the slot acquisition errors
out if invalidated.
- During checkpoint
Note that this new invalidation mechanism won't kick in for the
slots that are currently being synced from the primary to the
standby, because such synced slots are typically considered not
active (and so are never later considered inactive), as they don't
perform logical decoding to produce the changes.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Reviewed-by: Ajin Cherian, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CA%2BTgmoZTbaaEjSZUG1FL0mzxAdN3qmXksO3O9_PZhEuXTkVnRQ%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/202403260841.5jcv7ihniccy%40alvherre.pgsql
---
doc/src/sgml/config.sgml | 33 ++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 183 ++++++++++-
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 6 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 286 ++++++++++++++++++
13 files changed, 523 insertions(+), 20 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 12feac6087..113303a501 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4556,6 +4556,39 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidates replication slots that are inactive for longer than the
+ specified amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is the default)
+ disables the timeout mechanism. This parameter can only be set in
+ the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+
+ <para>
+ This invalidation check happens either when the slot is acquired
+ for use or during a checkpoint. The timeout is measured from the
+ slot's <structfield>inactive_since</structfield> value, that is,
+ the time at which the slot last became inactive.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby that are being synced from a
+ primary server (whose <structfield>synced</structfield> field is
+ <literal>true</literal>).
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 634a4c0fab..9e00f7d184 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2618,6 +2618,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for the duration specified by
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 76de017635..758f0358b3 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -448,7 +448,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -667,7 +667,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index c290339af5..70093500fa 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -159,6 +161,13 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
+ ReplicationSlot *s,
+ XLogRecPtr oldestLSN,
+ Oid dboid,
+ TransactionId snapshotConflictHorizon,
+ bool *invalidated);
+
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
@@ -535,12 +544,17 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if check_for_invalidation is true and the slot gets
+ * invalidated now or has been invalidated previously.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation)
{
ReplicationSlot *s;
int active_pid;
+ bool released_lock = false;
Assert(name != NULL);
@@ -615,6 +629,57 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+ /*
+ * Check if the acquired slot needs to be invalidated. Error out if it
+ * gets invalidated now or has been invalidated previously, because
+ * there's no use in acquiring the invalidated slot.
+ *
+ * XXX: Currently we check for inactive_timeout invalidation here. We
+ * might need to check for other invalidations too.
+ */
+ if (check_for_invalidation)
+ {
+ bool invalidated = false;
+
+ released_lock = InvalidatePossiblyObsoleteSlot(RS_INVAL_INACTIVE_TIMEOUT,
+ s, 0, InvalidOid,
+ InvalidTransactionId,
+ &invalidated);
+
+ /*
+ * If the slot has been invalidated, recalculate the resource limits.
+ */
+ if (invalidated)
+ {
+ ReplicationSlotsComputeRequiredXmin(false);
+ ReplicationSlotsComputeRequiredLSN();
+ }
+
+ if (s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ /*
+ * Release the lock if it hasn't been already, to ensure smooth
+ * cleanup on error.
+ */
+ if (!released_lock)
+ LWLockRelease(ReplicationSlotControlLock);
+
+ Assert(s->inactive_since > 0);
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive since %s for more than %d seconds specified by \"replication_slot_inactive_timeout\".",
+ timestamptz_to_str(s->inactive_since),
+ replication_slot_inactive_timeout)));
+ }
+ }
+
+ if (!released_lock)
+ LWLockRelease(ReplicationSlotControlLock);
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +850,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +877,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1501,7 +1566,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1531,6 +1597,13 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s for more than %d seconds specified by \"replication_slot_inactive_timeout\"."),
+ timestamptz_to_str(inactive_since),
+ replication_slot_inactive_timeout);
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1574,6 +1647,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1581,6 +1655,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1591,6 +1666,18 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT &&
+ (replication_slot_inactive_timeout > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced)))
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1644,6 +1731,39 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ /*
+ * Quick exit if inactive timeout invalidation mechanism
+ * is disabled or slot is currently being used or the slot
+ * on standby is currently being synced from the primary.
+ *
+ * Note that we don't invalidate synced slots because,
+ * they are typically considered not active as they don't
+ * perform logical decoding to produce the changes.
+ */
+ if (replication_slot_inactive_timeout == 0 ||
+ s->inactive_since == 0 ||
+ (RecoveryInProgress() && s->data.synced))
+ break;
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+
+ /*
+ * Invalidation due to inactive timeout implies that
+ * no one is using the slot.
+ */
+ Assert(s->active_pid == 0);
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1669,11 +1789,14 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so or if the slot is already ours,
+ * then mark it invalidated. Otherwise we'll signal the owning
+ * process, below, and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot != NULL &&
+ MyReplicationSlot == s &&
+ active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1728,7 +1851,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1774,7 +1898,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1797,6 +1922,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1849,7 +1975,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1907,6 +2034,38 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do inactive_timeout invalidation
+ * of thousands of replication slots here. If it is ever proven that
+ * this assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index c7bfbb15e0..b1b7b075bd 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -540,7 +540,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index c5f1009f37..61a0e38715 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -844,7 +844,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1462,7 +1462,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index af227b1f24..861692c683 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 667e0dc40a..deca3a4aeb 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -335,6 +335,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index c2ee149fd6..dd56a77547 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -53,6 +53,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -230,6 +232,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -246,7 +249,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 712924c2fa..301be0f6c1 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -10,6 +10,7 @@ tests += {
'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
},
'tests': [
+ 't/050_invalidate_slots.pl',
't/001_stream_rep.pl',
't/002_archiving.pl',
't/003_recovery_targets.pl',
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..4663019c16
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,286 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot as well as logical
+# failover slot on primary due to replication_slot_inactive_timeout. Also,
+# check the logical failover slot synced on to the standby doesn't invalidate
+# the slot on its own, but gets the invalidated state from the remote slot on
+# the primary.
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid checkpoint during the test, otherwise, the test can get unpredictable
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr_1 = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb1_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('lsub1_sync_slot', 'test_decoding', false, false, true);}
+);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+my $standby1_logstart = -s $standby1->logfile;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Synchronize the primary server slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced'.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot lsub1_sync_slot has synced as true on standby');
+
+my $logstart = -s $primary->logfile;
+my $inactive_timeout = 2;
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$primary->reload;
+
+# Wait for the logical failover slot to become inactive on the primary. Note
+# that nobody has acquired that slot yet, so due to
+# replication_slot_inactive_timeout setting above it must get invalidated.
+wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart,
+ $inactive_timeout);
+
+# Set timeout on the standby also to check the synced slots don't get
+# invalidated due to timeout on the standby.
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$standby1->reload;
+
+# Now, sync the logical failover slot from the remote slot on the primary.
+# Note that the remote slot has already been invalidated due to inactive
+# timeout. Now, the standby must also see it as invalidated.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Wait for the inactive replication slot to be invalidated.
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for replication slot lsub1_sync_slot invalidation to be synced on standby";
+
+# Synced slot mustn't get invalidated on the standby even after a checkpoint,
+# it must sync invalidation from the primary. So, we must not see the slot's
+# invalidation message in server log.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"lsub1_sync_slot\"",
+ $standby1_logstart),
 'check that synced slot lsub1_sync_slot has not been invalidated on the standby'
+);
+
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive
+wait_for_slot_invalidation($primary, 'sb1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end: Invalidate streaming standby's slot as well as logical failover
+# slot on primary due to replication_slot_inactive_timeout. Also, check the
+# logical failover slot synced on to the standby doesn't invalidate the slot on
+# its own, but gets the invalidated state from the remote slot on the primary.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+
+my $publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+# =============================================================================
+
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset, $inactive_timeout) = @_;
+ my $name = $node->name;
+
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for slot $slot_name to become inactive on node $name";
+
+ # Wait for the replication slot info to be updated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for info of slot $slot_name to be updated on node $name";
+
+ # Sleep at least $inactive_timeout duration to avoid multiple checkpoints
+ # for the slot to get invalidated.
+ sleep($inactive_timeout);
+
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+
+ # Wait for the inactive replication slot to be invalidated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for inactive slot $slot_name to be invalidated on node $name";
+
+ # Check that the invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot_name', '0/1');
+ ]);
+
+ ok( $stderr =~
+ /can no longer get changes from replication slot "$slot_name"/,
+ "detected error upon trying to acquire invalidated slot $slot_name on node $name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot_name on node $name";
+}
+
+# Check for invalidation of slot in server log
+sub check_for_slot_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $name = $node->name;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"",
+ $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot_name invalidation has been logged on node $name"
+ );
+}
+
+done_testing();
--
2.43.0
Hi, here are some review comments for patch v43-0001.
======
Commit message
1.
... introduces a GUC allowing users set inactive timeout.
~
1a. You should give the name of the new GUC in the commit message.
1b. /set/to set/
======
doc/src/sgml/config.sgml
GUC "replication_slot_inactive_timeout"
2.
Invalidates replication slots that are inactive for longer than
specified amount of time
nit - suggest use similar wording as the prior GUC (wal_sender_timeout):
Invalidate replication slots that are inactive for longer than this
amount of time.
~
3.
This invalidation check happens either when the slot is acquired for
use or during a checkpoint. The time since the slot has become
inactive is known from its inactive_since value using which the
timeout is measured.
nit - the wording is too complicated. suggest:
The timeout check occurs when the slot is next acquired for use, or
during a checkpoint. The slot's 'inactive_since' field value is when
the slot became inactive.
~
4.
Note that the inactive timeout invalidation mechanism is not
applicable for slots on the standby that are being synced from a
primary server (whose synced field is true).
nit - that word "whose" seems ambiguous. suggest:
(e.g. the standby slot has 'synced' field true).
======
doc/src/sgml/system-views.sgml
5.
inactive_timeout means that the slot has been inactive for the
duration specified by replication_slot_inactive_timeout parameter.
nit - suggestion ("longer than"):
... the slot has been inactive for longer than the duration specified
by the replication_slot_inactive_timeout parameter.
======
src/backend/replication/slot.c
6.
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
IMO this #define belongs in the slot.h, immediately below where the
enum is defined.
~~~
7. ReplicationSlotAcquire:
I had a fundamental question about this logic.
IIUC the purpose of the patch was to invalidate replication slots that
have been inactive for too long.
So, it makes sense to me that some periodic processing (e.g.
CheckPointReplicationSlots) might do a sweep over all the slots, and
invalidate the too-long-inactive ones that it finds.
OTOH, it seemed quite strange to me that the patch logic is also
detecting and invalidating inactive slots during the
ReplicationSlotAcquire function. This is kind of saying "ERROR -
sorry, because this was inactive for too long you can't have it" at
the very moment that you wanted to use it again! IIUC such a slot
would be invalidated by the function InvalidatePossiblyObsoleteSlot(),
but therein lies my doubt -- how can the slot be considered as
"obsolete" when we are in the very act of trying to acquire/use it?
I guess it might be argued this is not so different to the scenario of
attempting to acquire a slot that had been invalidated momentarily
before during checkpoint processing. But, somehow that scenario seems
more like bad luck to me, versus ReplicationSlotAcquire() deliberately
invalidating something we *know* is wanted.
~
8.
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive
since %s for more than %d seconds specified by
\"replication_slot_inactive_timeout\".",
+ timestamptz_to_str(s->inactive_since),
+ replication_slot_inactive_timeout)));
nit - IMO the info should be split into errdetail + errhint. Like this:
errdetail("The slot became invalid because it was inactive since %s,
which is more than %d seconds ago."...)
errhint("You might need to increase \"%s\".",
"replication_slot_inactive_timeout")
~~~
9. ReportSlotInvalidation
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s for more than %d seconds
specified by \"replication_slot_inactive_timeout\"."),
+ timestamptz_to_str(inactive_since),
+ replication_slot_inactive_timeout);
+ break;
IMO this error in ReportSlotInvalidation() should be the same as the
other one from ReplicationSlotAcquire(), which I suggested above
(comment #8) should include a hint. Also, including a hint here will
make this new message consistent with the other errhint (for
"max_slot_wal_keep_size") that is already in this function.
~~~
10. InvalidatePossiblyObsoleteSlot
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT &&
+ (replication_slot_inactive_timeout > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced)))
10a. Everything here is && so this has some redundant parentheses.
10b. Actually, IMO this complicated condition is overkill. Won't it be
better to just unconditionally assign
now = GetCurrentTimestamp(); here?
~
11.
+ * Note that we don't invalidate synced slots because,
+ * they are typically considered not active as they don't
+ * perform logical decoding to produce the changes.
nit - tweaked punctuation
~
12.
+ * If the slot can be acquired, do so or if the slot is already ours,
+ * then mark it invalidated. Otherwise we'll signal the owning
+ * process, below, and retry.
nit - tidied this comment. Suggestion:
If the slot can be acquired, do so and mark it as invalidated. If the
slot is already ours, mark it as invalidated. Otherwise, we'll signal
the owning process below and retry.
~
13.
+ if (active_pid == 0 ||
+ (MyReplicationSlot != NULL &&
+ MyReplicationSlot == s &&
+ active_pid == MyProcPid))
You are already checking MyReplicationSlot == s here, so that extra
check for MyReplicationSlot != NULL is redundant, isn't it?
~~~
14. CheckPointReplicationSlots
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
nit - /Also, invalidate slots/Also, invalidate obsolete slots/
======
src/backend/utils/misc/guc_tables.c
15.
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time to wait before invalidating an "
+ "inactive replication slot."),
nit - that is maybe a bit misleading because IIUC there is no real
"waiting" happening anywhere. Suggest:
Sets the amount of time a replication slot can remain inactive before
it will be invalidated.
======
Please take a look at the attached top-up patches. These include
changes for many of the nits above.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
PS_NITPICKS_20240830_CODE_V430001.txttext/plain; charset=US-ASCII; name=PS_NITPICKS_20240830_CODE_V430001.txtDownload
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 7009350..c96ae53 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -671,9 +671,10 @@ retry:
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("can no longer get changes from replication slot \"%s\"",
NameStr(s->data.name)),
- errdetail("This slot has been invalidated because it was inactive since %s for more than %d seconds specified by \"replication_slot_inactive_timeout\".",
+ errdetail("The slot became invalid because it was inactive since %s, which is more than %d seconds ago.",
timestamptz_to_str(s->inactive_since),
- replication_slot_inactive_timeout)));
+ replication_slot_inactive_timeout),
+ errhint("You might need to increase \"%s\".", "replication_slot_inactive_timeout")));
}
}
@@ -1738,9 +1739,9 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* is disabled or slot is currently being used or the slot
* on standby is currently being synced from the primary.
*
- * Note that we don't invalidate synced slots because,
- * they are typically considered not active as they don't
- * perform logical decoding to produce the changes.
+ * Note that we don't invalidate synced slots because
+ * they are typically considered not active, as they don't
+ * perform logical decoding to produce changes.
*/
if (replication_slot_inactive_timeout == 0 ||
s->inactive_since == 0 ||
@@ -1789,9 +1790,9 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so or if the slot is already ours,
- * then mark it invalidated. Otherwise we'll signal the owning
- * process, below, and retry.
+ * If the slot can be acquired, do so and mark it as invalidated.
+ * If the slot is already ours, mark it as invalidated.
+ * Otherwise, we'll signal the owning process below and retry.
*/
if (active_pid == 0 ||
(MyReplicationSlot != NULL &&
@@ -1975,7 +1976,7 @@ restart:
}
/*
- * Flush all replication slots to disk. Also, invalidate slots during
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
* non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 861692c..73a5824 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3030,8 +3030,9 @@ struct config_int ConfigureNamesInt[] =
{
{"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
- gettext_noop("Sets the amount of time to wait before invalidating an "
- "inactive replication slot."),
+
+ gettext_noop("Sets the amount of time a replication slot can remain "
+ "inactive before it will be invalidated."),
NULL,
GUC_UNIT_S
},
PS_NITPICKS_20240829_DOCS_v430001.txttext/plain; charset=US-ASCII; name=PS_NITPICKS_20240829_DOCS_v430001.txtDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index fbbacbe..bd3ce5a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4568,8 +4568,8 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</term>
<listitem>
<para>
- Invalidates replication slots that are inactive for longer than
- specified amount of time. If this value is specified without units,
+ Invalidate replication slots that are inactive for longer than this
+ amount of time. If this value is specified without units,
it is taken as seconds. A value of zero (which is default) disables
the timeout mechanism. This parameter can only be set in
the <filename>postgresql.conf</filename> file or on the server
@@ -4577,18 +4577,16 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</para>
<para>
- This invalidation check happens either when the slot is acquired
- for use or during a checkpoint. The time since the slot has become
- inactive is known from its
- <structfield>inactive_since</structfield> value using which the
- timeout is measured.
+ The timeout check occurs when the slot is next acquired for use, or
+ during a checkpoint. The slot's <structfield>inactive_since</structfield>
+ field value is when the slot became inactive.
</para>
<para>
Note that the inactive timeout invalidation mechanism is not
applicable for slots on the standby that are being synced from a
- primary server (whose <structfield>synced</structfield> field is
- <literal>true</literal>).
+ primary server (e.g. the standby slot <structfield>synced</structfield>
+ field is true).
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9e00f7d..f230e6e 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2621,7 +2621,7 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<listitem>
<para>
<literal>inactive_timeout</literal> means that the slot has been
- inactive for the duration specified by
+ inactive for longer than the duration specified by
<xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
</para>
</listitem>
Hi,
Thanks for looking into this.
On Fri, Aug 30, 2024 at 8:13 AM Peter Smith <smithpb2250@gmail.com> wrote:
======
Commit message
1.
... introduces a GUC allowing users set inactive timeout.
~
1a. You should give the name of the new GUC in the commit message.
Modified.
1b. /set/to set/
Reworded the commit message.
======
doc/src/sgml/config.sgml
GUC "replication_slot_inactive_timeout"
2.
Invalidates replication slots that are inactive for longer than
specified amount of time
nit - suggest use similar wording as the prior GUC (wal_sender_timeout):
Invalidate replication slots that are inactive for longer than this
amount of time.
Modified.
3.
This invalidation check happens either when the slot is acquired for
use or during a checkpoint. The time since the slot has become
inactive is known from its inactive_since value using which the
timeout is measured.
nit - the wording is too complicated. suggest:
The timeout check occurs when the slot is next acquired for use, or
during a checkpoint. The slot's 'inactive_since' field value is when
the slot became inactive.
4.
Note that the inactive timeout invalidation mechanism is not
applicable for slots on the standby that are being synced from a
primary server (whose synced field is true).
nit - that word "whose" seems ambiguous. suggest:
(e.g. the standby slot has 'synced' field true).
Reworded.
======
doc/src/sgml/system-views.sgml
5.
inactive_timeout means that the slot has been inactive for the
duration specified by replication_slot_inactive_timeout parameter.
nit - suggestion ("longer than"):
... the slot has been inactive for longer than the duration specified
by the replication_slot_inactive_timeout parameter.
Modified.
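For what it's worth, once a slot has been invalidated this way the new
reason is visible straight from the view; a quick illustrative check
(slot name as in the attached TAP test, output trimmed):

SELECT slot_name, active, inactive_since, invalidation_reason
FROM pg_replication_slots
WHERE slot_name = 'lsub1_sync_slot';

 slot_name       | active | inactive_since | invalidation_reason
-----------------+--------+----------------+---------------------
 lsub1_sync_slot | f      | ...            | inactive_timeout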
======
src/backend/replication/slot.c
6.
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
IMO this #define belongs in the slot.h, immediately below where the
enum is defined.
Please check the commit that introduced it -
/messages/by-id/ZdU3CHqza9XJw4P-@paquier.xyz.
It is kept in the file where it's used.
7. ReplicationSlotAcquire:
I had a fundamental question about this logic.
IIUC the purpose of the patch was to invalidate replication slots that
have been inactive for too long.
So, it makes sense to me that some periodic processing (e.g.
CheckPointReplicationSlots) might do a sweep over all the slots, and
invalidate the too-long-inactive ones that it finds.
OTOH, it seemed quite strange to me that the patch logic is also
detecting and invalidating inactive slots during the
ReplicationSlotAcquire function. This is kind of saying "ERROR -
sorry, because this was inactive for too long you can't have it" at
the very moment that you wanted to use it again! IIUC such a slot
would be invalidated by the function InvalidatePossiblyObsoleteSlot(),
but therein lies my doubt -- how can the slot be considered as
"obsolete" when we are in the very act of trying to acquire/use it?I guess it might be argued this is not so different to the scenario of
attempting to acquire a slot that had been invalidated momentarily
before during checkpoint processing. But, somehow that scenario seems
more like bad luck to me, versus ReplicationSlotAcquire() deliberately
invalidating something we *know* is wanted.
Hm. TBH, there's no real reason for invalidating the slot in
ReplicationSlotAcquire(). My thinking back then was to take this
opportunity to do some work. I agree to leave the invalidation work to
the checkpointer. However, I still think ReplicationSlotAcquire()
should error out if the slot has already been invalidated, similar to
the existing "can no longer get changes from replication slot \"%s\""
error for wal_removed.
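To illustrate what I have in mind (the slot name is just an example;
the message wording is from the attached v44 patch), acquiring an
already-invalidated slot would fail roughly like this:

SELECT pg_replication_slot_advance('lsub1_slot', '0/1');
ERROR:  can no longer get changes from replication slot "lsub1_slot"
DETAIL:  The slot became invalid because it was inactive since <timestamp>, which is more than <timeout> seconds ago.
HINT:  You might need to increase "replication_slot_inactive_timeout".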
8.
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive since %s for more than %d seconds specified by \"replication_slot_inactive_timeout\".",
+ timestamptz_to_str(s->inactive_since),
+ replication_slot_inactive_timeout)));
nit - IMO the info should be split into errdetail + errhint. Like this:
errdetail("The slot became invalid because it was inactive since %s,
which is more than %d seconds ago."...)
errhint("You might need to increase \"%s\".",
"replication_slot_inactive_timeout")
"invalid" is being covered by errmsg "invalidating obsolete
replication slot", so no need to duplicate it in errdetail.
9. ReportSlotInvalidation
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s for more than %d seconds specified by \"replication_slot_inactive_timeout\"."),
+ timestamptz_to_str(inactive_since),
+ replication_slot_inactive_timeout);
+ break;
IMO this error in ReportSlotInvalidation() should be the same as the
other one from ReplicationSlotAcquire(), which I suggested above
(comment #8) should include a hint. Also, including a hint here will
make this new message consistent with the other errhint (for
"max_slot_wal_keep_size") that is already in this function.
Not exactly the same, but similar. ReportSlotInvalidation()'s errmsg
has an "invalidating" component, whereas the errmsg in
ReplicationSlotAcquire() doesn't. Please check the latest wording.
10. InvalidatePossiblyObsoleteSlot
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT &&
+ (replication_slot_inactive_timeout > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced)))
10a. Everything here is && so this has some redundant parentheses.
Removed.
10b. Actually, IMO this complicated condition is overkill. Won't it be
better to just unconditionally assign
now = GetCurrentTimestamp(); here?
GetCurrentTimestamp() can get costlier on certain platforms. I think
the checks in the condition are pretty straightforward - e.g.
!RecoveryInProgress() means the server is not in recovery,
!s->data.synced means the slot is not being synced, and so on. Added a
macro IsInactiveTimeoutSlotInvalidationApplicable() for better
readability in the two places that need it.
11.
+ * Note that we don't invalidate synced slots because,
+ * they are typically considered not active as they don't
+ * perform logical decoding to produce the changes.
nit - tweaked punctuation
Used consistent wording across the commit message, docs, and code comments.
12.
+ * If the slot can be acquired, do so or if the slot is already ours,
+ * then mark it invalidated. Otherwise we'll signal the owning
+ * process, below, and retry.
nit - tidied this comment. Suggestion:
If the slot can be acquired, do so and mark it as invalidated. If the
slot is already ours, mark it as invalidated. Otherwise, we'll signal
the owning process below and retry.
Modified.
13.
+ if (active_pid == 0 ||
+ (MyReplicationSlot != NULL &&
+ MyReplicationSlot == s &&
+ active_pid == MyProcPid))
You are already checking MyReplicationSlot == s here, so that extra
check for MyReplicationSlot != NULL is redundant, isn't it?
Removed.
14. CheckPointReplicationSlots
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
nit - /Also, invalidate slots/Also, invalidate obsolete slots/
Modified.
15. + {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING, + gettext_noop("Sets the amount of time to wait before invalidating an " + "inactive replication slot."),nit - that is maybe a bit misleading because IIUC there is no real
"waiting" happening anywhere. Suggest:
Sets the amount of time a replication slot can remain inactive before
it will be invalidated.
Modified.
Please find the attached v44 patch with the above changes. I will
include the 0002 xid_age based invalidation patch later.
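For anyone wanting to try the attached patch quickly, the TAP test
essentially boils down to the following sequence (the 2s timeout is
just what the test uses):

ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
SELECT pg_reload_conf();
-- once a slot has been inactive for longer than the timeout, a
-- non-shutdown checkpoint invalidates it
CHECKPOINT;
SELECT slot_name, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason = 'inactive_timeout';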
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v44-0001-Add-inactive_timeout-based-replication-slot-inva.patchapplication/x-patch; name=v44-0001-Add-inactive_timeout-based-replication-slot-inva.patchDownload
From d0ee643d29f7df6fa39581b4c9304f327c79256a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 31 Aug 2024 07:43:22 +0000
Subject: [PATCH v44] Add inactive_timeout based replication slot invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage for instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named replication_slot_inactive_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are inactive for longer than this amount of
time.
Note that the inactive timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from primary server (i.e., standby slots having 'synced' field
true). Because such synced slots are typically considered not
active (for them to be later considered as inactive) as they don't
perform logical decoding to produce the changes.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Reviewed-by: Ajin Cherian, Shveta Malik, Peter Smith
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CA%2BTgmoZTbaaEjSZUG1FL0mzxAdN3qmXksO3O9_PZhEuXTkVnRQ%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/202403260841.5jcv7ihniccy%40alvherre.pgsql
---
doc/src/sgml/config.sgml | 36 +++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 171 ++++++++++-
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 6 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 283 ++++++++++++++++++
13 files changed, 508 insertions(+), 23 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0aec11f443..970b496e39 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4556,6 +4556,42 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidates replication slots that are inactive for longer than
+ specified amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is default) disables
+ the timeout mechanism. This parameter can only be set in
+ the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+
+ <para>
+ This invalidation check happens either when the slot is acquired
+ for use or during checkpoint. The time since the slot has become
+ inactive is known from its
+ <structfield>inactive_since</structfield> value using which the
+ timeout is measured.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from primary server (i.e., standby slots having
+ <structfield>synced</structfield> field <literal>true</literal>).
+ Because such synced slots are typically considered not active
+ (for them to be later considered as inactive) as they don't perform
+ logical decoding to produce the changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 634a4c0fab..f230e6e572 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2618,6 +2618,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for longer than the duration specified by
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 51072297fd..25c6a68b54 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -448,7 +448,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -667,7 +667,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 0a03776156..26448ecbfd 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -131,6 +132,12 @@ StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
#define SLOT_MAGIC 0x1051CA1 /* format identifier */
#define SLOT_VERSION 5 /* version for new files */
+#define IsInactiveTimeoutSlotInvalidationApplicable(s) \
+ (replication_slot_inactive_timeout > 0 && \
+ s->inactive_since > 0 && \
+ !RecoveryInProgress() && \
+ !s->data.synced)
+
/* Control array for replication slot management */
ReplicationSlotCtlData *ReplicationSlotCtl = NULL;
@@ -140,6 +147,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -159,6 +167,13 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
+ ReplicationSlot *s,
+ XLogRecPtr oldestLSN,
+ Oid dboid,
+ TransactionId snapshotConflictHorizon,
+ bool *invalidated);
+
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
@@ -535,9 +550,13 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if check_for_invalidation is true and the slot has been
+ * invalidated previously.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +634,25 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * Error out if the slot has been invalidated previously. Because there's
+ * no use in acquiring the invalidated slot.
+ */
+ if (check_for_invalidation &&
+ s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ Assert(s->inactive_since > 0);
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("The slot became invalid because it was inactive since %s, which is more than %d seconds ago.",
+ timestamptz_to_str(s->inactive_since),
+ replication_slot_inactive_timeout),
+ errhint("You might need to increase \"%s\".",
+ "replication_slot_inactive_timeout.")));
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +823,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +850,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1501,10 +1539,11 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
- bool hint = false;
+ StringInfo err_hint = NULL;
initStringInfo(&err_detail);
@@ -1514,13 +1553,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
{
unsigned long long ex = oldestLSN - restart_lsn;
- hint = true;
appendStringInfo(&err_detail,
ngettext("The slot's restart_lsn %X/%X exceeds the limit by %llu byte.",
"The slot's restart_lsn %X/%X exceeds the limit by %llu bytes.",
ex),
LSN_FORMAT_ARGS(restart_lsn),
ex);
+
+ err_hint = makeStringInfo();
+ appendStringInfo(err_hint,
+ _("You might need to increase \"%s\"."), "max_slot_wal_keep_size");
break;
}
case RS_INVAL_HORIZON:
@@ -1531,6 +1573,17 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ appendStringInfo(&err_detail,
+ _("The slot became inactive since %s, which is more than %d seconds ago."),
+ timestamptz_to_str(inactive_since),
+ replication_slot_inactive_timeout);
+
+ err_hint = makeStringInfo();
+ appendStringInfo(err_hint,
+ _("You might need to increase \"%s\"."), "replication_slot_inactive_timeout");
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1542,9 +1595,12 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
errmsg("invalidating obsolete replication slot \"%s\"",
NameStr(slotname)),
errdetail_internal("%s", err_detail.data),
- hint ? errhint("You might need to increase \"%s\".", "max_slot_wal_keep_size") : 0);
+ (err_hint != NULL) ? errhint("%s", err_hint->data) : 0);
pfree(err_detail.data);
+
+ if (err_hint != NULL)
+ destroyStringInfo(err_hint);
}
/*
@@ -1574,6 +1630,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1581,6 +1638,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1591,6 +1649,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT &&
+ IsInactiveTimeoutSlotInvalidationApplicable(s))
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1644,6 +1712,41 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ /*
+ * Quick exit if inactive timeout invalidation mechanism
+ * is disabled or slot is currently being used or the
+ * server is in recovery mode or the slot on standby is
+ * currently being synced from the primary.
+ *
+ * Note that the inactive timeout invalidation mechanism
+ * is not applicable for slots on the standby server that
+ * are being synced from primary server. Because such
+ * synced slots are typically considered not active (for
+ * them to be later considered as inactive) as they don't
+ * perform logical decoding to produce the changes.
+ */
+ if (!IsInactiveTimeoutSlotInvalidationApplicable(s))
+ break;
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+
+ /*
+ * Invalidation due to inactive timeout implies that
+ * no one is using the slot.
+ */
+ Assert(s->active_pid == 0);
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1669,11 +1772,13 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s &&
+ active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1728,7 +1833,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1774,7 +1880,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1797,6 +1904,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1849,7 +1957,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1907,6 +2016,38 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do inactive_timeout invalidation
+ * of thousands of replication slots here. If it is ever proven that
+ * this assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index c7bfbb15e0..b1b7b075bd 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -540,7 +540,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index c5f1009f37..61a0e38715 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -844,7 +844,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1462,7 +1462,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 521ec5591c..675eb115ac 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time a replication slot can remain inactive before "
+ "it will be invalidated."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 667e0dc40a..deca3a4aeb 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -335,6 +335,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 45582cf9d8..431cc08c99 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -233,6 +235,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -249,7 +252,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 712924c2fa..301be0f6c1 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -10,6 +10,7 @@ tests += {
'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
},
'tests': [
+ 't/050_invalidate_slots.pl',
't/001_stream_rep.pl',
't/002_archiving.pl',
't/003_recovery_targets.pl',
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..fa6a12a12d
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,283 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start
+#
+# Invalidate streaming standby's slot as well as logical
+# failover slot on primary due to replication_slot_inactive_timeout. Also,
+# check the logical failover slot synced on to the standby doesn't invalidate
+# the slot on its own, but gets the invalidated state from the remote slot on
+# the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', q{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr_1 = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb1_slot'
+primary_conninfo = '$connstr_1 dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('lsub1_sync_slot', 'test_decoding', false, false, true);}
+);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb1_slot', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Synchronize the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby and is
+# flagged as 'synced'.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot lsub1_sync_slot has synced as true on standby');
+
+my $standby1_logstart = -s $standby1->logfile;
+my $logstart = -s $primary->logfile;
+my $inactive_timeout = 2;
+
+# Set timeout so that the next checkpoint will invalidate the inactive
+# replication slot.
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$primary->reload;
+
+# Wait for the logical failover slot to become inactive on the primary. Note
+# that nobody has acquired that slot yet, so due to
+# replication_slot_inactive_timeout setting above it must get invalidated.
+wait_for_slot_invalidation($primary, 'lsub1_sync_slot', $logstart,
+ $inactive_timeout);
+
+# Set timeout on the standby also to check the synced slots don't get
+# invalidated due to timeout on the standby.
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '2s';
+]);
+$standby1->reload;
+
+# Now, sync the logical failover slot from the remote slot on the primary.
+# Note that the remote slot has already been invalidated due to inactive
+# timeout. Now, the standby must also see it as invalidated.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Wait for the inactive replication slot to be invalidated.
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for lsub1_sync_slot invalidation to be synced on standby";
+
+# Synced slot mustn't get invalidated on the standby even after a checkpoint,
+# it must sync invalidation from the primary. So, we must not see the slot's
+# invalidation message in server log.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"lsub1_sync_slot\"",
+ $standby1_logstart),
+ 'check that synced lsub1_sync_slot has not been invalidated on the standby'
+);
+
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive
+wait_for_slot_invalidation($primary, 'sb1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end
+# =============================================================================
+
+# =============================================================================
+# Testcase start
+# Invalidate logical subscriber's slot due to replication_slot_inactive_timeout.
+
+my $publisher = $primary;
+
+# Prepare for the next test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO ' ${inactive_timeout}s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+# =============================================================================
+
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset, $inactive_timeout) = @_;
+ my $name = $node->name;
+
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for slot $slot_name to become inactive on node $name";
+
+ # Wait for the replication slot info to be updated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for info of slot $slot_name to be updated on node $name";
+
+ # Sleep at least $inactive_timeout duration to avoid multiple checkpoints
+ # for the slot to get invalidated.
+ sleep($inactive_timeout);
+
+ check_for_slot_invalidation_in_server_log($node, $slot_name, $offset);
+
+ # Wait for the inactive replication slot to be invalidated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for inactive slot $slot_name to be invalidated on node $name";
+
+ # Check that the invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot_name', '0/1');
+ ]);
+
+ ok( $stderr =~
+ /can no longer get changes from replication slot "$slot_name"/,
+ "detected error upon trying to acquire invalidated slot $slot_name on node $name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot_name on node $name";
+}
+
+# Check for invalidation of slot in server log
+sub check_for_slot_invalidation_in_server_log
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $name = $node->name;
+ my $invalidated = 0;
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot_name\"",
+ $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot_name invalidation has been logged on node $name"
+ );
+}
+
+done_testing();
--
2.43.0
On Thu, Aug 29, 2024 at 11:31 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Thanks for looking into this.
On Mon, Aug 26, 2024 at 4:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Few comments on 0001:
1.
@@ -651,6 +651,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid
+ /*
+ * Skip the sync if the local slot is already invalidated. We do this
+ * beforehand to avoid slot acquire and release.
+ */
I was wondering why you have added this new check as part of this
patch. If you see the following comments in the related code, you will
know why we haven't done this previously.

Removed. Can deal with the optimization separately.

2.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+     0, InvalidOid, InvalidTransactionId);
Why do we want to call this for the shutdown case (when is_shutdown is
true)? I understand trying to invalidate slots during regular
checkpoint but not sure if we need it at the time of shutdown.

Changed it to invalidate only for non-shutdown checkpoints.
inactive_timeout invalidation isn't critical for shutdown, unlike
wal_removed, which can help shutdown by freeing up some disk space.

The other point is: can we try to check the performance impact with
100s of slots as mentioned in the code comments?

I first checked how much the wal_removed invalidation check adds to the
checkpoint (see the 2nd and 3rd columns). I then checked how much the
inactive_timeout invalidation check adds to the checkpoint (see the 4th
column); it is not more than the wal_removed invalidation check. I then
checked how much the wal_removed invalidation check adds for replication
slots that have already been invalidated due to inactive_timeout (see
the 5th column); it looks like not much.

| # of slots | HEAD (no invalidation) ms | HEAD (wal_removed) ms | PATCHED (inactive_timeout) ms | PATCHED (inactive_timeout+wal_removed) ms |
|------------|---------------------------|-----------------------|-------------------------------|-------------------------------------------|
| 100        | 18.591                    | 370.586               | 359.299                       | 373.882                                   |
| 1000       | 15.722                    | 4834.901              | 5081.751                      | 5072.128                                  |
| 10000      | 19.261                    | 59801.062             | 61270.406                     | 60270.099                                 |

Having said that, I'm okay to implement the optimization specified. Thoughts?
The other possibility is to try invalidating due to timeout along with
wal_removed case during checkpoint. The idea is that if the slot can
be invalidated due to WAL then fine, otherwise check if it can be
invalidated due to timeout. This can avoid looping the slots and doing
similar work multiple times during the checkpoint.
--
With Regards,
Amit Kapila.
Hi. Thanks for addressing my previous review comments.
Here are some review comments for v44-0001.
======
Commit message.
1.
Because such synced slots are typically considered not
active (for them to be later considered as inactive) as they don't
perform logical decoding to produce the changes.
~
This sentence is bad grammar. The docs have the same wording, so
please see my doc review comment #4 suggestion below.
======
doc/src/sgml/config.sgml
2.
+ <para>
+ Invalidates replication slots that are inactive for longer than
+ specified amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is default) disables
+ the timeout mechanism. This parameter can only be set in
+ the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+
nit - This is OK as-is, but OTOH why not make the wording consistent
with the previous GUC description? (e.g. see my v43 [1] #2 review
comment)
~~~
3.
+ <para>
+ This invalidation check happens either when the slot is acquired
+ for use or during checkpoint. The time since the slot has become
+ inactive is known from its
+ <structfield>inactive_since</structfield> value using which the
+ timeout is measured.
+ </para>
+
I felt this is slightly misleading because slot acquiring has nothing
to do with setting the slot invalidation anymore. Furthermore, the 2nd
sentence is bad grammar.
nit - IMO something simple like the following rewording can address
both of those points:
Slot invalidation due to inactivity timeout occurs during checkpoint.
The duration of slot inactivity is calculated using the slot's
<structfield>inactive_since</structfield> field value.
~
4.
+ Because such synced slots are typically considered not active
+ (for them to be later considered as inactive) as they don't perform
+ logical decoding to produce the changes.
That sentence has bad grammar.
nit – suggest a much simpler replacement:
Synced slots are always considered to be inactive because they don't
perform logical decoding to produce changes.
======
src/backend/replication/slot.c
5.
+#define IsInactiveTimeoutSlotInvalidationApplicable(s) \
+ (replication_slot_inactive_timeout > 0 && \
+ s->inactive_since > 0 && \
+ !RecoveryInProgress() && \
+ !s->data.synced)
+
5a.
I felt this would be better implemented as an inline function. Then it
can be commented on properly to explain the parts of the condition.
e.g. the large comment currently in InvalidatePossiblyObsoleteSlot()
would be more appropriate in this function.
~
5b.
The name is very long. Can't it be something shorter/simpler like:
'IsSlotATimeoutCandidate()'
~~~
6. ReplicationSlotAcquire
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation)
nit - Previously this new parameter really did mean to "check" for
[and set the slot] invalidation. But now I suggest renaming it to
'error_if_invalid' to properly reflect the new usage. And also in the
slot.h.
~
7.
+ /*
+ * Error out if the slot has been invalidated previously. Because there's
+ * no use in acquiring the invalidated slot.
+ */
nit - The comment is contrary to the code. If there was no reason to
skip this error, then you would not have the new parameter allowing
you to skip this error. I suggest just repeating the same comment as
in the function header.
~~~
8. ReportSlotInvalidation
nit - Added some blank lines for consistency.
~~~
9. InvalidatePossiblyObsoleteSlot
+ /*
+ * Quick exit if inactive timeout invalidation mechanism
+ * is disabled or slot is currently being used or the
+ * server is in recovery mode or the slot on standby is
+ * currently being synced from the primary.
+ *
+ * Note that the inactive timeout invalidation mechanism
+ * is not applicable for slots on the standby server that
+ * are being synced from primary server. Because such
+ * synced slots are typically considered not active (for
+ * them to be later considered as inactive) as they don't
+ * perform logical decoding to produce the changes.
+ */
+ if (!IsInactiveTimeoutSlotInvalidationApplicable(s))
+ break;
9a.
Consistency is good (commit message, docs and code comments for this),
but the added sentence has bad grammar. Please see the docs review
comment #4 above for some alternate phrasing.
~
9b.
Now that this logic is moved into a macro (I suggested it should be an
inline function) IMO this comment does not belong here anymore because
it is commenting code that you cannot see. Instead, this comment (or
something like it) should be as comments within the new function.
======
src/include/replication/slot.h
10.
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool check_for_invalidation);
Change the new param name as described in the earlier review comment.
======
src/test/recovery/t/050_invalidate_slots.pl
~~~
Please refer to the attached file which implements some of the nits
mentioned above.
======
[1]: /messages/by-id/CAHut+PuFzCHPCiZbpoQX59kgZbebuWT0gR0O7rOe4t_sdYu=OA@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
PS_NITPICKS_v440001.txttext/plain; charset=US-ASCII; name=PS_NITPICKS_v440001.txtDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 970b496..0537714 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4564,8 +4564,8 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</term>
<listitem>
<para>
- Invalidates replication slots that are inactive for longer than
- specified amount of time. If this value is specified without units,
+ Invalidate replication slots that are inactive for longer than this
+ amount of time. If this value is specified without units,
it is taken as seconds. A value of zero (which is default) disables
the timeout mechanism. This parameter can only be set in
the <filename>postgresql.conf</filename> file or on the server
@@ -4573,11 +4573,9 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</para>
<para>
- This invalidation check happens either when the slot is acquired
- for use or during checkpoint. The time since the slot has become
- inactive is known from its
- <structfield>inactive_since</structfield> value using which the
- timeout is measured.
+ Slot invalidation due to inactivity timeout occurs during checkpoint.
+ The duration of slot inactivity is calculated using the slot's
+ <structfield>inactive_since</structfield> field value.
</para>
<para>
@@ -4585,9 +4583,8 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
applicable for slots on the standby server that are being synced
from primary server (i.e., standby slots having
<structfield>synced</structfield> field <literal>true</literal>).
- Because such synced slots are typically considered not active
- (for them to be later considered as inactive) as they don't perform
- logical decoding to produce the changes.
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index acc0370..bb06592 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -551,12 +551,11 @@ ReplicationSlotName(int index, Name name)
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
*
- * An error is raised if check_for_invalidation is true and the slot has been
+ * An error is raised if error_if_invalid is true and the slot has been
* invalidated previously.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait,
- bool check_for_invalidation)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -635,11 +634,10 @@ retry:
MyReplicationSlot = s;
/*
- * Error out if the slot has been invalidated previously. Because there's
- * no use in acquiring the invalidated slot.
+ * An error is raised if error_if_invalid is true and the slot has been
+ * invalidated previously.
*/
- if (check_for_invalidation &&
- s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ if (error_if_invalid && s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
{
Assert(s->inactive_since > 0);
ereport(ERROR,
@@ -1565,6 +1563,7 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
_("You might need to increase \"%s\"."), "max_slot_wal_keep_size");
break;
}
+
case RS_INVAL_HORIZON:
appendStringInfo(&err_detail, _("The slot conflicted with xid horizon %u."),
snapshotConflictHorizon);
@@ -1573,6 +1572,7 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
case RS_INVAL_INACTIVE_TIMEOUT:
Assert(inactive_since > 0);
appendStringInfo(&err_detail,
@@ -1584,6 +1584,7 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
appendStringInfo(err_hint,
_("You might need to increase \"%s\"."), "replication_slot_inactive_timeout");
break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 431cc08..5678e1a 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -253,7 +253,7 @@ extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
extern void ReplicationSlotAcquire(const char *name, bool nowait,
- bool check_for_invalidation);
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
On Sat, Aug 31, 2024 at 1:45 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Please find the attached v44 patch with the above changes. I will
include the 0002 xid_age based invalidation patch later.
It is better to get the 0001 reviewed and committed first. We can
discuss about 0002 afterwards as 0001 is in itself a complete and
separate patch that can be committed.
--
With Regards,
Amit Kapila.
Hi, my previous review posts did not cover the test code.
Here are my review comments for the v44-0001 test code
======
TEST CASE #1
1.
+# Wait for the inactive replication slot to be invalidated.
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for lsub1_sync_slot invalidation to be
synced on standby";
+
Is that comment correct? IIUC the synced slot should *already* be
invalidated from the primary, so here we are not really "waiting" for
it to be invalidated; Instead, we are just "confirming" that the
synchronized slot is already invalidated with the correct reason as
expected.
~~~
2.
+# Synced slot mustn't get invalidated on the standby even after a checkpoint,
+# it must sync invalidation from the primary. So, we must not see the slot's
+# invalidation message in server log.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"lsub1_sync_slot\"",
+ $standby1_logstart),
+ 'check that synced lsub1_sync_slot has not been invalidated on the standby'
+);
+
This test case seemed bogus, for a couple of reasons:
2a. IIUC this 'lsub1_sync_slot' is the same one that is already
invalid (from the primary), so nobody should be surprised that an
already invalid slot doesn't get flagged as invalid again. i.e.
Shouldn't your test scenario here be done using a valid synced slot?
2b. AFAICT it was only moments above this CHECKPOINT where you
assigned the standby inactivity timeout to 2s. So even if there was
some bug invalidating synced slots I don't think you gave it enough
time to happen -- e.g. I doubt 2s has elapsed yet.
~
3.
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive
+wait_for_slot_invalidation($primary, 'sb1_slot', $logstart,
+ $inactive_timeout);
This seems a bit tricky. Both these (the stop and the wait) seem to
belong together, so I think maybe a single bigger explanatory comment
covering both parts would help for understanding.
======
TEST CASE #2
4.
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+ $inactive_timeout);
IIUC, this is just like comment #3 above. Both these (the stop and the
wait) seem to belong together, so I think maybe a single bigger
explanatory comment covering both parts would help for understanding.
~~~
5.
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+# =============================================================================
IMO the rest of the comment after "Testcase end" isn't very useful.
======
sub wait_for_slot_invalidation
6.
+sub wait_for_slot_invalidation
+{
An explanatory header comment for this subroutine would be helpful.
~~~
7.
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for slot $slot_name to become inactive on
node $name";
+
+ # Wait for the replication slot info to be updated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = '$slot_name' AND active = 'f';
+ ])
+ or die
+ "Timed out while waiting for info of slot $slot_name to be updated
on node $name";
+
Why are there 2 separate poll_query_until's here? Can't those be
combined into just one?
~~~
8.
+ # Sleep at least $inactive_timeout duration to avoid multiple checkpoints
+ # for the slot to get invalidated.
+ sleep($inactive_timeout);
+
Maybe this special sleep to prevent too many CHECKPOINTs should be
moved to be inside the other subroutine, which is actually doing those
CHECKPOINTs.
~~~
9.
+ # Wait for the inactive replication slot to be invalidated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for inactive slot $slot_name to be
invalidated on node $name";
+
The comment seems misleading. IIUC you are not "waiting" for the
invalidation here, because it is the other subroutine doing the
waiting for the invalidation message in the logs. Instead, here I
think you are just confirming the 'invalidation_reason' got set
correctly. The comment should say what it is really doing.
======
sub check_for_slot_invalidation_in_server_log
10.
+# Check for invalidation of slot in server log
+sub check_for_slot_invalidation_in_server_log
+{
I think the main function of this subroutine is the CHECKPOINT and the
waiting for the server log to say invalidation happened. It is doing a
loop of a) CHECKPOINT then b) inspecting the server log for the slot
invalidation, and c) waiting for a bit. Repeat 10 times.
A comment describing the logic for this subroutine would be helpful.
The most important side-effect of this function is the CHECKPOINT
because without that nothing will ever get invalidated due to
inactivity, but this key point is not obvious from the subroutine
name.
IMO it would be better to name this differently to reflect what it is
really doing:
e.g. "CHECKPOINT_and_wait_for_slot_invalidation_in_server_log"
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Sat, Aug 31, 2024 at 1:45 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Hi,
Please find the attached v44 patch with the above changes. I will
include the 0002 xid_age based invalidation patch later.
Thanks for the patch, Bharath. My review and testing are WIP, but please
find a few comments and queries:
1)
I see that ReplicationSlotAlter() will error out if the slot is
invalidated due to timeout. I have not tested it myself, but do you
know if slot-alter errors out for other invalidation causes as well?
Just wanted to confirm that the behaviour is consistent for all
invalidation causes.
2)
When a slot is invalidated, and we try to use that slot, it gives this msg:
ERROR: can no longer get changes from replication slot "mysubnew1_2"
DETAIL: The slot became invalid because it was inactive since
2024-09-03 14:23:34.094067+05:30, which is more than 600 seconds ago.
HINT: You might need to increase "replication_slot_inactive_timeout.".
Isn't HINT misleading? Even if we increase it now, the slot can not be
reused again.
3)
When the slot is invalidated, the 'inactive_since' still keeps on
changing when there is a subscriber trying to start replication
continuously. I think ReplicationSlotAcquire() keeps on failing and
thus Release keeps on setting it again and again. Shouldn't we stop
setting/changing 'inactive_since' once the slot is invalidated
already? Otherwise it will be misleading.
postgres=# select failover,synced,inactive_since,invalidation_reason
from pg_replication_slots;
failover | synced | inactive_since | invalidation_reason
----------+--------+----------------------------------+---------------------
t | f | 2024-09-03 14:23:.. | inactive_timeout
after sometime:
failover | synced | inactive_since | invalidation_reason
----------+--------+----------------------------------+---------------------
t | f | 2024-09-03 14:26:..| inactive_timeout
4)
src/sgml/config.sgml:
4a)
+ A value of zero (which is default) disables the timeout mechanism.
Better will be:
A value of zero (which is default) disables the inactive timeout
invalidation mechanism.
or
A value of zero (which is default) disables the slot invalidation due
to the inactive timeout mechanism.
i.e. rephrase to indicate that invalidation is disabled.
4b)
'synced' and inactive_since should point to pg_replication_slots:
example:
<link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
5)
src/sgml/system-views.sgml:
+ ..the slot has been inactive for longer than the duration specified
by replication_slot_inactive_timeout parameter.
Better to have:
..the slot has been inactive for a time longer than the duration
specified by the replication_slot_inactive_timeout parameter.
thanks
Shveta
On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote:
1)
I see that ReplicationSlotAlter() will error out if the slot is
invalidated due to timeout. I have not tested it myself, but do you
know if slot-alter errors out for other invalidation causes as well?
Just wanted to confirm that the behaviour is consistent for all
invalidation causes.
I was able to test this and as anticipated behavior is different. When
slot is invalidated due to say 'wal_removed', I am still able to do
'alter' of that slot.
Please see:
Pub:
  slot_name  | failover | synced |          inactive_since          | invalidation_reason
-------------+----------+--------+----------------------------------+---------------------
 mysubnew1_1 | t        | f      | 2024-09-04 08:58:12.802278+05:30 | wal_removed
Sub:
newdb1=# alter subscription mysubnew1_1 disable;
ALTER SUBSCRIPTION
newdb1=# alter subscription mysubnew1_1 set (failover=false);
ALTER SUBSCRIPTION
Pub: (failover altered)
  slot_name  | failover | synced |          inactive_since          | invalidation_reason
-------------+----------+--------+----------------------------------+---------------------
 mysubnew1_1 | f        | f      | 2024-09-04 08:58:47.824471+05:30 | wal_removed
while when invalidation_reason is 'inactive_timeout', it fails:
Pub:
  slot_name  | failover | synced |          inactive_since          | invalidation_reason
-------------+----------+--------+----------------------------------+---------------------
 mysubnew1_1 | t        | f      | 2024-09-03 14:30:57.532206+05:30 | inactive_timeout
Sub:
newdb1=# alter subscription mysubnew1_1 disable;
ALTER SUBSCRIPTION
newdb1=# alter subscription mysubnew1_1 set (failover=false);
ERROR: could not alter replication slot "mysubnew1_1": ERROR: can no
longer get changes from replication slot "mysubnew1_1"
DETAIL: The slot became invalid because it was inactive since
2024-09-04 08:54:20.308996+05:30, which is more than 0 seconds ago.
HINT: You might need to increase "replication_slot_inactive_timeout.".
I think the behavior should be same.
thanks
Shveta
On Wed, Sep 4, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote:
On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote:
1)
It is related to one of my previous comments (pt 3 in [1]) where I
stated that inactive_since should not keep on changing once a slot is
invalidated.
Below is one side effect if inactive_since keeps on changing:
postgres=# SELECT * FROM pg_replication_slot_advance('mysubnew1_1',
pg_current_wal_lsn());
ERROR: can no longer get changes from replication slot "mysubnew1_1"
DETAIL: The slot became invalid because it was inactive since
2024-09-04 10:03:56.68053+05:30, which is more than 10 seconds ago.
HINT: You might need to increase "replication_slot_inactive_timeout.".
postgres=# select now();
now
---------------------------------
2024-09-04 10:04:00.26564+05:30
'DETAIL' gives wrong information; we are not past 10 seconds. This is
because inactive_since got updated even in the ERROR scenario.
2)
One more issue in this message is, once I set
replication_slot_inactive_timeout to a bigger value, it becomes more
misleading. This is because invalidation was done in the past using
previous value while message starts showing new value:
ALTER SYSTEM SET replication_slot_inactive_timeout TO '36h';
--see 129600 secs in DETAIL and the current time.
postgres=# SELECT * FROM pg_replication_slot_advance('mysubnew1_1',
pg_current_wal_lsn());
ERROR: can no longer get changes from replication slot "mysubnew1_1"
DETAIL: The slot became invalid because it was inactive since
2024-09-04 10:06:38.980939+05:30, which is more than 129600 seconds
ago.
postgres=# select now();
now
----------------------------------
2024-09-04 10:07:35.201894+05:30
I feel we should change this message itself.
~~~~~
When invalidation is due to wal_removed, we get a way simpler message:
newdb1=# SELECT * FROM pg_replication_slot_advance('mysubnew1_2',
pg_current_wal_lsn());
ERROR: replication slot "mysubnew1_2" cannot be advanced
DETAIL: This slot has never previously reserved WAL, or it has been
invalidated.
This message does not mention 'max_slot_wal_keep_size'. We should have
a similar message for our case. Thoughts?
[1]: /messages/by-id/CAJpy0uC8Dg-0JS3NRUwVUemgz5Ar2v3_EQQFXyAigWSEQ8U47Q@mail.gmail.com
thanks
Shveta
On Wed, Sep 4, 2024 at 2:49 PM shveta malik <shveta.malik@gmail.com> wrote:
On Wed, Sep 4, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote:
On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote:
1)
It is related to one of my previous comments (pt 3 in [1]) where I
stated that inactive_since should not keep on changing once a slot is
invalidated.
Agreed. Updating the inactive_since for a slot that is already invalid
is misleading.
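For illustration only, a minimal sketch of the guard being discussed
(placement and locking are simplified; field names follow the patch above):

    /* Sketch: don't refresh inactive_since once the slot is invalidated. */
    if (s->data.invalidated == RS_INVAL_NONE)
        s->inactive_since = GetCurrentTimestamp();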
2)
One more issue in this message is, once I set
replication_slot_inactive_timeout to a bigger value, it becomes more
misleading. This is because invalidation was done in the past using
previous value while message starts showing new value:

ALTER SYSTEM SET replication_slot_inactive_timeout TO '36h';
--see 129600 secs in DETAIL and the current time.
postgres=# SELECT * FROM pg_replication_slot_advance('mysubnew1_1',
pg_current_wal_lsn());
ERROR: can no longer get changes from replication slot "mysubnew1_1"
DETAIL: The slot became invalid because it was inactive since
2024-09-04 10:06:38.980939+05:30, which is more than 129600 seconds
ago.
postgres=# select now();
now
----------------------------------
2024-09-04 10:07:35.201894+05:30

I feel we should change this message itself.
+1.
--
With Regards,
Amit Kapila.
On Wed, Sep 4, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote:
On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote:
1)
I see that ReplicationSlotAlter() will error out if the slot is
invalidated due to timeout. I have not tested it myself, but do you
know if slot-alter errors out for other invalidation causes as well?
Just wanted to confirm that the behaviour is consistent for all
invalidation causes.

I was able to test this and as anticipated behavior is different. When
slot is invalidated due to say 'wal_removed', I am still able to do
'alter' of that slot.
Please see:

Pub:
  slot_name  | failover | synced |          inactive_since          | invalidation_reason
-------------+----------+--------+----------------------------------+---------------------
 mysubnew1_1 | t        | f      | 2024-09-04 08:58:12.802278+05:30 | wal_removed

Sub:
newdb1=# alter subscription mysubnew1_1 disable;
ALTER SUBSCRIPTION
newdb1=# alter subscription mysubnew1_1 set (failover=false);
ALTER SUBSCRIPTION

Pub: (failover altered)
  slot_name  | failover | synced |          inactive_since          | invalidation_reason
-------------+----------+--------+----------------------------------+---------------------
 mysubnew1_1 | f        | f      | 2024-09-04 08:58:47.824471+05:30 | wal_removed

while when invalidation_reason is 'inactive_timeout', it fails:

Pub:
  slot_name  | failover | synced |          inactive_since          | invalidation_reason
-------------+----------+--------+----------------------------------+---------------------
 mysubnew1_1 | t        | f      | 2024-09-03 14:30:57.532206+05:30 | inactive_timeout

Sub:
newdb1=# alter subscription mysubnew1_1 disable;
ALTER SUBSCRIPTION
newdb1=# alter subscription mysubnew1_1 set (failover=false);
ERROR: could not alter replication slot "mysubnew1_1": ERROR: can no
longer get changes from replication slot "mysubnew1_1"
DETAIL: The slot became invalid because it was inactive since
2024-09-04 08:54:20.308996+05:30, which is more than 0 seconds ago.
HINT: You might need to increase "replication_slot_inactive_timeout.".

I think the behavior should be same.
We should not allow the invalid replication slot to be altered
irrespective of the reason unless there is any benefit.
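For illustration, a hypothetical sketch (not part of the posted patches) of a
uniform check in ReplicationSlotAlter() that rejects any invalidated slot,
whatever the cause; the exact error wording is still under discussion:

    /* Sketch: reject ALTER on any invalidated slot, regardless of cause. */
    if (MyReplicationSlot->data.invalidated != RS_INVAL_NONE)
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot alter invalid replication slot \"%s\"",
                        NameStr(MyReplicationSlot->data.name))));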
--
With Regards,
Amit Kapila.
Hi,
Thanks for reviewing.
On Mon, Sep 2, 2024 at 1:37 PM Peter Smith <smithpb2250@gmail.com> wrote:
Commit message.
1.
Because such synced slots are typically considered not
active (for them to be later considered as inactive) as they don't
perform logical decoding to produce the changes.
This sentence is bad grammar. The docs have the same wording, so
please see my doc review comment #4 suggestion below.
+1
2.
+ <para>
+ Invalidates replication slots that are inactive for longer than
+ specified amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is default) disables
+ the timeout mechanism. This parameter can only be set in
+ the <filename>postgresql.conf</filename> file or on the server
+ command line.
+ </para>
+
nit - This is OK as-is, but OTOH why not make the wording consistent
with the previous GUC description? (e.g. see my v43 [1] #2 review
comment)
+1.
3.
+ <para>
+ This invalidation check happens either when the slot is acquired
+ for use or during checkpoint. The time since the slot has become
+ inactive is known from its
+ <structfield>inactive_since</structfield> value using which the
+ timeout is measured.
+ </para>
+
I felt this is slightly misleading because slot acquiring has nothing
to do with setting the slot invalidation anymore. Furthermore, the 2nd
sentence is bad grammar.

nit - IMO something simple like the following rewording can address
both of those points:

Slot invalidation due to inactivity timeout occurs during checkpoint.
The duration of slot inactivity is calculated using the slot's
<structfield>inactive_since</structfield> field value.
+1.
4.
+ Because such synced slots are typically considered not active
+ (for them to be later considered as inactive) as they don't perform
+ logical decoding to produce the changes.
That sentence has bad grammar.
nit – suggest a much simpler replacement:
Synced slots are always considered to be inactive because they don't
perform logical decoding to produce changes.
+1.
5.
+#define IsInactiveTimeoutSlotInvalidationApplicable(s) \

5a.
I felt this would be better implemented as an inline function. Then it
can be commented on properly to explain the parts of the condition.
e.g. the large comment currently in InvalidatePossiblyObsoleteSlot()
would be more appropriate in this function.
+1.
5b.
The name is very long. Can't it be something shorter/simpler like:
'IsSlotATimeoutCandidate()'
~~~

The suggested name is missing "inactive". Used
SlotInactiveTimeoutCheckAllowed, similar to XLogInsertAllowed.
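For reference, a minimal sketch of what such an inline helper could look like,
based on the macro quoted above (the exact shape in v45 may differ):

    static inline bool
    SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s)
    {
        /*
         * The inactive timeout invalidation check applies only when a
         * timeout is configured, the slot has inactive_since set, the
         * server is not in recovery, and the slot is not being synced
         * from the primary (synced slots never perform logical decoding,
         * so they are always considered inactive).
         */
        return (replication_slot_inactive_timeout > 0 &&
                s->inactive_since > 0 &&
                !RecoveryInProgress() &&
                !s->data.synced);
    }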
6. ReplicationSlotAcquire
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+                       bool check_for_invalidation)

nit - Previously this new parameter really did mean to "check" for
[and set the slot] invalidation. But now I suggest renaming it to
'error_if_invalid' to properly reflect the new usage. And also in the
slot.h.
+1.
7.
+ /*
+ * Error out if the slot has been invalidated previously. Because there's
+ * no use in acquiring the invalidated slot.
+ */
nit - The comment is contrary to the code. If there was no reason to
skip this error, then you would not have the new parameter allowing
you to skip this error. I suggest just repeating the same comment as
in the function header.
+1.
8. ReportSlotInvalidation
nit - Added some blank lines for consistency.
+1.
9. InvalidatePossiblyObsoleteSlot
9a.
Consistency is good (commit message, docs and code comments for this),
but the added sentence has bad grammar. Please see the docs review
comment #4 above for some alternate phrasing.
+1.
9b.
Now that this logic is moved into a macro (I suggested it should be an
inline function) IMO this comment does not belong here anymore because
it is commenting code that you cannot see. Instead, this comment (or
something like it) should be as comments within the new function.

======
src/include/replication/slot.h
+1.
10.
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+                                   bool check_for_invalidation);
Change the new param name as described in the earlier review comment.
+1.
Please refer to the attached file which implements some of the nits
mentioned above.
Merged the diff into v45. Thanks.
On Tue, Sep 3, 2024 at 12:26 PM Peter Smith <smithpb2250@gmail.com> wrote:
TEST CASE #1
1.
+# Wait for the inactive replication slot to be invalidated.
Is that comment correct? IIUC the synced slot should *already* be
invalidated from the primary, so here we are not really "waiting" for
it to be invalidated; Instead, we are just "confirming" that the
synchronized slot is already invalidated with the correct reason as
expected.
Modified the comment.
2.
+# Synced slot mustn't get invalidated on the standby even after a checkpoint,
+# it must sync invalidation from the primary. So, we must not see the slot's
+# invalidation message in server log.
This test case seemed bogus, for a couple of reasons:
2a. IIUC this 'lsub1_sync_slot' is the same one that is already
invalid (from the primary), so nobody should be surprised that an
already invalid slot doesn't get flagged as invalid again. i.e.
Shouldn't your test scenario here be done using a valid synced slot?
+1. Added another test case for checking the synced slot not getting
invalidated despite inactive timeout being set.
2b. AFAICT it was only moments above this CHECKPOINT where you
assigned the standby inactivity timeout to 2s. So even if there was
some bug invalidating synced slots I don't think you gave it enough
time to happen -- e.g. I doubt 2s has elapsed yet.
Added sleep(timeout+1) before the checkpoint.
3.
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive

TEST CASE #2

4.
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+  $inactive_timeout);
IIUC, this is just like comment #3 above. Both these (the stop and the
wait) seem to belong together, so I think maybe a single bigger
explanatory comment covering both parts would help for understanding.
Done.
5.
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+# =============================================================================
IMO the rest of the comment after "Testcase end" isn't very useful.
Removed.
======
sub wait_for_slot_invalidation

6.
+sub wait_for_slot_invalidation
+{
An explanatory header comment for this subroutine would be helpful.
Done.
7.
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
Why are there 2 separate poll_query_until's here? Can't those be
combined into just one?
Ah. My bad. Removed.
~~~
8.
+ # Sleep at least $inactive_timeout duration to avoid multiple checkpoints
+ # for the slot to get invalidated.
+ sleep($inactive_timeout);
+
Maybe this special sleep to prevent too many CHECKPOINTs should be
moved to be inside the other subroutine, which is actually doing those
CHECKPOINTs.
Done.
9.
+ # Wait for the inactive replication slot to be invalidated
+ "Timed out while waiting for inactive slot $slot_name to be invalidated on node $name";
+
The comment seems misleading. IIUC you are not "waiting" for the
invalidation here, because it is the other subroutine doing the
waiting for the invalidation message in the logs. Instead, here I
think you are just confirming the 'invalidation_reason' got set
correctly. The comment should say what it is really doing.
Modified.
sub check_for_slot_invalidation_in_server_log
10. +# Check for invalidation of slot in server log +sub check_for_slot_invalidation_in_server_log +{

I think the main function of this subroutine is the CHECKPOINT and the
waiting for the server log to say invalidation happened. It is doing a
loop of a) CHECKPOINT then b) inspecting the server log for the slot
invalidation, and c) waiting for a bit. Repeat 10 times.

A comment describing the logic for this subroutine would be helpful.
The most important side-effect of this function is the CHECKPOINT
because without that nothing will ever get invalidated due to
inactivity, but this key point is not obvious from the subroutine
name.

IMO it would be better to name this differently to reflect what it is
really doing:
e.g. "CHECKPOINT_and_wait_for_slot_invalidation_in_server_log"
That would be too long. Changed the function name to
trigger_slot_invalidation() which is appropriate.
Please find the v45 patch. Addressed above and Shveta's review comments [1].
Amit's comments [2] and [3] are still pending.
[1]: /messages/by-id/CAJpy0uC8Dg-0JS3NRUwVUemgz5Ar2v3_EQQFXyAigWSEQ8U47Q@mail.gmail.com
[2]: /messages/by-id/CAA4eK1K7DdT_5HnOWs5tVPYC=-h+m85wu7k-7RVJaJ7zMxprWQ@mail.gmail.com
[3]: /messages/by-id/CAA4eK1+kt-QRr1RP=D=4_tp+S+CErQ6rNe7KVYEyZ3f6PYXpvw@mail.gmail.com
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v45-0001-Add-inactive_timeout-based-replication-slot-inva.patch
From 60a5beb355ef2c9c2392e0229af0a24293882f50 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sun, 8 Sep 2024 11:33:05 +0000
Subject: [PATCH v45] Add inactive_timeout based replication slot invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky, because the amount of WAL a database
generates and the storage allocated for the instance vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named replication_slot_inactive_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are inactive for longer than this amount of
time.
Note that the inactive timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Reviewed-by: Ajin Cherian, Shveta Malik, Peter Smith
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CA%2BTgmoZTbaaEjSZUG1FL0mzxAdN3qmXksO3O9_PZhEuXTkVnRQ%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/202403260841.5jcv7ihniccy%40alvherre.pgsql
---
doc/src/sgml/config.sgml | 35 ++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 8 +-
src/backend/replication/slot.c | 171 ++++++++--
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 24 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 321 ++++++++++++++++++
13 files changed, 560 insertions(+), 30 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0aec11f443..27b2285da1 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4556,6 +4556,41 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is default) disables
+ the inactive timeout invalidation mechanism. This parameter can only
+ be set in the <filename>postgresql.conf</filename> file or on the
+ server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to inactivity timeout occurs during checkpoint.
+ The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 634a4c0fab..5633429eef 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2618,6 +2618,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for longer than the amount of time specified by the
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f9649eec1a..6cc7a739ec 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -448,7 +448,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -667,7 +667,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1543,9 +1543,7 @@ update_synced_slots_inactive_since(void)
if (now == 0)
now = GetCurrentTimestamp();
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, &now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 0a03776156..d92b92bfed 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -159,6 +161,14 @@ static XLogRecPtr ss_oldest_flush_lsn = InvalidXLogRecPtr;
static void ReplicationSlotShmemExit(int code, Datum arg);
static void ReplicationSlotDropPtr(ReplicationSlot *slot);
+static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
+ ReplicationSlot *s,
+ XLogRecPtr oldestLSN,
+ Oid dboid,
+ TransactionId snapshotConflictHorizon,
+ bool *invalidated);
+static inline bool SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s);
+
/* internal persistency functions */
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
@@ -535,9 +545,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot has been
+ * invalidated previously.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +628,21 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot has been
+ * invalidated previously.
+ */
+ if (error_if_invalid && s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ Assert(s->inactive_since > 0);
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".",
+ "replication_slot_inactive_timeout.")));
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -703,16 +731,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, &now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, &now, true);
MyReplicationSlot = NULL;
@@ -785,7 +809,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +836,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1501,7 +1525,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1531,6 +1556,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s for longer than the amount of time specified by \"%s\"."),
+ timestamptz_to_str(inactive_since),
+ "replication_slot_inactive_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1547,6 +1581,28 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Is this replication slot allowed for inactive timeout invalidation check?
+ *
+ * Check if inactive timeout invalidation mechanism is disabled or slot is
+ * currently being used or server is in recovery mode or slot on standby is
+ * currently being synced from the primary.
+ *
+ * Note that the inactive timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s)
+{
+ return (replication_slot_inactive_timeout > 0 &&
+ s->inactive_since > 0 &&
+ !RecoveryInProgress() &&
+ !s->data.synced);
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1574,6 +1630,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1581,6 +1638,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1591,6 +1649,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT &&
+ SlotInactiveTimeoutCheckAllowed(s))
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1644,6 +1712,28 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ if (!SlotInactiveTimeoutCheckAllowed(s))
+ break;
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+
+ /*
+ * Invalidation due to inactive timeout implies that
+ * no one is using the slot.
+ */
+ Assert(s->active_pid == 0);
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1669,11 +1759,13 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s &&
+ active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1728,7 +1820,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1774,7 +1867,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1797,6 +1891,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1849,7 +1944,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1907,6 +2003,38 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do inactive_timeout invalidation
+ * of thousands of replication slots here. If it is ever proven that
+ * this assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2201,6 +2329,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now = 0;
/* no need to lock here, no concurrent access allowed yet */
@@ -2388,12 +2517,16 @@ RestoreSlotFromDisk(const char *name)
slot->in_use = true;
slot->active_pid = 0;
+ /* Use the same inactive_since time for all the slots. */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ ReplicationSlotSetInactiveSince(slot, &now, false);
restored = true;
break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index c7bfbb15e0..b1b7b075bd 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -540,7 +540,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index c5f1009f37..61a0e38715 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -844,7 +844,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1462,7 +1462,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 686309db58..5e27cd3270 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time a replication slot can remain inactive before "
+ "it will be invalidated."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 667e0dc40a..deca3a4aeb 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -335,6 +335,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 45582cf9d8..27c4f107e5 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -224,6 +226,24 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated due
+ * to inactive timeout.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz *now,
+ bool acquire_lock)
+{
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ if (s->data.invalidated != RS_INVAL_INACTIVE_TIMEOUT)
+ s->inactive_since = *now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -233,6 +253,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -249,7 +270,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 712924c2fa..c45a5106f4 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_wal_replay_wait.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..669d6ccc7a
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,321 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start
+#
+# Invalidate streaming standby slot and logical failover slot on primary due to
+# inactive timeout. Also, check logical failover slot synced to standby from
+# primary doesn't invalidate on its own, but gets the invalidated state from the
+# primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+primary_conninfo = '$connstr dbname=postgres'
+));
+
+# Create sync slot on primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Sync primary slot to standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that logical failover slot is created on standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot sync_slot1 has synced as true on standby');
+
+my $logstart = -s $primary->logfile;
+my $inactive_timeout = 1;
+
+# Set timeout so that next checkpoint will invalidate inactive slot
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$primary->reload;
+
+# Check for logical failover slot to become inactive on primary. Note that
+# nobody has acquired slot yet, so it must get invalidated due to
+# inactive timeout.
+check_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $inactive_timeout);
+
+# Sync primary slot to standby. Note that primary slot has already been
+# invalidated due to inactive timeout. Standby must just sync inavalidated
+# state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for sync_slot1 invalidation to be synced on standby";
+
+# Make standby slot on primary inactive and check for invalidation
+$standby1->stop;
+check_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $inactive_timeout);
+
+# Testcase end
+# =============================================================================
+
+# =============================================================================
+# Testcase start
+# Synced slot mustn't get invalidated on standby on its own due to inactive
+# timeout.
+
+# Disable inactive timeout on primary
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$primary->reload;
+
+# Create standby
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+$connstr = $primary->connstr;
+$standby2->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot2'
+primary_conninfo = '$connstr dbname=postgres'
+));
+
+# Create sync slot on primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot2', 'test_decoding', false, false, true);}
+);
+
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot2', immediately_reserve := true);
+]);
+
+$standby2->start;
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+
+$standby2->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$standby2->reload;
+
+# Sync primary slot to standby
+$standby2->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that logical failover slot is created on standby
+is( $standby2->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot2' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot sync_slot2 has synced as true on standby');
+
+$logstart = -s $standby2->logfile;
+
+# Give enough time
+sleep($inactive_timeout+1);
+
+# Despite inactive timeout being set, synced slot won't get invalidated on its
+# own on standby. So, we must not see invalidation message in server log.
+$standby2->safe_psql('postgres', "CHECKPOINT");
+ok( !$standby2->log_contains(
+ "invalidating obsolete replication slot \"sync_slot2\"",
+ $logstart),
+ 'check that syned sync_slot2 has not been invalidated on the standby'
+);
+
+$standby2->stop;
+
+# Testcase end
+# =============================================================================
+
+# =============================================================================
+# Testcase start
+# Invalidate logical subscriber slot due to inactive timeout.
+
+my $publisher = $primary;
+
+# Prepare for test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+
+is($result, qq(5), "check initial copy was done");
+
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO ' ${inactive_timeout}s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Make subscriber slot on publisher inactive and check for invalidation
+$subscriber->stop;
+check_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end
+# =============================================================================
+
+# Check for slot to first become inactive and then get invalidated
+sub check_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $name = $node->name;
+
+ # Wait for slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND active = 'f' AND
+ inactive_since IS NOT NULL;
+ ])
+ or die
+ "Timed out while waiting for slot $slot to become inactive on node $name";
+
+ trigger_slot_invalidation($node, $slot, $offset, $inactive_timeout);
+
+ # Wait for invalidation reason to be set
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $name";
+
+ # Check that invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+
+ ok( $stderr =~
+ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $name";
+}
+
+# Trigger slot invalidation and confirm it in server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time to avoid multiple checkpoints
+ sleep($inactive_timeout+1);
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot\"",
+ $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot invalidation has been logged on node $name"
+ );
+}
+
+done_testing();
--
2.43.0
Hi,
Thanks for reviewing.
On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote:
1)
I see that ReplicationSlotAlter() will error out if the slot is
invalidated due to timeout. I have not tested it myself, but do you
know if slot-alter errors out for other invalidation causes as well?
Just wanted to confirm that the behaviour is consistent for all
invalidation causes.
Will respond to Amit's comment soon.
2)
When a slot is invalidated, and we try to use that slot, it gives this msg:

ERROR: can no longer get changes from replication slot "mysubnew1_2"
DETAIL: The slot became invalid because it was inactive since
2024-09-03 14:23:34.094067+05:30, which is more than 600 seconds ago.
HINT: You might need to increase "replication_slot_inactive_timeout.".

Isn't HINT misleading? Even if we increase it now, the slot can not be
reused again.

Below is one side effect if inactive_since keeps on changing:

postgres=# SELECT * FROM pg_replication_slot_advance('mysubnew1_1',
pg_current_wal_lsn());
ERROR: can no longer get changes from replication slot "mysubnew1_1"
DETAIL: The slot became invalid because it was inactive since
2024-09-04 10:03:56.68053+05:30, which is more than 10 seconds ago.
HINT: You might need to increase "replication_slot_inactive_timeout.".

postgres=# select now();
              now
---------------------------------
2024-09-04 10:04:00.26564+05:30

'DETAIL' gives wrong information, we are not past 10-seconds. This is
because inactive_since got updated even in ERROR scenario.

ERROR: can no longer get changes from replication slot "mysubnew1_1"
DETAIL: The slot became invalid because it was inactive since
2024-09-04 10:06:38.980939+05:30, which is more than 129600 seconds
ago.

postgres=# select now();
              now
----------------------------------
2024-09-04 10:07:35.201894+05:30

I feel we should change this message itself.
Removed the hint and corrected the detail message as follows:
errmsg("can no longer get changes from replication slot \"%s\"",
NameStr(s->data.name)),
errdetail("This slot has been invalidated because it was inactive for
longer than the amount of time specified by \"%s\".",
"replication_slot_inactive_timeout.")));
3)
When the slot is invalidated, the 'inactive_since' still keeps on
changing when there is a subscriber trying to start replication
continuously. I think ReplicationSlotAcquire() keeps on failing and
thus Release keeps on setting it again and again. Shouldn't we stop
setting/changing 'inactive_since' once the slot is invalidated
already, otherwise it will be misleading?

postgres=# select failover,synced,inactive_since,invalidation_reason
from pg_replication_slots;

 failover | synced |          inactive_since          | invalidation_reason
----------+--------+----------------------------------+---------------------
 t        | f      | 2024-09-03 14:23:..              | inactive_timeout

after sometime:

 failover | synced |          inactive_since          | invalidation_reason
----------+--------+----------------------------------+---------------------
 t        | f      | 2024-09-03 14:26:..              | inactive_timeout
Changed it to not update inactive_since for slots invalidated due to
inactive timeout.
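
For reference, this is essentially the guard v45 uses (taken from the
ReplicationSlotSetInactiveSince helper in the patch); once a slot has been
invalidated due to inactive timeout, inactive_since is left untouched:

    static inline void
    ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz *now,
                                    bool acquire_lock)
    {
        if (acquire_lock)
            SpinLockAcquire(&s->mutex);

        /* Freeze inactive_since once invalidated due to inactive timeout */
        if (s->data.invalidated != RS_INVAL_INACTIVE_TIMEOUT)
            s->inactive_since = *now;

        if (acquire_lock)
            SpinLockRelease(&s->mutex);
    }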
4)
src/sgml/config.sgml:

4a)
+ A value of zero (which is default) disables the timeout mechanism.

Better will be:
A value of zero (which is default) disables the inactive timeout
invalidation mechanism.
Changed.
4b)
'synced' and inactive_since should point to pg_replication_slots. Example:
<link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
Modified.
5)
src/sgml/system-views.sgml:
+ ..the slot has been inactive for longer than the duration specified
by replication_slot_inactive_timeout parameter.

Better to have:
..the slot has been inactive for a time longer than the duration
specified by the replication_slot_inactive_timeout parameter.
Changed it to the following to be consistent with the config.sgml.
<literal>inactive_timeout</literal> means that the slot has been
inactive for longer than the amount of time specified by the
<xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
Please find the v45 patch posted upthread at
/messages/by-id/CALj2ACWXQT3_HY40ceqKf1DadjLQP6b1r=0sZRh-xhAOd-b0pA@mail.gmail.com
for the changes.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Sep 5, 2024 at 9:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Sep 4, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote:
On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote:
1)
I see that ReplicationSlotAlter() will error out if the slot is
invalidated due to timeout. I have not tested it myself, but do you
know if slot-alter errors out for other invalidation causes as well?
Just wanted to confirm that the behaviour is consistent for all
invalidation causes.

I was able to test this and, as anticipated, the behavior is different. When
slot is invalidated due to say 'wal_removed', I am still able to do
'alter' of that slot.
Please see:

Pub:
  slot_name  | failover | synced |          inactive_since          | invalidation_reason
-------------+----------+--------+----------------------------------+---------------------
 mysubnew1_1 | t        | f      | 2024-09-04 08:58:12.802278+05:30 | wal_removed

Sub:
newdb1=# alter subscription mysubnew1_1 disable;
ALTER SUBSCRIPTION

newdb1=# alter subscription mysubnew1_1 set (failover=false);
ALTER SUBSCRIPTION

Pub: (failover altered)
  slot_name  | failover | synced |          inactive_since          | invalidation_reason
-------------+----------+--------+----------------------------------+---------------------
 mysubnew1_1 | f        | f      | 2024-09-04 08:58:47.824471+05:30 | wal_removed

while when invalidation_reason is 'inactive_timeout', it fails:

Pub:
  slot_name  | failover | synced |          inactive_since          | invalidation_reason
-------------+----------+--------+----------------------------------+---------------------
 mysubnew1_1 | t        | f      | 2024-09-03 14:30:57.532206+05:30 | inactive_timeout

Sub:
newdb1=# alter subscription mysubnew1_1 disable;
ALTER SUBSCRIPTION

newdb1=# alter subscription mysubnew1_1 set (failover=false);
ERROR: could not alter replication slot "mysubnew1_1": ERROR: can no
longer get changes from replication slot "mysubnew1_1"
DETAIL: The slot became invalid because it was inactive since
2024-09-04 08:54:20.308996+05:30, which is more than 0 seconds ago.
HINT: You might need to increase "replication_slot_inactive_timeout.".

I think the behavior should be same.
We should not allow the invalid replication slot to be altered
irrespective of the reason unless there is any benefit.
Okay, then I think we need to change the existing behaviour of the
other invalidation causes which still allow alter-slot.
thanks
Shveta
Hi,
On Mon, Sep 9, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote:
We should not allow the invalid replication slot to be altered
irrespective of the reason unless there is any benefit.

Okay, then I think we need to change the existing behaviour of the
other invalidation causes which still allow alter-slot.
+1. Perhaps, track it in a separate thread?
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon, Sep 9, 2024 at 10:26 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Hi,
On Mon, Sep 9, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote:
We should not allow the invalid replication slot to be altered
irrespective of the reason unless there is any benefit.

Okay, then I think we need to change the existing behaviour of the
other invalidation causes which still allow alter-slot.

+1. Perhaps, track it in a separate thread?
I think so. It does not come under the scope of this thread.
thanks
Shveta
On Sun, Sep 8, 2024 at 5:25 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Please find the v45 patch. Addressed above and Shveta's review comments [1].
Thanks for the patch. Please find my comments:
1)
src/sgml/config.sgml:
+ Synced slots are always considered to be inactive because they
don't perform logical decoding to produce changes.
It is better we avoid such a statement, as internally we use logical
decoding to advance restart-lsn, see
'LogicalSlotAdvanceAndCheckSnapState' called from slotsync.c.
<Also see related comment 6 below>
2)
src/sgml/config.sgml:
+ disables the inactive timeout invalidation mechanism
+ Slot invalidation due to inactivity timeout occurs during checkpoint.
Either have 'inactive' at both the places or 'inactivity'.
3)
slot.c:
+static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause
cause,
+ ReplicationSlot *s,
+ XLogRecPtr oldestLSN,
+ Oid dboid,
+ TransactionId snapshotConflictHorizon,
+ bool *invalidated);
+static inline bool SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s);
I think we do not need the above 2 declarations. The code compiles fine
without these as the usage is later than the definition.
4)
+ /*
+ * An error is raised if error_if_invalid is true and the slot has been
+ * invalidated previously.
+ */
+ if (error_if_invalid && s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
The comment is generic while the 'if condition' is specific to one
invalidation cause. Even though I feel it can be made generic test for
all invalidation causes but that is not under scope of this thread and
needs more testing/analysis. For the time being, we can make comment
specific to the concerned invalidation cause. The header of function
will also need the same change.
5)
SlotInactiveTimeoutCheckAllowed():
+ * Check if inactive timeout invalidation mechanism is disabled or slot is
+ * currently being used or server is in recovery mode or slot on standby is
+ * currently being synced from the primary.
+ *
These comments say the exact opposite of what we are checking in the code.
Since the function name has 'Allowed' in it, we should be putting
comments which say what allows it instead of what disallows it.
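
For instance, a header along these lines (just a sketch, restating the
conditions the v45 function actually tests) would say what allows the check:

    /*
     * Is this replication slot allowed for inactive timeout invalidation
     * check?
     *
     * The check is allowed only when the inactive timeout invalidation
     * mechanism is enabled (replication_slot_inactive_timeout > 0), the
     * slot is currently inactive (inactive_since is set), the server is
     * not in recovery, and the slot is not being synced from the primary.
     */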
6)
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s)
Perhaps we should avoid mentioning logical decoding here. When slots
are synced, they are performing decoding and their inactive_since is
changing continuously. A better way to make this statement will be:
We want to ensure that the slots being synchronized are not
invalidated, as they need to be preserved for future use when the
standby server is promoted to the primary. This is necessary for
resuming logical replication from the new primary server.
<Rephrase if needed>
7)
InvalidatePossiblyObsoleteSlot()
we are calling SlotInactiveTimeoutCheckAllowed() twice in this
function. We shall optimize.
At the first usage place, shall we simply get timestamp when cause is
RS_INVAL_INACTIVE_TIMEOUT without checking
SlotInactiveTimeoutCheckAllowed() as IMO it does not seem a
performance critical section. Or if we retain check at first place,
then at the second place we can avoid calling it again based on
whether 'now' is NULL or not.
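
Something like this (a rough, untested sketch against the v45 code) would
implement the second option, using 'now' being set as the signal that the
check was already allowed, so the helper isn't called a second time:

    case RS_INVAL_INACTIVE_TIMEOUT:
        /* 'now' was only fetched above when the check was allowed */
        if (now > 0 &&
            TimestampDifferenceExceeds(s->inactive_since, now,
                                       replication_slot_inactive_timeout * 1000))
        {
            invalidation_cause = cause;
            inactive_since = s->inactive_since;
        }
        break;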
thanks
Shveta
Hi, here are some review comments for v45-0001 (excluding the test code)
======
doc/src/sgml/config.sgml
1.
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from primary server (i.e., standby slots having
nit - /from primary server/from the primary server/
======
src/backend/replication/slot.c
2. ReplicationSlotAcquire
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive
for longer than the amount of time specified by \"%s\".",
+ "replication_slot_inactive_timeout.")));
nit - "replication_slot_inactive_timeout." - should be no period
inside that GUC name literal
~~~
3. ReportSlotInvalidation
I didn't understand why there was a hint for:
"You might need to increase \"%s\".", "max_slot_wal_keep_size"
But you don't have an equivalent hint for timeout invalidation:
"You might need to increase \"%s\".", "replication_slot_inactive_timeout"
Why aren't these similar cases consistent?
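
If a hint is wanted for the timeout case at all, one way (only a sketch; it
would also mean parameterizing the GUC name in the existing errhint instead
of hard-coding "max_slot_wal_keep_size", and the 'hint_guc' variable below
is hypothetical) could look like:

    case RS_INVAL_INACTIVE_TIMEOUT:
        Assert(inactive_since > 0);
        appendStringInfo(&err_detail,
                         _("The slot has been inactive since %s for longer than the amount of time specified by \"%s\"."),
                         timestamptz_to_str(inactive_since),
                         "replication_slot_inactive_timeout");
        hint = true;
        hint_guc = "replication_slot_inactive_timeout";    /* hypothetical */
        break;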
~~~
4. RestoreSlotFromDisk
+ /* Use the same inactive_since time for all the slots. */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+
Is the deferred assignment really necessary? Why not just
unconditionally assign the 'now' just before the for-loop? Or even at
the declaration? e.g. The 'replication_slot_inactive_timeout' is
measured in seconds so I don't think 'inactive_since' being wrong by a
millisecond here will make any difference.
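
In other words, something like this (sketch only) instead of the deferred
assignment:

    TimestampTz now = GetCurrentTimestamp();    /* taken once, at declaration */

    /* ... restore logic unchanged ... */

    ReplicationSlotSetInactiveSince(slot, &now, false);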
======
src/include/replication/slot.h
5. ReplicationSlotSetInactiveSince
+/*
+ * Set slot's inactive_since property unless it was previously invalidated due
+ * to inactive timeout.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz *now,
+ bool acquire_lock)
+{
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ if (s->data.invalidated != RS_INVAL_INACTIVE_TIMEOUT)
+ s->inactive_since = *now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
Is the logic correct? What if the slot was already invalid due to some
reason other than RS_INVAL_INACTIVE_TIMEOUT? Is an Assert needed?
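
One possible tightening (a sketch only; whether it is the right behaviour is
exactly the question above) would be to freeze inactive_since for any
invalidated slot, not just the timeout case:

    if (s->data.invalidated == RS_INVAL_NONE)
        s->inactive_since = *now;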
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
PS_NITPICKS_20240909_TIMEOUT_V450001.txt
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 27b2285..97b4fb5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4582,7 +4582,7 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
<para>
Note that the inactive timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
- from primary server (i.e., standby slots having
+ from the primary server (i.e., standby slots having
<link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
value <literal>true</literal>).
Synced slots are always considered to be inactive because they don't
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d92b92b..8cc67b4 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -640,7 +640,7 @@ retry:
errmsg("can no longer get changes from replication slot \"%s\"",
NameStr(s->data.name)),
errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".",
- "replication_slot_inactive_timeout.")));
+ "replication_slot_inactive_timeout")));
}
/*
On Mon, Sep 9, 2024 at 10:28 AM shveta malik <shveta.malik@gmail.com> wrote:
On Mon, Sep 9, 2024 at 10:26 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Hi,
On Mon, Sep 9, 2024 at 9:17 AM shveta malik <shveta.malik@gmail.com> wrote:
We should not allow the invalid replication slot to be altered
irrespective of the reason unless there is any benefit.

Okay, then I think we need to change the existing behaviour of the
other invalidation causes which still allow alter-slot.

+1. Perhaps, track it in a separate thread?
I think so. It does not come under the scope of this thread.
It makes sense to me as well. But let's go ahead and get that sorted out first.
--
With Regards,
Amit Kapila.
Hi,
On Mon, Sep 9, 2024 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
We should not allow the invalid replication slot to be altered
irrespective of the reason unless there is any benefit.

Okay, then I think we need to change the existing behaviour of the
other invalidation causes which still allow alter-slot.

+1. Perhaps, track it in a separate thread?
I think so. It does not come under the scope of this thread.
It makes sense to me as well. But let's go ahead and get that sorted out first.
Moved the discussion to new thread -
/messages/by-id/CALj2ACW4fSOMiKjQ3=2NVBMTZRTG8Ujg6jsK9z3EvOtvA4vzKQ@mail.gmail.com.
Please have a look.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi, here is the remainder of my v45-0001 review. These comments are
for the test code only.
======
Testcase #1
1.
+# Testcase start
+#
+# Invalidate streaming standby slot and logical failover slot on primary due to
+# inactive timeout. Also, check logical failover slot synced to standby from
+# primary doesn't invalidate on its own, but gets the invalidated state from the
+# primary.
nit - s/primary/the primary/ (in a couple of places)
nit - s/standby/the standby/
nit - other trivial tweaks.
~~~
2.
+# Create sync slot on primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1',
'test_decoding', false, false, true);}
+);
nit - s/primary/the primary/
~~~
3.
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name :=
'sb_slot1', immediately_reserve := true);
+]);
Should this have a comment?
~~~
4.
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby1);
nit - s/standby/the standby/
~~~
5.
+# Sync primary slot to standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
nit - /Sync primary slot to standby/Sync the primary slots to the standby/
~~~
6.
+# Confirm that logical failover slot is created on standby
nit - s/Confirm that logical failover slot is created on
standby/Confirm that the logical failover slot is created on the
standby/
~~~
7.
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced AND NOT temporary;}
+ ),
+ "t",
+ 'logical slot sync_slot1 has synced as true on standby');
IMO here you should also be checking that the sync slot state is NOT
invalidated, just as a counterpoint for the test part later that
checks that it IS invalidated.
~~~
8.
+my $inactive_timeout = 1;
+
+# Set timeout so that next checkpoint will invalidate inactive slot
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO
'${inactive_timeout}s';
+]);
+$primary->reload;
8a.
nit - I think that $inactive_timeout assignment belongs below your comment.
~
8b.
nit - s/Set timeout so that next checkpoint will invalidate inactive
slot/Set timeout GUC so that the next checkpoint will invalidate
inactive slots/
~~~
9.
+# Check for logical failover slot to become inactive on primary. Note that
+# nobody has acquired slot yet, so it must get invalidated due to
+# inactive timeout.
nit - /Check for logical failover slot to become inactive on
primary./Wait for logical failover slot to become inactive on the
primary./
nit - /has acquired slot/has acquired the slot/
~~~
10.
+# Sync primary slot to standby. Note that primary slot has already been
+# invalidated due to inactive timeout. Standby must just sync inavalidated
+# state.
nit - minor, add "the". fix typo "inavalidated", etc. suggestion:
Re-sync the primary slots to the standby. Note that the primary slot was already
invalidated (above) due to inactive timeout. The standby must just
sync the invalidated
state.
~~~
11.
+# Make standby slot on primary inactive and check for invalidation
+$standby1->stop;
nit - /standby slot/the standby slot/
nit - /on primary/on the primary/
======
Testcase #2
12.
I'm not sure it is necessary to do all this extra work. IIUC, there
was already almost everything you needed in the previous Testcase #1.
So, I thought you could just combine this extra standby timeout test
in Testcase #1.
Indeed, your Testcase #1 comment still says it is doing this: ("Also,
check logical failover slot synced to standby from primary doesn't
invalidate on its own,...")
e.g.
- NEW: set the GUC timeout on the standby
- sync the sync_slot (already doing in test #1)
- ensure the synced slot is NOT invalid (already suggested above for test #1)
- NEW: then do a standby sleep > timeout duration
- NEW: then do a standby CHECKPOINT...
- NEW: then ensure the sync slot invalidation did NOT happen
- then proceed with the rest of test #1...
======
Testcase #3
13.
nit - remove a few blank lines to group associated statements together.
~~~
14.
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '
${inactive_timeout}s';
+]);
+$publisher->reload;
nit - this deserves a comment, the same as in Testcase #1
======
sub wait_for_slot_invalidation
15.
+# Check for slot to first become inactive and then get invalidated
+sub check_for_slot_invalidation
nit - IMO the previous name was better (e.g. "wait_for.." instead of
"check_for...") because that describes exactly what the subroutine is
doing.
suggestion:
# Wait for the slot to first become inactive and then get invalidated
sub wait_for_slot_invalidation
~~~
16.
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $name = $node->name;
The variable $name seems too vague. How about $node_name?
~~~
17.
+ # Wait for invalidation reason to be set
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to
be set on node $name";
17a.
nit - /# Wait for invalidation reason to be set/# Check that the
invalidation reason is 'inactive_timeout'/
IIUC, the 'trigger_slot_invalidation' function has already invalidated
the slot at this point, so we are not really "Waiting..."; we are
"Checking..." that the reason was correctly set.
~
17b.
I think this code fragment would perhaps be better placed inside the
'trigger_slot_invalidation' function. (I've done this in the nitpicks
attachment.)
~~~
18.
+ # Check that invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
18a.
s/Check that invalidated slot/Check that an invalidated slot/
~
18b.
nit - Remove some blank lines, because the comment applies to all below it.
======
sub trigger_slot_invalidation
19.
+# Trigger slot invalidation and confirm it in server log
+sub trigger_slot_invalidation
nit - s/confirm it in server log/confirm it in the server log/
~
20.
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $name = $node->name;
+ my $invalidated = 0;
(same as the other subroutine)
nit - The variable $name seems too vague. How about $node_name?
======
Please refer to the attached nitpicks top-up patch which implements
most of the above nits.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
PS_NITPICKS_20240910_SLOT_V450001_TESTS.txttext/plain; charset=US-ASCII; name=PS_NITPICKS_20240910_SLOT_V450001_TESTS.txtDownload
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index 669d6cc..34b46d5 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -12,10 +12,10 @@ use Time::HiRes qw(usleep);
# =============================================================================
# Testcase start
#
-# Invalidate streaming standby slot and logical failover slot on primary due to
-# inactive timeout. Also, check logical failover slot synced to standby from
-# primary doesn't invalidate on its own, but gets the invalidated state from the
-# primary.
+# Invalidate a streaming standby slot and logical failover slot on the primary
+# due to inactive timeout. Also, check that a logical failover slot synced to
+# the standby from the primary doesn't invalidate on its own, but gets the
+# invalidated state from the primary.
# Initialize primary
my $primary = PostgreSQL::Test::Cluster->new('primary');
@@ -45,7 +45,7 @@ primary_slot_name = 'sb_slot1'
primary_conninfo = '$connstr dbname=postgres'
));
-# Create sync slot on primary
+# Create sync slot on the primary
$primary->psql('postgres',
q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
);
@@ -57,13 +57,13 @@ $primary->safe_psql(
$standby1->start;
-# Wait until standby has replayed enough data
+# Wait until the standby has replayed enough data
$primary->wait_for_catchup($standby1);
-# Sync primary slot to standby
+# Sync the primary slots to the standby
$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
-# Confirm that logical failover slot is created on standby
+# Confirm that the logical failover slot is created on the standby
is( $standby1->safe_psql(
'postgres',
q{SELECT count(*) = 1 FROM pg_replication_slots
@@ -73,24 +73,24 @@ is( $standby1->safe_psql(
'logical slot sync_slot1 has synced as true on standby');
my $logstart = -s $primary->logfile;
-my $inactive_timeout = 1;
-# Set timeout so that next checkpoint will invalidate inactive slot
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+my $inactive_timeout = 1;
$primary->safe_psql(
'postgres', qq[
ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
]);
$primary->reload;
-# Check for logical failover slot to become inactive on primary. Note that
+# Wait for logical failover slot to become inactive on the primary. Note that
# nobody has acquired slot yet, so it must get invalidated due to
# inactive timeout.
-check_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
$inactive_timeout);
-# Sync primary slot to standby. Note that primary slot has already been
-# invalidated due to inactive timeout. Standby must just sync inavalidated
-# state.
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to inactive timeout. The standby must just
+# sync the invalidated state.
$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
$standby1->poll_query_until(
'postgres', qq[
@@ -101,9 +101,9 @@ $standby1->poll_query_until(
or die
"Timed out while waiting for sync_slot1 invalidation to be synced on standby";
-# Make standby slot on primary inactive and check for invalidation
+# Make the standby slot on the primary inactive and check for invalidation
$standby1->stop;
-check_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
$inactive_timeout);
# Testcase end
@@ -223,14 +223,12 @@ $publisher->safe_psql(
$subscriber->safe_psql('postgres',
"CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
);
-
$subscriber->wait_for_subscription_sync($publisher, 'sub');
-
my $result =
$subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
-
is($result, qq(5), "check initial copy was done");
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
$publisher->safe_psql(
'postgres', qq[
ALTER SYSTEM SET replication_slot_inactive_timeout TO ' ${inactive_timeout}s';
@@ -241,17 +239,17 @@ $logstart = -s $publisher->logfile;
# Make subscriber slot on publisher inactive and check for invalidation
$subscriber->stop;
-check_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
$inactive_timeout);
# Testcase end
# =============================================================================
-# Check for slot to first become inactive and then get invalidated
-sub check_for_slot_invalidation
+# Wait for slot to first become inactive and then get invalidated
+sub wait_for_slot_invalidation
{
my ($node, $slot, $offset, $inactive_timeout) = @_;
- my $name = $node->name;
+ my $node_name = $node->name;
# Wait for slot to become inactive
$node->poll_query_until(
@@ -261,45 +259,33 @@ sub check_for_slot_invalidation
inactive_since IS NOT NULL;
])
or die
- "Timed out while waiting for slot $slot to become inactive on node $name";
+ "Timed out while waiting for slot $slot to become inactive on node $node_name";
trigger_slot_invalidation($node, $slot, $offset, $inactive_timeout);
- # Wait for invalidation reason to be set
- $node->poll_query_until(
- 'postgres', qq[
- SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
- WHERE slot_name = '$slot' AND
- invalidation_reason = 'inactive_timeout';
- ])
- or die
- "Timed out while waiting for invalidation reason of slot $slot to be set on node $name";
-
- # Check that invalidated slot cannot be acquired
+ # Check that an invalidated slot cannot be acquired
my ($result, $stdout, $stderr);
-
($result, $stdout, $stderr) = $node->psql(
'postgres', qq[
SELECT pg_replication_slot_advance('$slot', '0/1');
]);
-
ok( $stderr =~
/can no longer get changes from replication slot "$slot"/,
- "detected error upon trying to acquire invalidated slot $slot on node $name"
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
)
or die
- "could not detect error upon trying to acquire invalidated slot $slot on node $name";
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
}
-# Trigger slot invalidation and confirm it in server log
+# Trigger slot invalidation and confirm it in the server log
sub trigger_slot_invalidation
{
my ($node, $slot, $offset, $inactive_timeout) = @_;
- my $name = $node->name;
+ my $node_name = $node->name;
my $invalidated = 0;
# Give enough time to avoid multiple checkpoints
- sleep($inactive_timeout+1);
+ sleep($inactive_timeout + 1);
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
@@ -314,8 +300,18 @@ sub trigger_slot_invalidation
usleep(100_000);
}
ok($invalidated,
- "check that slot $slot invalidation has been logged on node $name"
+ "check that slot $slot invalidation has been logged on node $node_name"
);
+
+ # Check that the invalidation reason is 'inactive_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
}
done_testing();
On Tue, Sep 10, 2024 at 12:13 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Mon, Sep 9, 2024 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
We should not allow the invalid replication slot to be altered
irrespective of the reason unless there is any benefit.

Okay, then I think we need to change the existing behaviour of the
other invalidation causes which still allow alter-slot.

+1. Perhaps, track it in a separate thread?
I think so. It does not come under the scope of this thread.
It makes sense to me as well. But let's go ahead and get that sorted out first.
Moved the discussion to new thread -
/messages/by-id/CALj2ACW4fSOMiKjQ3=2NVBMTZRTG8Ujg6jsK9z3EvOtvA4vzKQ@mail.gmail.com.
Please have a look.
That is pushed now. Please send the rebased patch after addressing the
pending comments.
--
With Regards,
Amit Kapila.
Hi,
Thanks for reviewing.
On Mon, Sep 9, 2024 at 10:54 AM shveta malik <shveta.malik@gmail.com> wrote:
2)
src/sgml/config.sgml:
+ disables the inactive timeout invalidation mechanism
+ Slot invalidation due to inactivity timeout occurs during checkpoint.

Either have 'inactive' at both the places or 'inactivity'.
Used "inactive timeout".
3) slot.c:
+static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
+ ReplicationSlot *s,
+ XLogRecPtr oldestLSN,
+ Oid dboid,
+ TransactionId snapshotConflictHorizon,
+ bool *invalidated);
+static inline bool SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s);

I think we do not need the above 2 declarations. The code compiles fine
without these as the usage is later than the definition.
Hm, it's a usual practice that I follow irrespective of the placement
of function declarations. Since it was brought up, I removed the
declarations.
4)
+ /*
+  * An error is raised if error_if_invalid is true and the slot has been
+  * invalidated previously.
+  */
+ if (error_if_invalid && s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)

The comment is generic while the 'if condition' is specific to one
invalidation cause. Even though I feel it can be made a generic test for
all invalidation causes, that is not under the scope of this thread and
needs more testing/analysis.
Right.
For the time being, we can make the comment
specific to the concerned invalidation cause. The function header
will also need the same change.
Adjusted the comment, but left the variable name error_if_invalid as
is. Didn't want to make it long, one can look at the code to
understand what it is used for.
5)
SlotInactiveTimeoutCheckAllowed():
+ * Check if inactive timeout invalidation mechanism is disabled or slot is
+ * currently being used or server is in recovery mode or slot on standby is
+ * currently being synced from the primary.
+ *

These comments say the exact opposite of what we are checking in the code.
Since the function name has 'Allowed' in it, we should be putting
comments which say what allows it instead of what disallows it.
Modified.
1)
src/sgml/config.sgml:
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.

It is better we avoid such a statement, as internally we use logical
decoding to advance restart_lsn; see
'LogicalSlotAdvanceAndCheckSnapState' called from slotsync.c.
<Also see related comment 6 below>

6)
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s)

Perhaps we should avoid mentioning logical decoding here. When slots
are synced, they are performing decoding and their inactive_since is
changing continuously. A better way to make this statement would be:

We want to ensure that the slots being synchronized are not
invalidated, as they need to be preserved for future use when the
standby server is promoted to the primary. This is necessary for
resuming logical replication from the new primary server.
<Rephrase if needed>
They are performing logical decoding, but not producing the changes
for the clients to consume. So, IMO, the accompanying "to produce
changes" next to the "logical decoding" is good here.
7)
InvalidatePossiblyObsoleteSlot()
We are calling SlotInactiveTimeoutCheckAllowed() twice in this
function. We shall optimize.

At the first usage place, shall we simply get the timestamp when the cause is
RS_INVAL_INACTIVE_TIMEOUT without checking
SlotInactiveTimeoutCheckAllowed(), as IMO it does not seem a
performance-critical section? Or, if we retain the check at the first place,
then at the second place we can avoid calling it again based on
whether 'now' is NULL or not.
Getting the current timestamp can be costly on platforms that use
various clock sources, so assigning 'now' unconditionally isn't the
way to go, IMO. Using the inline function in two places improves
readability. We can optimize it if there's any performance impact of
calling the inline function in two places.
Will post the new patch version soon.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
Thanks for reviewing.
On Mon, Sep 9, 2024 at 1:11 PM Peter Smith <smithpb2250@gmail.com> wrote:
1.
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from primary server (i.e., standby slots having

nit - /from primary server/from the primary server/
+1
2. ReplicationSlotAcquire
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".",
+ "replication_slot_inactive_timeout.")));

nit - "replication_slot_inactive_timeout." - should be no period
inside that GUC name literal
Typo. Fixed.
3. ReportSlotInvalidation
I didn't understand why there was a hint for:
"You might need to increase \"%s\".", "max_slot_wal_keep_size"

Why aren't these similar cases consistent?
It looks misleading and not very useful. What happens if the removed
WAL (that's needed for the slot) is put back into pg_wal somehow (by
manually copying from archive or by some tool/script)? Can the slot
invalidated due to wal_removed start sending WAL to its clients?
But you don't have an equivalent hint for timeout invalidation:
"You might need to increase \"%s\".", "replication_slot_inactive_timeout"
I removed this per review comments upthread.
4. RestoreSlotFromDisk
+ /* Use the same inactive_since time for all the slots. */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+

Is the deferred assignment really necessary? Why not just
unconditionally assign 'now' just before the for-loop? Or even at
the declaration? e.g. The 'replication_slot_inactive_timeout' is
measured in seconds, so I don't think 'inactive_since' being wrong by a
millisecond here will make any difference.
Moved it before the for-loop.
5. ReplicationSlotSetInactiveSince
+/*
+ * Set slot's inactive_since property unless it was previously invalidated due
+ * to inactive timeout.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz *now,
+                                bool acquire_lock)
+{
+    if (acquire_lock)
+        SpinLockAcquire(&s->mutex);
+
+    if (s->data.invalidated != RS_INVAL_INACTIVE_TIMEOUT)
+        s->inactive_since = *now;
+
+    if (acquire_lock)
+        SpinLockRelease(&s->mutex);
+}

Is the logic correct? What if the slot was already invalid due to some
reason other than RS_INVAL_INACTIVE_TIMEOUT? Is an Assert needed?
Hm. Since invalidated slots can't be acquired and made active, not
modifying inactive_since irrespective of invalidation reason looks
good to me.
Please find the attached v46 patch having changes for the above review
comments and your test review comments and Shveta's review comments.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v46-0001-Introduce-inactive_timeout-based-replication-slo.patchapplication/octet-stream; name=v46-0001-Introduce-inactive_timeout-based-replication-slo.patchDownload
From 360a41be6ba3b8e6ba17b26d7be7b7a63c56b485 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 16 Sep 2024 09:34:33 +0000
Subject: [PATCH v46] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky, because the amount of WAL a database
generates and the storage allocated for an instance vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named replication_slot_inactive_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are inactive for longer than this amount of
time.
Note that the inactive timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Reviewed-by: Ajin Cherian, Shveta Malik, Peter Smith
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CA%2BTgmoZTbaaEjSZUG1FL0mzxAdN3qmXksO3O9_PZhEuXTkVnRQ%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/202403260841.5jcv7ihniccy%40alvherre.pgsql
---
doc/src/sgml/config.sgml | 35 +++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 17 +-
src/backend/replication/slot.c | 166 +++++++++--
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 25 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 266 ++++++++++++++++++
13 files changed, 505 insertions(+), 35 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0aec11f443..bce7ecdc86 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4556,6 +4556,41 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is default) disables
+ the inactive timeout invalidation mechanism. This parameter can only
+ be set in the <filename>postgresql.conf</filename> file or on the
+ server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to inactive timeout occurs during checkpoint.
+ The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 634a4c0fab..5633429eef 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2618,6 +2618,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for longer than the amount of time specified by the
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f9649eec1a..d5ea75065b 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -448,7 +448,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -667,7 +667,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1510,7 +1510,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1525,6 +1525,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1539,13 +1542,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, &now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 6828100cf1..851120e6d2 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -535,9 +537,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot has been
+ * invalidated previously.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +620,22 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot has been
+ * previously invalidated due to inactive timeout.
+ */
+ if (error_if_invalid &&
+ s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ Assert(s->inactive_since > 0);
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".",
+ "replication_slot_inactive_timeout")));
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -703,16 +724,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, &now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, &now, true);
MyReplicationSlot = NULL;
@@ -785,7 +802,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +829,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1508,7 +1525,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1538,6 +1556,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s for longer than the amount of time specified by \"%s\"."),
+ timestamptz_to_str(inactive_since),
+ "replication_slot_inactive_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1554,6 +1581,31 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Is this replication slot allowed for inactive timeout invalidation check?
+ *
+ * Inactive timeout invalidation is allowed only when:
+ *
+ * 1. Inactive timeout is set
+ * 2. Slot is inactive
+ * 3. Server is not in recovery
+ * 4. Slot is not being synced from the primary
+ *
+ * Note that the inactive timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s)
+{
+ return (replication_slot_inactive_timeout > 0 &&
+ s->inactive_since > 0 &&
+ !RecoveryInProgress() &&
+ !s->data.synced);
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1581,6 +1633,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1588,6 +1641,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1598,6 +1652,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT &&
+ SlotInactiveTimeoutCheckAllowed(s))
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1651,6 +1715,28 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ if (!SlotInactiveTimeoutCheckAllowed(s))
+ break;
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+
+ /*
+ * Invalidation due to inactive timeout implies that
+ * no one is using the slot.
+ */
+ Assert(s->active_pid == 0);
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1676,11 +1762,13 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s &&
+ active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1735,7 +1823,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1781,7 +1870,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1804,6 +1894,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1856,7 +1947,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1914,6 +2006,38 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do inactive_timeout invalidation
+ * of thousands of replication slots here. If it is ever proven that
+ * this assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2208,6 +2332,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2493,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2528,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ ReplicationSlotSetInactiveSince(slot, &now, false);
restored = true;
break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index c7bfbb15e0..b1b7b075bd 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -540,7 +540,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index c5f1009f37..61a0e38715 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -844,7 +844,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1462,7 +1462,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index c54b08fe18..82956d58d3 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -299,7 +299,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 686309db58..5e27cd3270 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time a replication slot can remain inactive before "
+ "it will be invalidated."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 667e0dc40a..deca3a4aeb 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -335,6 +335,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 45582cf9d8..8ffd06dd9f 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -224,6 +226,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz *now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = *now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -233,6 +254,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -249,7 +271,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 712924c2fa..c45a5106f4 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_wal_replay_wait.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..c53b5b3dbf
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,266 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to inactive timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+primary_conninfo = '$connstr dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+my $logstart = -s $standby1->logfile;
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $inactive_timeout = 1;
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$standby1->reload;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time
+sleep($inactive_timeout+1);
+
+# Despite inactive timeout being set, the synced slot won't get invalidated on
+# its own on the standby. So, we must not see invalidation message in server
+# log.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"sync_slot1\"",
+ $logstart),
+ 'check that synced slot sync_slot1 has not been invalidated on standby'
+);
+
+$logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# inactive timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $inactive_timeout);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to inactive timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for sync_slot1 invalidation to be synced on standby";
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $inactive_timeout);
+
+# Testcase end
+# =============================================================================
+
+# =============================================================================
+# Testcase start
+# Invalidate logical subscriber slot due to inactive timeout.
+
+my $publisher = $primary;
+
+# Prepare for test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+is($result, qq(5), "check initial copy was done");
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO ' ${inactive_timeout}s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Make subscriber slot on publisher inactive and check for invalidation
+$subscriber->stop;
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become inactive and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $node_name = $node->name;
+
+ # Wait for slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND active = 'f' AND
+ inactive_since IS NOT NULL;
+ ])
+ or die
+ "Timed out while waiting for slot $slot to become inactive on node $node_name";
+
+ trigger_slot_invalidation($node, $slot, $offset, $inactive_timeout);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~
+ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time to avoid multiple checkpoints
+ sleep($inactive_timeout + 1);
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot\"",
+ $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot invalidation has been logged on node $node_name"
+ );
+
+ # Check that the invalidation reason is 'inactive_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.43.0
On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Please find the attached v46 patch having changes for the above review
comments and your test review comments and Shveta's review comments.
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +620,22 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot has been
+ * previously invalidated due to inactive timeout.
+ */
+ if (error_if_invalid &&
+ s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ Assert(s->inactive_since > 0);
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive
for longer than the amount of time specified by \"%s\".",
+ "replication_slot_inactive_timeout")));
+ }
Why raise the ERROR just for timeout invalidation here and why not if
the slot is invalidated for other reasons? This raises the question of
what happens before this patch if the invalid slot is used from places
where we call ReplicationSlotAcquire(). I did a brief code analysis
and found that for StartLogicalReplication(), even if the error won't
occur in ReplicationSlotAcquire(), it would have been caught in
CreateDecodingContext(). I think that is where we should also add this
new error. Similarly, pg_logical_slot_get_changes_guts() and other
logical replication functions should be calling
CreateDecodingContext() which can raise the new ERROR. I am not sure
about how the invalid slots are handled during physical replication,
please check the behavior of that before this patch.
--
With Regards,
Amit Kapila.
Hi,
Thanks for looking into this.
On Mon, Sep 16, 2024 at 4:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Why raise the ERROR just for timeout invalidation here and why not if
the slot is invalidated for other reasons? This raises the question of
what happens before this patch if the invalid slot is used from places
where we call ReplicationSlotAcquire(). I did a brief code analysis
and found that for StartLogicalReplication(), even if the error won't
occur in ReplicationSlotAcquire(), it would have been caught in
CreateDecodingContext(). I think that is where we should also add this
new error. Similarly, pg_logical_slot_get_changes_guts() and other
logical replication functions should be calling
CreateDecodingContext() which can raise the new ERROR. I am not sure
about how the invalid slots are handled during physical replication,
please check the behavior of that before this patch.
When physical slots are invalidated due to wal_removed reason, the failure
happens at a much later point for the streaming standbys while reading the
requested WAL files like the following:
2024-09-16 16:29:52.416 UTC [876059] FATAL: could not receive data from
WAL stream: ERROR: requested WAL segment 000000010000000000000005 has
already been removed
2024-09-16 16:29:52.416 UTC [872418] LOG: waiting for WAL to become
available at 0/5002000
At this point, despite the slot being invalidated, its wal_status can still
come back to 'unreserved' even from 'lost', and the standby can catch up if
removed WAL files are copied either manually or by a tool/script to the
primary's pg_wal directory. IOW, the physical slots invalidated due to
wal_removed are *somehow* recoverable, unlike the logical slots.
IIUC, the invalidation of a slot implies that it is not guaranteed to hold
any resources like WAL and XMINs. Does it also imply that the slot must be
unusable?
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Here are a few comments for the patch v46-0001.
======
src/backend/replication/slot.c
1. ReportSlotInvalidation
On Mon, Sep 16, 2024 at 8:01 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Mon, Sep 9, 2024 at 1:11 PM Peter Smith <smithpb2250@gmail.com> wrote:
3. ReportSlotInvalidation
I didn't understand why there was a hint for:
"You might need to increase \"%s\".", "max_slot_wal_keep_size"

Why aren't these similar cases consistent?
It looks misleading and not very useful. What happens if the removed
WAL (that's needed for the slot) is put back into pg_wal somehow (by
manually copying from archive or by some tool/script)? Can the slot
invalidated due to wal_removed start sending WAL to its clients?But you don't have an equivalent hint for timeout invalidation:
"You might need to increase \"%s\".", "replication_slot_inactive_timeout"I removed this per review comments upthread.
IIUC the errors are quite similar, so my previous review comment was
mostly about the unexpected inconsistency of why one of them has a
hint and the other one does not. I don't have a strong opinion about
whether they should both *have* or *not have* hints, so long as they
are treated the same.
If you think the current code hint is not useful then maybe we need a
new thread to address that existing issue. For example, maybe it
should be removed or reworded.
~~~
2. InvalidatePossiblyObsoleteSlot:
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ if (!SlotInactiveTimeoutCheckAllowed(s))
+ break;
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
nit - it might be tidier to avoid multiple breaks by just combining
these conditions. See the nitpick attachment.
~~~
3.
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
nit - use comment wording "inactive slot timeout has occurred", to
make it identical to the comment in slot.h
======
src/test/recovery/t/050_invalidate_slots.pl
4.
+# Despite inactive timeout being set, the synced slot won't get invalidated on
+# its own on the standby. So, we must not see invalidation message in server
+# log.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"sync_slot1\"",
+ $logstart),
+ 'check that synced slot sync_slot1 has not been invalidated on standby'
+);
+
It seems kind of brittle to check the logs for something that is NOT
there because any change to the message will make this accidentally
pass. Apart from that, it might anyway be more efficient just to check
pg_replication_slots again to make sure the 'invalidation_reason' is
still NULL, e.g. as sketched below.
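For example, something like this (a rough, untested sketch reusing the
$standby1 node and slot name from the test):

is( $standby1->safe_psql(
        'postgres',
        q{SELECT invalidation_reason IS NULL FROM pg_replication_slots
          WHERE slot_name = 'sync_slot1';}),
    't',
    'check that synced slot sync_slot1 has not been invalidated on standby');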
======
Please see the attachment which implements some of the nit changes
mentioned above.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
PS_NITPICKS_20240917_SLOT_TIMEOUT_v46.txttext/plain; charset=US-ASCII; name=PS_NITPICKS_20240917_SLOT_TIMEOUT_v46.txtDownload
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 851120e..0076e4b 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -1716,15 +1716,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
invalidation_cause = cause;
break;
case RS_INVAL_INACTIVE_TIMEOUT:
-
- if (!SlotInactiveTimeoutCheckAllowed(s))
- break;
-
/*
* Check if the slot needs to be invalidated due to
* replication_slot_inactive_timeout GUC.
*/
- if (TimestampDifferenceExceeds(s->inactive_since, now,
+ if (SlotInactiveTimeoutCheckAllowed(s) &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
replication_slot_inactive_timeout * 1000))
{
invalidation_cause = cause;
@@ -1894,7 +1891,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
- * - RS_INVAL_INACTIVE_TIMEOUT: inactive timeout occurs
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
index c53b5b3..e2fdd52 100644
--- a/src/test/recovery/t/050_invalidate_slots.pl
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -87,7 +87,7 @@ is( $standby1->safe_psql(
'logical slot sync_slot1 is synced to standby');
# Give enough time
-sleep($inactive_timeout+1);
+sleep($inactive_timeout + 1);
# Despite inactive timeout being set, the synced slot won't get invalidated on
# its own on the standby. So, we must not see invalidation message in server
On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Hi,
Please find the attached v46 patch having changes for the above review
comments and your test review comments and Shveta's review comments.
Thanks for addressing comments.
Is there a reason that we don't support this invalidation on hot
standby for non-synced slots? Shouldn't we support this time-based
invalidation there too just like other invalidations?
thanks
Shveta
On Wed, Sep 18, 2024 at 12:21 PM shveta malik <shveta.malik@gmail.com> wrote:
On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Hi,
Please find the attached v46 patch having changes for the above review
comments and your test review comments and Shveta's review comments.
Thanks for addressing comments.
Is there a reason that we don't support this invalidation on hot
standby for non-synced slots? Shouldn't we support this time-based
invalidation there too just like other invalidations?
Now since we are not changing inactive_since once it is invalidated,
we are not even initializing it during restart; and thus later when
someone tries to use the slot, it hits the assertion in
ReplicationSlotAcquire() (Assert(s->inactive_since > 0)).
Steps:
--Disable the logical subscriber and let the slot on the publisher get
invalidated due to inactive_timeout.
--Enable the logical subscriber again.
--Restart publisher.
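A rough SQL sketch of those steps, assuming a subscription named "sub"
(the server restart itself happens outside SQL):

    -- On the subscriber: stop the apply worker so the slot on the
    -- publisher becomes inactive.
    ALTER SUBSCRIPTION sub DISABLE;

    -- On the publisher: wait past replication_slot_inactive_timeout,
    -- then force an invalidation check.
    CHECKPOINT;

    -- On the subscriber: re-enable the subscription.
    ALTER SUBSCRIPTION sub ENABLE;

    -- Finally, restart the publisher (e.g., pg_ctl restart); acquiring
    -- the invalidated slot then trips Assert(s->inactive_since > 0).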
a) We should initialize inactive_since when
ReplicationSlotSetInactiveSince() is called from RestoreSlotFromDisk()
even though it is invalidated.
b) And shall we mention in the doc of 'inactive_since' that once the
slot is invalidated, this value will remain unchanged until we shut
down the server, and that on server restart it is initialized to the
start time. Thoughts?
thanks
Shveta
On Wed, Sep 18, 2024 at 2:49 PM shveta malik <shveta.malik@gmail.com> wrote:
Please find the attached v46 patch having changes for the above review
comments and your test review comments and Shveta's review comments.
When the synced slot on the hot standby is marked as invalidated with
'inactive_timeout' due to invalidation of the publisher's failover
slot, it starts showing a NULL 'inactive_since'. Is this intentional
behaviour? I feel inactive_since should be non-NULL here too.
Thoughts?
physical standby:
postgres=# select slot_name, inactive_since, invalidation_reason, failover, synced from pg_replication_slots;
 slot_name |          inactive_since          | invalidation_reason | failover | synced
-----------+----------------------------------+---------------------+----------+--------
 sub2      | 2024-09-18 15:20:04.364998+05:30 |                     | t        | t
 sub3      | 2024-09-18 15:20:04.364953+05:30 |                     | t        | t

After sync of invalidation_reason:
 slot_name | inactive_since | invalidation_reason | failover | synced
-----------+----------------+---------------------+----------+--------
 sub2      |                | inactive_timeout    | t        | t
 sub3      |                | inactive_timeout    | t        | t
thanks
shveta
On Mon, Sep 16, 2024 at 10:41 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Thanks for looking into this.
On Mon, Sep 16, 2024 at 4:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Why raise the ERROR just for timeout invalidation here and why not if
the slot is invalidated for other reasons? This raises the question of
what happens before this patch if the invalid slot is used from places
where we call ReplicationSlotAcquire(). I did a brief code analysis
and found that for StartLogicalReplication(), even if the error won't
occur in ReplicationSlotAcquire(), it would have been caught in
CreateDecodingContext(). I think that is where we should also add this
new error. Similarly, pg_logical_slot_get_changes_guts() and other
logical replication functions should be calling
CreateDecodingContext() which can raise the new ERROR. I am not sure
about how the invalid slots are handled during physical replication,
please check the behavior of that before this patch.
When physical slots are invalidated due to the wal_removed reason, the
failure happens at a much later point for the streaming standbys, while
reading the requested WAL files, like the following:
2024-09-16 16:29:52.416 UTC [876059] FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 000000010000000000000005 has already been removed
2024-09-16 16:29:52.416 UTC [872418] LOG: waiting for WAL to become available at 0/5002000
At this point, despite the slot being invalidated, its wal_status can
still come back to 'unreserved' even from 'lost', and the standby can
catch up if the removed WAL files are copied, either manually or by a
tool/script, to the primary's pg_wal directory. IOW, the physical slots
invalidated due to wal_removed are *somehow* recoverable, unlike the
logical slots.
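FWIW, that state can be observed with a query like the following on the
primary (just a sketch; wal_status and invalidation_reason are existing
pg_replication_slots columns):

    SELECT slot_name, slot_type, active, wal_status, invalidation_reason
    FROM pg_replication_slots
    WHERE slot_type = 'physical';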
IIUC, the invalidation of a slot implies that it is not guaranteed to hold any resources like WAL and XMINs. Does it also imply that the slot must be unusable?
If we can't hold the dead rows against xmin of the invalid slot, then
how can we make it usable even after copying the required WAL?
--
With Regards,
Amit Kapila.
On Wed, Sep 18, 2024 at 3:31 PM shveta malik <shveta.malik@gmail.com> wrote:
On Wed, Sep 18, 2024 at 2:49 PM shveta malik <shveta.malik@gmail.com> wrote:
Please find the attached v46 patch having changes for the above review
comments and your test review comments and Shveta's review comments.
When we promote a hot standby with synced logical slots to become the
new primary, the logical slots are never invalidated with
'inactive_timeout' on the new primary. It seems the check in
SlotInactiveTimeoutCheckAllowed() is wrong. We should allow
invalidation of slots on the primary even if they are marked as 'synced'.
Please see [4].
I have raised 4 issues so far on v46, the first 3 are in [1], [2], [3].
Once all these are addressed, I can continue reviewing further.
[1]: /messages/by-id/CAJpy0uAwxc49Dz6t=-y_-z-MU+A4RWX4BR3Zri_jj2qgGMq_8g@mail.gmail.com
[2]: /messages/by-id/CAJpy0uC6nN3SLbEuCvz7-CpaPdNdXxH=feW5MhYQch-JWV0tLg@mail.gmail.com
[3]: /messages/by-id/CAJpy0uBXXJC6f04+FU1axKaU+p78wN0SEhUNE9XoqbjXj=hhgw@mail.gmail.com
[4]:
--------------------
postgres=# select pg_is_in_recovery();
--------
f
postgres=# show replication_slot_inactive_timeout;
replication_slot_inactive_timeout
-----------------------------------
10s
postgres=# select slot_name, inactive_since, invalidation_reason, synced from pg_replication_slots;
  slot_name  |          inactive_since          | invalidation_reason | synced
-------------+----------------------------------+---------------------+--------
 mysubnew1_1 | 2024-09-19 09:04:09.714283+05:30 |                     | t
postgres=# select now();
now
----------------------------------
2024-09-19 09:06:28.871354+05:30
postgres=# checkpoint;
CHECKPOINT
postgres=# select slot_name, inactive_since, invalidation_reason, synced from pg_replication_slots;
  slot_name  |          inactive_since          | invalidation_reason | synced
-------------+----------------------------------+---------------------+--------
 mysubnew1_1 | 2024-09-19 09:04:09.714283+05:30 |                     | t
--------------------
thanks
Shveta
On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Please find the attached v46 patch having changes for the above review
comments and your test review comments and Shveta's review comments.
Hi,
I’ve reviewed this thread and am interested in working on the
remaining tasks and comments, as well as the future review comments.
However, Bharath, please let me know if you'd prefer to continue with
it.
Attached the rebased v47 patch, which also addresses Peter’s comments
#2, #3, and #4 at [1]. I will try addressing other comments as well in
next versions.
[1]: /messages/by-id/CAHut+Ps=x+2Hq5ue0YppOeDZqgHTnyw=u+vs-qy0JRjKaeJtew@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v47-0001-Introduce-inactive_timeout-based-replication-slo.patch (application/octet-stream)
From 251dd5cb5502b438e40fe636486a4e1a5adbbc7a Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Thu, 7 Nov 2024 12:09:47 +0530
Subject: [PATCH v47] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky, because the amount of WAL a database
generates and the storage allocated for an instance vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named replication_slot_inactive_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are inactive for longer than this amount of
time.
Note that the inactive timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 35 +++
doc/src/sgml/system-views.sgml | 7 +
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 17 +-
src/backend/replication/slot.c | 163 +++++++++--
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 25 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 267 ++++++++++++++++++
13 files changed, 503 insertions(+), 35 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d54f904956..f19abe91b9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4583,6 +4583,41 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is default) disables
+ the inactive timeout invalidation mechanism. This parameter can only
+ be set in the <filename>postgresql.conf</filename> file or on the
+ server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to inactive timeout occurs during checkpoint.
+ The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 61d28e701f..c909ef5bf1 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2618,6 +2618,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for longer than the amount of time specified by the
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index d62186a510..4443bd53b4 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,13 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, &now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 6828100cf1..0076e4b5ea 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -535,9 +537,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot has been
+ * invalidated previously.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +620,22 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot has been
+ * previously invalidated due to inactive timeout.
+ */
+ if (error_if_invalid &&
+ s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ Assert(s->inactive_since > 0);
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".",
+ "replication_slot_inactive_timeout")));
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -703,16 +724,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, &now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, &now, true);
MyReplicationSlot = NULL;
@@ -785,7 +802,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +829,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1508,7 +1525,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1538,6 +1556,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s for longer than the amount of time specified by \"%s\"."),
+ timestamptz_to_str(inactive_since),
+ "replication_slot_inactive_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1554,6 +1581,31 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Is this replication slot allowed for inactive timeout invalidation check?
+ *
+ * Inactive timeout invalidation is allowed only when:
+ *
+ * 1. Inactive timeout is set
+ * 2. Slot is inactive
+ * 3. Server is not in recovery
+ * 4. Slot is not being synced from the primary
+ *
+ * Note that the inactive timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s)
+{
+ return (replication_slot_inactive_timeout > 0 &&
+ s->inactive_since > 0 &&
+ !RecoveryInProgress() &&
+ !s->data.synced);
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1581,6 +1633,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1588,6 +1641,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1598,6 +1652,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT &&
+ SlotInactiveTimeoutCheckAllowed(s))
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1651,6 +1715,25 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (SlotInactiveTimeoutCheckAllowed(s) &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+
+ /*
+ * Invalidation due to inactive timeout implies that
+ * no one is using the slot.
+ */
+ Assert(s->active_pid == 0);
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1676,11 +1759,13 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s &&
+ active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1735,7 +1820,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1781,7 +1867,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1804,6 +1891,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1856,7 +1944,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1914,6 +2003,38 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do inactive_timeout invalidation
+ * of thousands of replication slots here. If it is ever proven that
+ * this assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2208,6 +2329,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2490,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2525,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ ReplicationSlotSetInactiveSince(slot, &now, false);
restored = true;
break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 488a161b3e..578cff64c8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 371eef3ddd..b36ae90b2c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 8a45b5827e..3322848e03 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8a67f01200..367f510118 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time a replication slot can remain inactive before "
+ "it will be invalidated."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 39a3ac2312..7c6ae1baa2 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -336,6 +336,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 45582cf9d8..8ffd06dd9f 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -224,6 +226,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz *now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = *now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -233,6 +254,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -249,7 +271,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..270f87c10a
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,267 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to inactive timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+primary_conninfo = '$connstr dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+my $logstart = -s $standby1->logfile;
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $inactive_timeout = 1;
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$standby1->reload;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time
+sleep($inactive_timeout + 1);
+
+# Despite inactive timeout being set, the synced slot won't get invalidated on
+# its own on the standby. So, we must not see invalidation message in server
+# log.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+$logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# inactive timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $inactive_timeout);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to inactive timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for sync_slot1 invalidation to be synced on standby";
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $inactive_timeout);
+
+# Testcase end
+# =============================================================================
+
+# =============================================================================
+# Testcase start
+# Invalidate logical subscriber slot due to inactive timeout.
+
+my $publisher = $primary;
+
+# Prepare for test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+is($result, qq(5), "check initial copy was done");
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Make subscriber slot on publisher inactive and check for invalidation
+$subscriber->stop;
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become inactive and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $node_name = $node->name;
+
+ # Wait for slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND active = 'f' AND
+ inactive_since IS NOT NULL;
+ ])
+ or die
+ "Timed out while waiting for slot $slot to become inactive on node $node_name";
+
+ trigger_slot_invalidation($node, $slot, $offset, $inactive_timeout);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time to avoid multiple checkpoints
+ sleep($inactive_timeout + 1);
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot\"", $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot invalidation has been logged on node $node_name"
+ );
+
+ # Check that the invalidation reason is 'inactive_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
On Wed, Sep 18, 2024 at 3:31 PM shveta malik <shveta.malik@gmail.com> wrote:
On Wed, Sep 18, 2024 at 2:49 PM shveta malik <shveta.malik@gmail.com> wrote:
Please find the attached v46 patch having changes for the above review
comments and your test review comments and Shveta's review comments.
When the synced slot on the hot standby is marked as invalidated with
'inactive_timeout' due to invalidation of the publisher's failover
slot, it starts showing a NULL 'inactive_since'. Is this intentional
behaviour? I feel inactive_since should be non-NULL here too.
Thoughts?
physical standby:
postgres=# select slot_name, inactive_since, invalidation_reason, failover, synced from pg_replication_slots;
 slot_name |          inactive_since          | invalidation_reason | failover | synced
-----------+----------------------------------+---------------------+----------+--------
 sub2      | 2024-09-18 15:20:04.364998+05:30 |                     | t        | t
 sub3      | 2024-09-18 15:20:04.364953+05:30 |                     | t        | t

After sync of invalidation_reason:
 slot_name | inactive_since | invalidation_reason | failover | synced
-----------+----------------+---------------------+----------+--------
 sub2      |                | inactive_timeout    | t        | t
 sub3      |                | inactive_timeout    | t        | t
For synced slots on the standby, inactive_since indicates the last
synchronization time rather than the time the slot became inactive
(see doc - https://www.postgresql.org/docs/devel/view-pg-replication-slots.html).
In the reported case above, once a synced slot is invalidated we don't
even keep the last synchronization time for it. This is because when a
synced slot on the standby is marked invalid, inactive_since is reset
to NULL each time the slot-sync worker acquires a lock on it. This
lock acquisition before checking invalidation is done to avoid certain
race conditions and will activate the slot temporarily, resetting
inactive_since. Later, the slot-sync worker updates inactive_since for
all synced slots to the current synchronization time. However, for
invalid slots, this update is skipped, as per the patch’s design.
If we want to preserve the inactive_since value for the invalid synced
slots on standby, we need to clarify the time it should display. Here
are three possible approaches:
1) Copy the primary's inactive_since upon invalidation: When a slot
becomes invalid on the primary, the slot-sync worker could copy the
primary slot’s inactive_since to the standby slot and retain it, by
preventing future updates on the standby.
2) Use the current time of standby when the synced slot is marked
invalid for the first time and do not update it in subsequent sync
cycles if the slot is invalid.
Approach (2) seems more reasonable to me; however, both approaches 1)
and 2) contradict the purpose of inactive_since, as it no longer
represents either the true "last sync time" or the "time the slot
became inactive", because the slot-sync worker periodically acquires
locks for syncing and keeps activating the slot.
3) Continuously update inactive_since for invalid synced slots as
well: Treat invalid synced slots like valid ones by updating
inactive_since with each sync cycle. This way, we can keep the "last
sync time" in the inactive_since. However, this could confuse users
when "invalidation_reason=inactive_timeout" is set for a synced slot
on standby but inactive_since would reflect sync time rather than the
time slot became inactive. IIUC, on the primary, when
invalidation_reason=inactive_timeout for a slot, the inactive_since
represents the actual time the slot became inactive before getting
invalidated, unless the primary is restarted.
Thoughts?
--
Thanks,
Nisha
On Thu, 7 Nov 2024 at 15:33, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Please find the attached v46 patch having changes for the above review
comments and your test review comments and Shveta's review comments.
Hi,
I’ve reviewed this thread and am interested in working on the
remaining tasks and comments, as well as the future review comments.
However, Bharath, please let me know if you'd prefer to continue with
it.
Attached the rebased v47 patch, which also addresses Peter's comments
#2, #3, and #4 at [1]. I will try addressing other comments as well in
next versions.
The following crash occurs while upgrading:
2024-11-13 14:19:45.955 IST [44539] LOG: checkpoint starting: time
TRAP: failed Assert("!(*invalidated && SlotIsLogical(s) &&
IsBinaryUpgrade)"), File: "slot.c", Line: 1793, PID: 44539
postgres: checkpointer (ExceptionalCondition+0xbb)[0x555555e305bd]
postgres: checkpointer (+0x63ab04)[0x555555b8eb04]
postgres: checkpointer
(InvalidateObsoleteReplicationSlots+0x149)[0x555555b8ee5f]
postgres: checkpointer (CheckPointReplicationSlots+0x267)[0x555555b8f125]
postgres: checkpointer (+0x1f3ee8)[0x555555747ee8]
postgres: checkpointer (CreateCheckPoint+0x78f)[0x5555557475ee]
postgres: checkpointer (CheckpointerMain+0x632)[0x555555b2f1e7]
postgres: checkpointer (postmaster_child_launch+0x119)[0x555555b30892]
postgres: checkpointer (+0x5e2dc8)[0x555555b36dc8]
postgres: checkpointer (PostmasterMain+0x14bd)[0x555555b33647]
postgres: checkpointer (+0x487f2e)[0x5555559dbf2e]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7ffff6c29d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7ffff6c29e40]
postgres: checkpointer (_start+0x25)[0x555555634c25]
2024-11-13 14:19:45.967 IST [44538] LOG: checkpointer process (PID
44539) was terminated by signal 6: Aborted
This can happen in the following case:
1) Set up a logical replication cluster with enough data so that the
upgrade will take at least a few minutes
2) Stop the publisher node
3) Configure replication_slot_inactive_timeout and checkpoint_timeout
to 30 seconds
4) Upgrade the publisher node.
This is happening because logical replication slots are getting
invalidated during upgrade and there is an assertion which checks that
the slots are not invalidated.
I feel this can be fixed by having a function similar to
check_max_slot_wal_keep_size which will make sure that
replication_slot_inactive_timeout is 0 during upgrade.
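Until such a guard exists, a manual workaround sketch might be to
disable the mechanism on the old cluster before running pg_upgrade
(per the docs in this patch, 0 disables timeout-based invalidation):

    -- On the old cluster, before pg_upgrade:
    ALTER SYSTEM SET replication_slot_inactive_timeout = 0;
    SELECT pg_reload_conf();
    -- Verify:
    SHOW replication_slot_inactive_timeout;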
Regards,
Vignesh
Please find the v48 patch attached.
On Thu, Sep 19, 2024 at 9:40 AM shveta malik <shveta.malik@gmail.com> wrote:
When we promote hot standby with synced logical slots to become new
primary, the logical slots are never invalidated with
'inactive_timeout' on new primary. It seems the check in
SlotInactiveTimeoutCheckAllowed() is wrong. We should allow
invalidation of slots on primary even if they are marked as 'synced'.
fixed.
I have raised 4 issues so far on v46, the first 3 are in [1],[2],[3].
Once all these are addressed, I can continue reviewing further.
Fixed issues reported in [1], [2].
[1]: /messages/by-id/CAJpy0uAwxc49Dz6t=-y_-z-MU+A4RWX4BR3Zri_jj2qgGMq_8g@mail.gmail.com
[2]: /messages/by-id/CAJpy0uC6nN3SLbEuCvz7-CpaPdNdXxH=feW5MhYQch-JWV0tLg@mail.gmail.com
Attachments:
v48-0001-Introduce-inactive_timeout-based-replication-slo.patch (application/octet-stream)
From 5c933e5231bb20e9eb3b0089665de76adb896cca Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Thu, 7 Nov 2024 12:09:47 +0530
Subject: [PATCH v48] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky, because the amount of WAL a database
generates and the storage allocated for an instance vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named replication_slot_inactive_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are inactive for longer than this amount of
time.
Note that the inactive timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 35 +++
doc/src/sgml/system-views.sgml | 12 +-
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 17 +-
src/backend/replication/slot.c | 162 +++++++++--
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 25 +-
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 267 ++++++++++++++++++
13 files changed, 505 insertions(+), 37 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a84e60c09b..25ca232a02 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4585,6 +4585,41 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is default) disables
+ the inactive timeout invalidation mechanism. This parameter can only
+ be set in the <filename>postgresql.conf</filename> file or on the
+ server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to inactive timeout occurs during checkpoint.
+ The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 61d28e701f..d16496a941 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2567,8 +2567,9 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time since the slot has become inactive.
- <literal>NULL</literal> if the slot is currently being used.
- Note that for slots on the standby that are being synced from a
+ <literal>NULL</literal> if the slot is currently being used. Once the
+ slot is invalidated, this value will remain unchanged until the server is
+ shut down. Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the
<structfield>inactive_since</structfield> indicates the last
@@ -2618,6 +2619,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for longer than the amount of time specified by the
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index d62186a510..4443bd53b4 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,13 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, &now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 6828100cf1..df76291b7d 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -535,9 +537,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot has been
+ * invalidated previously.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +620,22 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot has been
+ * previously invalidated due to inactive timeout.
+ */
+ if (error_if_invalid &&
+ s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ Assert(s->inactive_since > 0);
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".",
+ "replication_slot_inactive_timeout")));
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -703,16 +724,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, &now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, &now, true);
MyReplicationSlot = NULL;
@@ -785,7 +802,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +829,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1508,7 +1525,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1538,6 +1556,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s for longer than the amount of time specified by \"%s\"."),
+ timestamptz_to_str(inactive_since),
+ "replication_slot_inactive_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1554,6 +1581,29 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Is this replication slot allowed for inactive timeout invalidation check?
+ *
+ * Inactive timeout invalidation is allowed only when:
+ *
+ * 1. Inactive timeout is set
+ * 2. Slot is inactive
+ * 3. Server is in recovery and slot is not being synced from the primary
+ *
+ * Note that the inactive timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s)
+{
+ return (replication_slot_inactive_timeout > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1581,6 +1631,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1588,6 +1639,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1598,6 +1650,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT &&
+ SlotInactiveTimeoutCheckAllowed(s))
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1651,6 +1713,26 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (SlotInactiveTimeoutCheckAllowed(s) &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+
+ /*
+ * Invalidation due to inactive timeout implies that
+ * no one is using the slot.
+ */
+ Assert(s->active_pid == 0);
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1676,11 +1758,13 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s &&
+ active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1735,7 +1819,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1781,7 +1866,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1804,6 +1890,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1856,7 +1943,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1914,6 +2002,38 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do inactive_timeout invalidation
+ * of thousands of replication slots here. If it is ever proven that
+ * this assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2208,6 +2328,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2489,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2524,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 488a161b3e..578cff64c8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 371eef3ddd..b36ae90b2c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 8a45b5827e..3322848e03 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8a67f01200..367f510118 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time a replication slot can remain inactive before "
+ "it will be invalidated."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 39a3ac2312..7c6ae1baa2 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -336,6 +336,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 45582cf9d8..8ffd06dd9f 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -224,6 +226,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz *now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = *now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -233,6 +254,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int replication_slot_inactive_timeout;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -249,7 +271,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..270f87c10a
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,267 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to inactive timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+primary_conninfo = '$connstr dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+my $logstart = -s $standby1->logfile;
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $inactive_timeout = 1;
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$standby1->reload;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time
+sleep($inactive_timeout + 1);
+
+# Despite inactive timeout being set, the synced slot won't get invalidated on
+# its own on the standby. So, we must not see invalidation message in server
+# log.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+$logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# inactive timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $inactive_timeout);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to inactive timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for sync_slot1 invalidation to be synced on standby";
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $inactive_timeout);
+
+# Testcase end
+# =============================================================================
+
+# =============================================================================
+# Testcase start
+# Invalidate logical subscriber slot due to inactive timeout.
+
+my $publisher = $primary;
+
+# Prepare for test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+is($result, qq(5), "check initial copy was done");
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Make subscriber slot on publisher inactive and check for invalidation
+$subscriber->stop;
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become inactive and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $node_name = $node->name;
+
+ # Wait for slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND active = 'f' AND
+ inactive_since IS NOT NULL;
+ ])
+ or die
+ "Timed out while waiting for slot $slot to become inactive on node $node_name";
+
+ trigger_slot_invalidation($node, $slot, $offset, $inactive_timeout);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time to avoid multiple checkpoints
+ sleep($inactive_timeout + 1);
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot\"", $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot invalidation has been logged on node $node_name"
+ );
+
+ # Check that the invalidation reason is 'inactive_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
On Wed, Sep 18, 2024 at 12:22 PM shveta malik <shveta.malik@gmail.com> wrote:
On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Hi,
Please find the attached v46 patch having changes for the above review
comments and your test review comments and Shveta's review comments.
Thanks for addressing comments.
Is there a reason that we don't support this invalidation on hot
standby for non-synced slots? Shouldn't we support this time-based
invalidation there too just like other invalidations?
I don’t see any reason to *not* support this invalidation on hot
standby for non-synced slots. Therefore, I’ve added the same in v48.
--
Thanks,
Nisha
Hi Nisha.
Thanks for the recent patch updates. Here are my review comments for
the latest patch v48-0001.
======
Commit message
1.
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage for instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
~
What do the words "for instance" mean here? Do they mean "per instance",
"(for example)", or something else?
======
doc/src/sgml/system-views.sgml
2.
<para>
The time since the slot has become inactive.
- <literal>NULL</literal> if the slot is currently being used.
- Note that for slots on the standby that are being synced from a
+ <literal>NULL</literal> if the slot is currently being used. Once the
+ slot is invalidated, this value will remain unchanged until we shutdown
+ the server. Note that for slots on the standby that are being
synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the
Is this change related to the new inactivity timeout feature, or are
you just clarifying the existing behaviour of the 'inactive_since'
field?
Note there is already another thread [1] created to patch/clarify this
same field. So if you are just clarifying existing behavior then IMO
it would be better if you can try to get your desired changes
included there quickly before that other patch gets pushed.
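As a quick aid while reviewing this field, its behaviour can be observed with
a simple query (a sketch only; 'sb_slot1' is a placeholder slot name):

    SELECT slot_name, active, inactive_since, invalidation_reason
    FROM pg_replication_slots
    WHERE slot_name = 'sb_slot1';

On an idle slot inactive_since keeps its last value; once the slot is
invalidated, the patch intends it to stay unchanged until the server restarts.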
~~~
3.
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has been
+ inactive for longer than the amount of time specified by the
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
Maybe there is a slightly shorter/simpler way to express this. For example,
BEFORE
inactive_timeout means that the slot has been inactive for longer than
the amount of time specified by the replication_slot_inactive_timeout
parameter.
SUGGESTION
inactive_timeout means that the slot has remained inactive beyond the
duration specified by the replication_slot_inactive_timeout parameter.
======
src/backend/replication/slot.c
4.
+int replication_slot_inactive_timeout = 0;
IMO it would be more informative to give the units in the variable
name (but not in the GUC name). e.g.
'replication_slot_inactive_timeout_secs'.
~~~
ReplicationSlotAcquire:
5.
+ *
+ * An error is raised if error_if_invalid is true and the slot has been
+ * invalidated previously.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
This function comment makes it seem like "invalidated previously"
might mean *any* kind of invalidation, but later in the body of the
function we find the logic is really only used for inactive timeout.
+ /*
+ * An error is raised if error_if_invalid is true and the slot has been
+ * previously invalidated due to inactive timeout.
+ */
So, I think a better name for that parameter might be
'error_if_inactive_timeout'
OTOH, if it really is supposed to error for *any* kind of invalidation
then there needs to be more ereports.
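For context, the user-visible path that hits this error is, e.g., advancing
or decoding from an already-invalidated slot (a sketch mirroring the TAP
test; the slot name is a placeholder):

    SELECT pg_replication_slot_advance('lsub1_slot', '0/1');
    -- expected to fail with:
    -- ERROR:  can no longer get changes from replication slot "lsub1_slot"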
~~~
6.
+ errdetail("This slot has been invalidated because it was inactive
for longer than the amount of time specified by \"%s\".",
This errdetail message seems quite long. I think it can be shortened
like below and still retain exactly the same meaning:
BEFORE:
This slot has been invalidated because it was inactive for longer than
the amount of time specified by \"%s\".
SUGGESTION:
This slot has been invalidated due to inactivity exceeding the time
limit set by "%s".
~~~
ReportSlotInvalidation:
7.
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s for longer than the amount of
time specified by \"%s\"."),
+ timestamptz_to_str(inactive_since),
+ "replication_slot_inactive_timeout");
+ break;
Here also, as in review comment #6 above, I think the message can be
shorter and still say the same thing:
BEFORE:
_("The slot has been inactive since %s for longer than the amount of
time specified by \"%s\"."),
SUGGESTION:
_("The slot has been inactive since %s, exceeding the time limit set
by \"%s\"."),
~~~
SlotInactiveTimeoutCheckAllowed:
8.
+/*
+ * Is this replication slot allowed for inactive timeout invalidation check?
+ *
+ * Inactive timeout invalidation is allowed only when:
+ *
+ * 1. Inactive timeout is set
+ * 2. Slot is inactive
+ * 3. Server is in recovery and slot is not being synced from the primary
+ *
+ * Note that the inactive timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
8a.
Somehow that first sentence seems strange. Would it be better to write it like:
SUGGESTION
Can this replication slot timeout due to inactivity?
~
8b.
AFAICT that reason 3 ("Server is in recovery and slot is not being
synced from the primary") seems not quite worded right...
Should it say more like:
The slot is not being synced from the primary while the server is in recovery
or maybe like:
The slot is not currently being synced from the primary (i.e.,
'synced' is not true while the server is in recovery)
~
8c.
Similarly, I think something about that "Note that the inactive
timeout invalidation mechanism is not applicable..." paragraph needs
tweaking because IMO that should also now be saying something about
'RecoveryInProgress'.
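For reviewers, the slots that this exclusion targets are the ones reporting
synced = true on the standby; they can be listed with, e.g. (a sketch only):

    SELECT slot_name, synced, inactive_since
    FROM pg_replication_slots
    WHERE synced;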
~~~
9.
+static inline bool
+SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s)
Maybe the function name should be 'IsSlotInactiveTimeoutPossible' or
something better.
~~~
InvalidatePossiblyObsoleteSlot:
10.
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
Since there are no other blank lines anywhere in this switch, the
introduction of this one in v48 looks out of place to me. IMO it would
be more readable if a blank line followed each of the breaks,
but that is not a necessary change for this patch, so...
~~~
11.
+ /*
+ * Invalidation due to inactive timeout implies that
+ * no one is using the slot.
+ */
+ Assert(s->active_pid == 0);
Given this assertion, does it mean that "(s->active_pid == 0)" should
have been another condition done up-front in the function
'SlotInactiveTimeoutCheckAllowed'?
~~~
12.
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s &&
+ active_pid == MyProcPid))
I wasn't sure how this change belongs to this patch, because per the
previous review comment, for the case of invalidation due to
inactivity, active_pid must be 0, e.g. Assert(s->active_pid == 0);
~~~
RestoreSlotFromDisk:
13.
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
In v47 this assignment used to call the function
'ReplicationSlotSetInactiveSince'. I recognise there is a very subtle
difference between direct assignment and the function, because the
function will skip assignment if the slot is already invalidated.
Anyway, if you are *deliberately* not wanting to call
ReplicationSlotSetInactiveSince here then I think this assignment
should be commented to explain the reason why not, otherwise someone
in the future might be tempted to think it was just an oversight and
add the call back in that you don't want.
======
src/test/recovery/t/050_invalidate_slots.pl
14.
+# Despite inactive timeout being set, the synced slot won't get invalidated on
+# its own on the standby. So, we must not see invalidation message in server
+# log.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
But now we are confirming this in another way, not by checking the
logs here, so the comment "So, we must not see invalidation message in
server log." is no longer appropriate.
======
[1]: /messages/by-id/CAA4eK1JQFdssaBBh-oQskpKM-UpG8jPyUdtmGWa_0qCDy+K7_A@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, 13 Nov 2024 at 15:00, Nisha Moond <nisha.moond412@gmail.com> wrote:
Please find the v48 patch attached.
On Thu, Sep 19, 2024 at 9:40 AM shveta malik <shveta.malik@gmail.com> wrote:
When we promote hot standby with synced logical slots to become new
primary, the logical slots are never invalidated with
'inactive_timeout' on new primary. It seems the check in
SlotInactiveTimeoutCheckAllowed() is wrong. We should allow
invalidation of slots on primary even if they are marked as 'synced'.
Fixed.
I have raised 4 issues so far on v46, the first 3 are in [1],[2],[3].
Once all these are addressed, I can continue reviewing further.
Fixed issues reported in [1], [2].
Few comments:
1) Since we don't change the value of now in
ReplicationSlotSetInactiveSince, the function parameter can be passed
by value:
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz *now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = *now;
2) Currently it allows a minimum value of less than 1 second, i.e. in
milliseconds; I feel we should have some minimum value, at least something
like checkpoint_timeout:
diff --git a/src/backend/utils/misc/guc_tables.c
b/src/backend/utils/misc/guc_tables.c
index 8a67f01200..367f510118 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP,
REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time a
replication slot can remain inactive before "
+ "it will be invalidated."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },
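For reference, this is the kind of small value the current patch and its TAP
test already accept (a sketch only, not a recommendation):

    ALTER SYSTEM SET replication_slot_inactive_timeout = '1s';
    SELECT pg_reload_conf();

Any such floor would need to be enforced via the GUC's minimum value, which is
currently 0.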
3) Since the SlotInactiveTimeoutCheckAllowed check is already done above and
the current time has been retrieved, can we use the "now" variable instead
of calling SlotInactiveTimeoutCheckAllowed a second time:
@@ -1651,6 +1713,26 @@
InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (SlotInactiveTimeoutCheckAllowed(s) &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
4) I'm not sure if this change is required by this patch or is a
general optimization; if it is required for this patch, we can expand
the comments:
@@ -2208,6 +2328,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2489,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be
\"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2524,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
5) Why should the slot's inactive_since be updated during shutdown?
Shouldn't the inactive_since value remain intact during shutdown?
- <literal>NULL</literal> if the slot is currently being used.
- Note that for slots on the standby that are being synced from a
+ <literal>NULL</literal> if the slot is currently being used. Once the
+ slot is invalidated, this value will remain unchanged until we shutdown
+ the server. Note that for slots on the standby that are being
synced from a
6) The new style of ereport does not need braces around errcode; it can be
changed accordingly:
+ if (error_if_invalid &&
+ s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ Assert(s->inactive_since > 0);
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".",
+ "replication_slot_inactive_timeout")));
Regards,
Vignesh
Attached is the v49 patch set:
- Fixed the bug reported in [1].
- Addressed comments in [2] and [3].
I've split the patch into two, implementing the suggested idea in
comment #5 of [2] separately in 001:
Patch-001: Adds additional error reports (for all invalidation types)
in ReplicationSlotAcquire() for invalid slots when error_if_invalid =
true.
Patch-002: The original patch with comments addressed.
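Note that with 001, only the change-consuming paths pass error_if_invalid =
true; dropping an invalidated slot still works as before, for example
(a sketch; the slot name is a placeholder):

    SELECT pg_drop_replication_slot('lsub1_slot');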
~~~~
[1]: /messages/by-id/CALDaNm2mwkVFLfe8pLcU1W5Oy1vRr1Wzp53XGV08kr4Z2=SJpA@mail.gmail.com
[2]: /messages/by-id/CAHut+Pt6s-qNPdxH5=-fr2QKLEv0h16sQ8EvLiGJ-SdQNS6pbw@mail.gmail.com
[3]: /messages/by-id/CALDaNm2VQW_gpOJ-QWkEA_h18DN31ELEz2_7QmwWCAg9=Zew4A@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v49-0001-Add-error-handling-while-acquiring-a-replication.patchapplication/octet-stream; name=v49-0001-Add-error-handling-while-acquiring-a-replication.patchDownload
From 795a0c837329ec376f1214c4278ca95a2378e8f3 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:13:26 +0530
Subject: [PATCH v49 1/2] Add error handling while acquiring a replication slot
In ReplicationSlotAcquire(), raise an error for invalid slots if caller
specify error_if_invalid=true.
---
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 48 +++++++++++++++++--
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
8 files changed, 55 insertions(+), 12 deletions(-)
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index d62186a510..69a32422bf 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 6828100cf1..0cbfd4fbaf 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -535,9 +535,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot has been
+ * invalidated previously.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +618,45 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot has been
+ * previously invalidated.
+ */
+ if (error_if_invalid &&
+ s->data.invalidated != RS_INVAL_NONE)
+ {
+ StringInfoData err_detail;
+
+ initStringInfo(&err_detail);
+ appendStringInfo(&err_detail, _("This slot has been invalidated because "));
+
+ switch (s->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfo(&err_detail, _("the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfo(&err_detail, _("the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ appendStringInfo(&err_detail, _("wal_level is insufficient for slot."));
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail_internal("%s", err_detail.data));
+
+ pfree(err_detail.data);
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +827,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +854,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 488a161b3e..578cff64c8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 371eef3ddd..b36ae90b2c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 8a45b5827e..3322848e03 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 45582cf9d8..2c325fd942 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -249,7 +249,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index efb4ba3af1..333e040e7f 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This slot has been invalidated because the required WAL has been removed",
$logstart))
{
$failed = 1;
--
2.34.1
v49-0002-Introduce-inactive_timeout-based-replication-slo.patchapplication/octet-stream; name=v49-0002-Introduce-inactive_timeout-based-replication-slo.patchDownload
From 2b8637c7654676c8c935506e986344a0b1c36133 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:54:50 +0530
Subject: [PATCH v49 2/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named replication_slot_inactive_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are inactive for longer than this amount of
time.
Note that the inactive timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 35 +++
doc/src/sgml/system-views.sgml | 12 +-
src/backend/replication/logical/slotsync.c | 13 +-
src/backend/replication/slot.c | 166 +++++++++--
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 22 ++
src/include/utils/guc_hooks.h | 2 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 266 ++++++++++++++++++
10 files changed, 503 insertions(+), 27 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a84e60c09b..25ca232a02 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4585,6 +4585,41 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is default) disables
+ the inactive timeout invalidation mechanism. This parameter can only
+ be set in the <filename>postgresql.conf</filename> file or on the
+ server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to inactive timeout occurs during checkpoint.
+ The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 61d28e701f..7eec92eac9 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2567,8 +2567,9 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time since the slot has become inactive.
- <literal>NULL</literal> if the slot is currently being used.
- Note that for slots on the standby that are being synced from a
+ <literal>NULL</literal> if the slot is currently being used. Once the
+ slot is invalidated, this value will remain unchanged until we shutdown
+ the server. Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the
<structfield>inactive_since</structfield> indicates the last
@@ -2618,6 +2619,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has remained
+ inactive beyond the duration specified by the
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 69a32422bf..4dc2811f10 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,13 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 0cbfd4fbaf..6dd5d113a0 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout_sec = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -644,6 +646,11 @@ retry:
appendStringInfo(&err_detail, _("wal_level is insufficient for slot."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfo(&err_detail, _("inactivity exceeded the time limit set by \"%s\"."),
+ "replication_slot_inactive_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -745,16 +752,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1550,7 +1553,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1580,6 +1584,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s, exceeding the time limit set by \"%s\"."),
+ timestamptz_to_str(inactive_since),
+ "replication_slot_inactive_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1596,6 +1609,30 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Is inactive timeout invalidation possible for this replication slot?
+ *
+ * Inactive timeout invalidation is allowed only when:
+ *
+ * 1. Inactive timeout is set
+ * 2. Slot is inactive
+ * 3. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the inactive timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+IsSlotInactiveTimeoutPossible(ReplicationSlot *s)
+{
+ return (replication_slot_inactive_timeout_sec > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1623,6 +1660,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1630,6 +1668,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1640,6 +1679,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT &&
+ IsSlotInactiveTimeoutPossible(s))
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1693,6 +1742,26 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (now &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout_sec * 1000))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+
+ /*
+ * Invalidation due to inactive timeout implies that
+ * no one is using the slot.
+ */
+ Assert(s->active_pid == 0);
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1718,11 +1787,13 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s &&
+ active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1777,7 +1848,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1823,7 +1895,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1846,6 +1919,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1898,7 +1972,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1956,6 +2031,38 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do inactive_timeout invalidation
+ * of thousands of replication slots here. If it is ever proven that
+ * this assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2250,6 +2357,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2410,6 +2518,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2440,9 +2551,11 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
@@ -2845,3 +2958,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for replication_slot_inactive_timeout
+ *
+ * We don't allow the value of replication_slot_inactive_timeout other than 0
+ * during the binary upgrade.
+ */
+bool
+check_replication_slot_inactive_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("\"%s\" must be set to 0 during binary upgrade mode.",
+ "replication_slot_inactive_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8a67f01200..f5a6bf4c61 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time a replication slot can remain inactive before "
+ "it will be invalidated."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout_sec,
+ 0, 0, INT_MAX,
+ check_replication_slot_inactive_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 39a3ac2312..7c6ae1baa2 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -336,6 +336,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 2c325fd942..98f858659c 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -224,6 +226,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -233,6 +254,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int replication_slot_inactive_timeout_sec;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 5813dba0a2..e36b6cfe21 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_replication_slot_inactive_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..e4f015e302
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,266 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to inactive timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+primary_conninfo = '$connstr dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+my $logstart = -s $standby1->logfile;
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $inactive_timeout = 1;
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$standby1->reload;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time
+sleep($inactive_timeout + 1);
+
+# Despite inactive timeout being set, the synced slot won't get invalidated on
+# its own on the standby.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+$logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout}s';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# inactive timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $inactive_timeout);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to inactive timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for sync_slot1 invalidation to be synced on standby";
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $inactive_timeout);
+
+# Testcase end
+# =============================================================================
+
+# =============================================================================
+# Testcase start
+# Invalidate logical subscriber slot due to inactive timeout.
+
+my $publisher = $primary;
+
+# Prepare for test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
+
+# Create subscriber
+my $subscriber = PostgreSQL::Test::Cluster->new('sub');
+$subscriber->init;
+$subscriber->start;
+
+# Create tables
+$publisher->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some data
+$publisher->safe_psql('postgres',
+ "INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $publisher->connstr . ' dbname=postgres';
+$publisher->safe_psql('postgres', "CREATE PUBLICATION pub FOR ALL TABLES");
+$publisher->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_logical_replication_slot(slot_name := 'lsub1_slot', plugin := 'pgoutput');
+]);
+
+$subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub1_slot', create_slot = false)"
+);
+$subscriber->wait_for_subscription_sync($publisher, 'sub');
+my $result =
+ $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+is($result, qq(5), "check initial copy was done");
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO ' ${inactive_timeout}s';
+]);
+$publisher->reload;
+
+$logstart = -s $publisher->logfile;
+
+# Make subscriber slot on publisher inactive and check for invalidation
+$subscriber->stop;
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+ $inactive_timeout);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become inactive and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $node_name = $node->name;
+
+ # Wait for slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND active = 'f' AND
+ inactive_since IS NOT NULL;
+ ])
+ or die
+ "Timed out while waiting for slot $slot to become inactive on node $node_name";
+
+ trigger_slot_invalidation($node, $slot, $offset, $inactive_timeout);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time to avoid multiple checkpoints
+ sleep($inactive_timeout + 1);
+
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot\"", $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot invalidation has been logged on node $node_name"
+ );
+
+ # Check that the invalidation reason is 'inactive_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
On Thu, Nov 14, 2024 at 5:29 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha.
Thanks for the recent patch updates. Here are my review comments for
the latest patch v48-0001.
Thank you for the review. Comments are addressed in the v49 version.
Below is my response to comments that may require further discussion.
======
doc/src/sgml/system-views.sgml

2.
 <para>
  The time since the slot has become inactive.
- <literal>NULL</literal> if the slot is currently being used.
- Note that for slots on the standby that are being synced from a
+ <literal>NULL</literal> if the slot is currently being used. Once the
+ slot is invalidated, this value will remain unchanged until we shutdown
+ the server. Note that for slots on the standby that are being synced from a
  primary server (whose <structfield>synced</structfield> field is
  <literal>true</literal>), the

Is this change related to the new inactivity timeout feature or are
you just clarifying the existing behaviour of the 'inactive_since'
field?
Yes, this patch introduces inactive_timeout invalidation and prevents
updates to inactive_since for invalid slots. Only a node restart can
modify it, so I believe we should retain these lines in this patch.
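For what it's worth, the behaviour is easy to observe from SQL once a
slot has been invalidated; for example (the slot name here is only
illustrative):

-- Repeated queries keep showing the same inactive_since for an
-- invalidated slot until the server is restarted.
SELECT slot_name, active, inactive_since, invalidation_reason
FROM pg_replication_slots
WHERE slot_name = 'test_slot';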
Note there is already another thread [1] created to patch/clarify this
same field. So if you are just clarifying existing behavior then IMO
it would be better if you can try to get your desired changes
included there quickly before that other patch gets pushed.
Thanks for the reference, I have posted my suggestion on the thread.
ReplicationSlotAcquire:
5.
+ *
+ * An error is raised if error_if_invalid is true and the slot has been
+ * invalidated previously.
  */
 void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)

This function comment makes it seem like "invalidated previously"
might mean *any* kind of invalidation, but later in the body of the
function we find the logic is really only used for inactive timeout.

+ /*
+  * An error is raised if error_if_invalid is true and the slot has been
+  * previously invalidated due to inactive timeout.
+  */

So, I think a better name for that parameter might be
'error_if_inactive_timeout'.

OTOH, if it really is supposed to error for *any* kind of invalidation
then there needs to be more ereports.
+1 to the idea.
I have created a separate patch v49-0001 adding more ereports for all
kinds of invalidations.
~~~
SlotInactiveTimeoutCheckAllowed:

8.
+/*
+ * Is this replication slot allowed for inactive timeout invalidation check?
+ *
+ * Inactive timeout invalidation is allowed only when:
+ *
+ * 1. Inactive timeout is set
+ * 2. Slot is inactive
+ * 3. Server is in recovery and slot is not being synced from the primary
+ *
+ * Note that the inactive timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */

8a.
Somehow that first sentence seems strange. Would it be better to write it like:

SUGGESTION
Can this replication slot timeout due to inactivity?
I feel the suggestion is not very clear on the purpose of the
function. This function doesn't check inactivity or decide slot
timeout invalidation. It only pre-checks if the slot qualifies for an
inactivity check, which the caller will perform.
As I have also changed the function name as per comment #9, I used the following -
"Is inactive timeout invalidation possible for this replication slot?"
Thoughts?
~
8c.
Similarly, I think something about that "Note that the inactive
timeout invalidation mechanism is not applicable..." paragraph needs
tweaking because IMO that should also now be saying something about
'RecoveryInProgress'.
'RecoveryInProgress' check indicates that the server is a standby, and
the mentioned paragraph uses the term "standby" to describe the
condition. It seems unnecessary to mention RecoveryInProgress
separately.
~~~
InvalidatePossiblyObsoleteSlot:
10.
 break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ /*
+  * Check if the slot needs to be invalidated due to
+  * replication_slot_inactive_timeout GUC.
+  */

Since there are no other blank lines anywhere in this switch, the
introduction of this one in v48 looks out of place to me.
pgindent automatically added this blank line after 'case
RS_INVAL_INACTIVE_TIMEOUT'.
IMO it would
be more readable if a blank line followed each/every of the breaks,
but then that is not a necessary change for this patch so...
Since it's not directly related to the patch, I feel it might be best
to leave it as is for now.
~~~
11.
+ /*
+  * Invalidation due to inactive timeout implies that
+  * no one is using the slot.
+  */
+ Assert(s->active_pid == 0);

Given this assertion, does it mean that "(s->active_pid == 0)" should
have been another condition done up-front in the function
'SlotInactiveTimeoutCheckAllowed'?
I don't think it's a good idea to check (s->active_pid == 0) upfront,
before the timeout-invalidation check. AFAIU, this assertion is meant
to ensure active_pid = 0 only if the slot is going to be invalidated,
i.e., when the following condition is true:
TimestampDifferenceExceeds(s->inactive_since, now,
replication_slot_inactive_timeout_sec * 1000)
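To spell the condition out, it is roughly equivalent to the following
SQL, which is only an illustrative analogue of the C check and not what
the checkpointer actually executes:

-- Slots that would qualify for inactive_timeout invalidation, roughly:
SELECT slot_name
FROM pg_replication_slots
WHERE NOT active
  AND inactive_since IS NOT NULL
  AND invalidation_reason IS NULL
  AND current_setting('replication_slot_inactive_timeout') <> '0'
  AND now() - inactive_since >
      current_setting('replication_slot_inactive_timeout')::interval;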
Thoughts? Open to others' opinions too.
~~~
12.
 /*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
  */
- if (active_pid == 0)
+ if (active_pid == 0 ||
+     (MyReplicationSlot == s &&
+      active_pid == MyProcPid))

I wasn't sure how this change belongs to this patch, because the logic
of the previous review comment said for the case of invalidation due
to inactivity that active_pid must be 0, e.g. Assert(s->active_pid == 0);
I don't fully understand the purpose of this change yet. I'll look
into it further and get back.
~~~
RestoreSlotFromDisk:
13.
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;

In v47 this assignment used to call the function
'ReplicationSlotSetInactiveSince'. I recognise there is a very subtle
difference between direct assignment and the function, because the
function will skip assignment if the slot is already invalidated.
Anyway, if you are *deliberately* not wanting to call
ReplicationSlotSetInactiveSince here then I think this assignment
should be commented to explain the reason why not, otherwise someone
in the future might be tempted to think it was just an oversight and
add the call back in that you don't want.
Added a comment saying to avoid using ReplicationSlotSetInactiveSince()
here, as it will skip setting the time for invalid slots.
~~~~
--
Thanks,
Nisha
On Thu, Nov 14, 2024 at 9:14 AM vignesh C <vignesh21@gmail.com> wrote:
On Wed, 13 Nov 2024 at 15:00, Nisha Moond <nisha.moond412@gmail.com> wrote:
Please find the v48 patch attached.
On Thu, Sep 19, 2024 at 9:40 AM shveta malik <shveta.malik@gmail.com> wrote:
When we promote hot standby with synced logical slots to become new
primary, the logical slots are never invalidated with
'inactive_timeout' on new primary. It seems the check in
SlotInactiveTimeoutCheckAllowed() is wrong. We should allow
invalidation of slots on primary even if they are marked as 'synced'.

Fixed.
I have raised 4 issues so far on v46, the first 3 are in [1],[2],[3].
Once all these are addressed, I can continue reviewing further.

Fixed issues reported in [1], [2].
Few comments:
Thanks for the review.
2) Currently it allows a minimum value of less than 1 second like in
milliseconds, I feel we can have some minimum value at least something
like checkpoint_timeout:

diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8a67f01200..367f510118 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] =
 NULL, NULL, NULL
 },
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time a replication slot can remain inactive before "
+ "it will be invalidated."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },

Currently, the feature is disabled by default when
replication_slot_inactive_timeout = 0. However, if we set a minimum
value, the default_val cannot be less than min_val, making it
impossible to use 0 to disable the feature.
Thoughts or any suggestions?
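To spell out the current behaviour under discussion (the values below
are only illustrative):

-- 0, the default, disables the mechanism entirely:
ALTER SYSTEM SET replication_slot_inactive_timeout = 0;
SELECT pg_reload_conf();

-- any positive value, e.g. one day, enables it:
ALTER SYSTEM SET replication_slot_inactive_timeout = '1d';
SELECT pg_reload_conf();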
4) I'm not sure if this change is required by this patch or if it is a
general optimization; if it is required for this patch we can detail
the comments:
@@ -2208,6 +2328,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;

 /* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2489,9 @@ RestoreSlotFromDisk(const char *name)
 NameStr(cp.slotdata.name)),
 errhint("Change \"wal_level\" to be \"replica\" or higher.")));

+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
 /* nothing can be active yet, don't lock anything */
 for (i = 0; i < max_replication_slots; i++)
 {
@@ -2400,7 +2524,7 @@ RestoreSlotFromDisk(const char *name)
 * slot from the disk into memory. Whoever acquires the slot i.e.
 * makes the slot active will reset it.
 */
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
After removing the "ReplicationSlotSetInactiveSince" from here, it
became irrelevant to this patch. Now, it is a general optimization to
set the same timestamp for all slots while restoring from disk. I have
added a few comments as per Peter's suggestion.
5) Why should the slot invalidation be updated during shutdown,
shouldn't the inactive_since value be intact during shutdown?

- <literal>NULL</literal> if the slot is currently being used.
- Note that for slots on the standby that are being synced from a
+ <literal>NULL</literal> if the slot is currently being used. Once the
+ slot is invalidated, this value will remain unchanged until we shutdown
+ the server. Note that for slots on the standby that are being synced from a
The "inactive_since" data of a slot is not stored on disk, so the
older value cannot be restored after a restart.
--
Thanks,
Nisha
On Tue, 19 Nov 2024 at 12:43, Nisha Moond <nisha.moond412@gmail.com> wrote:
Attached is the v49 patch set:
- Fixed the bug reported in [1].
- Addressed comments in [2] and [3].

I've split the patch into two, implementing the suggested idea in
comment #5 of [2] separately in 001:

Patch-001: Adds additional error reports (for all invalidation types)
in ReplicationSlotAcquire() for invalid slots when error_if_invalid =
true.
Patch-002: The original patch with comments addressed.
Few comments:
1) I felt this check in wait_for_slot_invalidation is not required as
there is a call to trigger_slot_invalidation which sleeps for
inactive_timeout seconds and ensures a checkpoint is triggered; also,
the test passes without this:
+ # Wait for slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND active = 'f' AND
+ inactive_since IS NOT NULL;
+ ])
+ or die
+ "Timed out while waiting for slot $slot to become inactive
on node $node_name";
2) Instead of calling this in a loop, won't it be enough to call
checkpoint only once explicitly:
+ for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+ {
+ $node->safe_psql('postgres', "CHECKPOINT");
+ if ($node->log_contains(
+ "invalidating obsolete replication slot \"$slot\"", $offset))
+ {
+ $invalidated = 1;
+ last;
+ }
+ usleep(100_000);
+ }
+ ok($invalidated,
+ "check that slot $slot invalidation has been logged on node $node_name"
+ );
3) Since pg_sync_replication_slots is a sync call, we can directly use
"is( $standby1->safe_psql('postgres', SELECT COUNT(slot_name) = 1 FROM
pg_replication_slots..." instead of poll_query_until:
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND
+ invalidation_reason = 'inactive_timeout';
+])
+ or die
+ "Timed out while waiting for sync_slot1 invalidation to be synced
on standby";
4) Since this variable is being referred to at many places, how about
changing it to inactive_timeout_1s so that it is easier while
reviewing across many places:
# Set timeout GUC on the standby to verify that the next checkpoint will not
# invalidate synced slots.
my $inactive_timeout = 1;
5) Since we have already tested invalidation of logical replication
slot 'sync_slot1' above, this test might not be required:
+# =============================================================================
+# Testcase start
+# Invalidate logical subscriber slot due to inactive timeout.
+
+my $publisher = $primary;
+
+# Prepare for test
+$publisher->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '0';
+]);
+$publisher->reload;
Regards,
Vignesh
On Tue, 19 Nov 2024 at 12:43, Nisha Moond <nisha.moond412@gmail.com> wrote:
Attached is the v49 patch set:
- Fixed the bug reported in [1].
- Addressed comments in [2] and [3].

I've split the patch into two, implementing the suggested idea in
comment #5 of [2] separately in 001:

Patch-001: Adds additional error reports (for all invalidation types)
in ReplicationSlotAcquire() for invalid slots when error_if_invalid =
true.
Patch-002: The original patch with comments addressed.
This Assert can fail:
+ /*
+  * Check if the slot needs to be invalidated due to
+  * replication_slot_inactive_timeout GUC.
+  */
+ if (now &&
+     TimestampDifferenceExceeds(s->inactive_since, now,
+                                replication_slot_inactive_timeout_sec * 1000))
+ {
+     invalidation_cause = cause;
+     inactive_since = s->inactive_since;
+
+     /*
+      * Invalidation due to inactive timeout implies that
+      * no one is using the slot.
+      */
+     Assert(s->active_pid == 0);
With the following scenario:
Set replication_slot_inactive_timeout to 10 seconds
-- Create a slot
postgres=# select pg_create_logical_replication_slot ('test',
'pgoutput', true, true);
pg_create_logical_replication_slot
------------------------------------
(test,0/1748068)
(1 row)
-- Wait for 10 seconds and execute checkpoint
postgres=# checkpoint;
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
The assert fails:
#5 0x00005b074f0c922f in ExceptionalCondition
(conditionName=0x5b074f2f0b4c "s->active_pid == 0",
fileName=0x5b074f2f0010 "slot.c", lineNumber=1762) at assert.c:66
#6 0x00005b074ee26ead in InvalidatePossiblyObsoleteSlot
(cause=RS_INVAL_INACTIVE_TIMEOUT, s=0x740925361780, oldestLSN=0,
dboid=0, snapshotConflictHorizon=0, invalidated=0x7fffaee87e63) at
slot.c:1762
#7 0x00005b074ee273b2 in InvalidateObsoleteReplicationSlots
(cause=RS_INVAL_INACTIVE_TIMEOUT, oldestSegno=0, dboid=0,
snapshotConflictHorizon=0) at slot.c:1952
#8 0x00005b074ee27678 in CheckPointReplicationSlots
(is_shutdown=false) at slot.c:2061
#9 0x00005b074e9dfda7 in CheckPointGuts (checkPointRedo=24412528,
flags=108) at xlog.c:7513
#10 0x00005b074e9df4ad in CreateCheckPoint (flags=108) at xlog.c:7179
#11 0x00005b074edc6bfc in CheckpointerMain (startup_data=0x0,
startup_data_len=0) at checkpointer.c:463
Regards,
Vignesh
On Tue, 19 Nov 2024 at 12:51, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Thu, Nov 14, 2024 at 9:14 AM vignesh C <vignesh21@gmail.com> wrote:
On Wed, 13 Nov 2024 at 15:00, Nisha Moond <nisha.moond412@gmail.com> wrote:
Please find the v48 patch attached.
2) Currently it allows a minimum value of less than 1 second like in
milliseconds, I feel we can have some minimum value at least something
like checkpoint_timeout:

diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8a67f01200..367f510118 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] =
 NULL, NULL, NULL
 },
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time a replication slot can remain inactive before "
+ "it will be invalidated."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout,
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL
+ },

Currently, the feature is disabled by default when
replication_slot_inactive_timeout = 0. However, if we set a minimum
value, the default_val cannot be less than min_val, making it
impossible to use 0 to disable the feature.
Thoughts or any suggestions?
We could implement this similarly to how the vacuum_buffer_usage_limit
GUC is handled. Setting the value to 0 would allow the operation to
use any amount of shared_buffers. Otherwise, valid sizes would range
from 128 kB to 16 GB. Similarly, we can modify
check_replication_slot_inactive_timeout to behave in the same way as
the check_vacuum_buffer_usage_limit function does.
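For reference, this is how vacuum_buffer_usage_limit behaves today
(the session-level settings below are only illustrative of the ranges
described above):

SET vacuum_buffer_usage_limit = 0;        -- 0 means the operation may use any number of shared buffers
SET vacuum_buffer_usage_limit = '256kB';  -- otherwise sizes from 128 kB to 16 GB are accepted
SET vacuum_buffer_usage_limit = '64kB';   -- should be rejected by its check hook, below the 128 kB minimum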
Regards,
Vignesh
On Wed, Nov 20, 2024 at 1:29 PM vignesh C <vignesh21@gmail.com> wrote:
On Tue, 19 Nov 2024 at 12:43, Nisha Moond <nisha.moond412@gmail.com> wrote:
Attached is the v49 patch set:
- Fixed the bug reported in [1].
- Addressed comments in [2] and [3].

I've split the patch into two, implementing the suggested idea in
comment #5 of [2] separately in 001:

Patch-001: Adds additional error reports (for all invalidation types)
in ReplicationSlotAcquire() for invalid slots when error_if_invalid =
true.
Patch-002: The original patch with comments addressed.

This Assert can fail:
Attached v50 patch-set addressing review comments in [1] and [2].

Regarding the assert issue reported in [2]:
- For temporary replication slots, the current session's pid serves as
the active_pid for the slot, which is expected behavior.
- Therefore, the ASSERT has been removed in v50. Now, if a temporary
slot qualifies for a timeout invalidation, the holding process will be
terminated, and the slot will be invalidated.
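A quick way to see this (the slot name is only illustrative): a
temporary slot is pinned by the session that created it, so its
active_pid stays set to that backend's pid for the slot's whole
lifetime.

-- In the creating session:
SELECT pg_create_logical_replication_slot('tmp_slot', 'pgoutput', temporary := true);

-- From any session; active_pid reports the creating backend's pid:
SELECT slot_name, temporary, active, active_pid
FROM pg_replication_slots
WHERE slot_name = 'tmp_slot';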
[1]: /messages/by-id/CALDaNm2UUTfJczjR-rEQwKgmx=iFnuMnR1cXv7ccB+O9P15mYg@mail.gmail.com
[2]: /messages/by-id/CALDaNm0g86wD2=bQdFOy0smsP0MZWyz0CUqXej=Qi-hCEeqkag@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v50-0001-Add-error-handling-while-acquiring-a-replication.patchapplication/octet-stream; name=v50-0001-Add-error-handling-while-acquiring-a-replication.patchDownload
From 6df6c33d07ec27a8d633e17c004cf2d6bb61c55b Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:13:26 +0530
Subject: [PATCH v50 1/2] Add error handling while acquiring a replication slot
In ReplicationSlotAcquire(), raise an error for invalid slots if caller
specify error_if_invalid=true.
---
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 48 +++++++++++++++++--
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
8 files changed, 55 insertions(+), 12 deletions(-)
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index d62186a510..69a32422bf 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 6828100cf1..0cbfd4fbaf 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -535,9 +535,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot has been
+ * invalidated previously.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +618,45 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot has been
+ * previously invalidated.
+ */
+ if (error_if_invalid &&
+ s->data.invalidated != RS_INVAL_NONE)
+ {
+ StringInfoData err_detail;
+
+ initStringInfo(&err_detail);
+ appendStringInfo(&err_detail, _("This slot has been invalidated because "));
+
+ switch (s->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfo(&err_detail, _("the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfo(&err_detail, _("the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ appendStringInfo(&err_detail, _("wal_level is insufficient for slot."));
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail_internal("%s", err_detail.data));
+
+ pfree(err_detail.data);
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +827,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +854,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 488a161b3e..578cff64c8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 371eef3ddd..b36ae90b2c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 8a45b5827e..3322848e03 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 45582cf9d8..2c325fd942 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -249,7 +249,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index efb4ba3af1..333e040e7f 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This slot has been invalidated because the required WAL has been removed",
$logstart))
{
$failed = 1;
--
2.34.1
v50-0002-Introduce-inactive_timeout-based-replication-slo.patchapplication/octet-stream; name=v50-0002-Introduce-inactive_timeout-based-replication-slo.patchDownload
From 8871cb932a047f04ca4a6c38d24dacd77b6fef6c Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:54:50 +0530
Subject: [PATCH v50 2/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named replication_slot_inactive_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are inactive for longer than this amount of
time.
Note that the inactive timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 35 ++++
doc/src/sgml/system-views.sgml | 12 +-
src/backend/replication/logical/slotsync.c | 13 +-
src/backend/replication/slot.c | 160 +++++++++++++--
src/backend/utils/misc/guc_tables.c | 12 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 22 ++
src/include/utils/guc_hooks.h | 2 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 190 ++++++++++++++++++
10 files changed, 421 insertions(+), 27 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a84e60c09b..25ca232a02 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4585,6 +4585,41 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is default) disables
+ the inactive timeout invalidation mechanism. This parameter can only
+ be set in the <filename>postgresql.conf</filename> file or on the
+ server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to inactive timeout occurs during checkpoint.
+ The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 61d28e701f..7eec92eac9 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2567,8 +2567,9 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time since the slot has become inactive.
- <literal>NULL</literal> if the slot is currently being used.
- Note that for slots on the standby that are being synced from a
+ <literal>NULL</literal> if the slot is currently being used. Once the
+ slot is invalidated, this value will remain unchanged until we shutdown
+ the server. Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the
<structfield>inactive_since</structfield> indicates the last
@@ -2618,6 +2619,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has remained
+ inactive beyond the duration specified by the
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 69a32422bf..4dc2811f10 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,13 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 0cbfd4fbaf..d65b8a6376 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout_sec = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -644,6 +646,11 @@ retry:
appendStringInfo(&err_detail, _("wal_level is insufficient for slot."));
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfo(&err_detail, _("inactivity exceeded the time limit set by \"%s\"."),
+ "replication_slot_inactive_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -745,16 +752,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1550,7 +1553,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1580,6 +1584,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s, exceeding the time limit set by \"%s\"."),
+ timestamptz_to_str(inactive_since),
+ "replication_slot_inactive_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1596,6 +1609,30 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Is inactive timeout invalidation possible for this replication slot?
+ *
+ * Inactive timeout invalidation is allowed only when:
+ *
+ * 1. Inactive timeout is set
+ * 2. Slot is inactive
+ * 3. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the inactive timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+IsSlotInactiveTimeoutPossible(ReplicationSlot *s)
+{
+ return (replication_slot_inactive_timeout_sec > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1623,6 +1660,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1630,6 +1668,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1640,6 +1679,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT &&
+ IsSlotInactiveTimeoutPossible(s))
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1693,6 +1742,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (now &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout_sec * 1000))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1718,11 +1781,13 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s &&
+ active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1777,7 +1842,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1823,7 +1889,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1846,6 +1913,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1898,7 +1966,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1956,6 +2025,38 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do inactive_timeout invalidation
+ * of thousands of replication slots here. If it is ever proven that
+ * this assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2250,6 +2351,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2410,6 +2512,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2440,9 +2545,11 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
@@ -2845,3 +2952,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for replication_slot_inactive_timeout
+ *
+ * We don't allow the value of replication_slot_inactive_timeout other than 0
+ * during the binary upgrade.
+ */
+bool
+check_replication_slot_inactive_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "replication_slot_inactive_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8a67f01200..f5a6bf4c61 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3028,6 +3028,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time a replication slot can remain inactive before "
+ "it will be invalidated."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout_sec,
+ 0, 0, INT_MAX,
+ check_replication_slot_inactive_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 39a3ac2312..7c6ae1baa2 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -336,6 +336,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 2c325fd942..98f858659c 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -224,6 +226,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -233,6 +254,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int replication_slot_inactive_timeout_sec;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 5813dba0a2..e36b6cfe21 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_replication_slot_inactive_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..6a10cd0a54
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,190 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to inactive timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+primary_conninfo = '$connstr dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+my $logstart = -s $standby1->logfile;
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $inactive_timeout_1s = 1;
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout_1s}s';
+]);
+$standby1->reload;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time
+sleep($inactive_timeout_1s + 1);
+
+# Despite inactive timeout being set, the synced slot won't get invalidated on
+# its own on the standby.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+$logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout_1s}s';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# inactive timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $inactive_timeout_1s);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to inactive timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'inactive_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $inactive_timeout_1s);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become inactive and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout_1s) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $inactive_timeout_1s);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout_1s) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time to avoid multiple checkpoints
+ sleep($inactive_timeout_1s + 1);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'inactive_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
On Thu, 21 Nov 2024 at 17:35, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Wed, Nov 20, 2024 at 1:29 PM vignesh C <vignesh21@gmail.com> wrote:
On Tue, 19 Nov 2024 at 12:43, Nisha Moond <nisha.moond412@gmail.com> wrote:
Attached is the v49 patch set:
- Fixed the bug reported in [1].
- Addressed comments in [2] and [3].
I've split the patch into two, implementing the suggested idea in
comment #5 of [2] separately in 001:
Patch-001: Adds additional error reports (for all invalidation types)
in ReplicationSlotAcquire() for invalid slots when error_if_invalid =
true.
Patch-002: The original patch with comments addressed.
This Assert can fail:
Attached v50 patch-set addressing review comments in [1] and [2].
We are setting inactive_since when the replication slot is released.
We are marking the slot as inactive only if it has been released.
However, there's a scenario where the network connection between the
publisher and subscriber is lost while the replication slot is still
held (not released): no changes are replicated because of the network
problem, so the slot would see no updates for a period exceeding
replication_slot_inactive_timeout.
Should we invalidate these replication slots as well, or is it
intentionally left out?
Regards,
Vignesh
Hi Nisha,
Here are my review comments for the patch v50-0001.
======
Commit message
1.
In ReplicationSlotAcquire(), raise an error for invalid slots if caller
specify error_if_invalid=true.
/caller/the caller/
/specify/specifies/
======
src/backend/replication/slot.c
ReplicationSlotAcquire:
2.
+ *
+ * An error is raised if error_if_invalid is true and the slot has been
+ * invalidated previously.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
The "has been invalidated previously." sounds a bit tricky. Do you just mean:
"An error is raised if error_if_invalid is true and the slot is found
to be invalid."
~
3.
+ /*
+ * An error is raised if error_if_invalid is true and the slot has been
+ * previously invalidated.
+ */
(ditto previous comment)
~
4.
+ appendStringInfo(&err_detail, _("This slot has been invalidated because "));
+
+ switch (s->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfo(&err_detail, _("the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfo(&err_detail, _("the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ appendStringInfo(&err_detail, _("wal_level is insufficient for slot."));
+ break;
4a.
I suspect that building the errdetail in 2 parts like this will be
troublesome for the translators of some languages. Probably it is
safer to have the entire errdetail for each case.
~
4b.
By convention, I think the GUC "wal_level" should be double-quoted in
the message.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Hi Nisha,
Here are some review comments for the patch v50-0002.
======
src/backend/replication/slot.c
InvalidatePossiblyObsoleteSlot:
1.
+ if (now &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout_sec * 1000))
Previously this was using an additional call to SlotInactiveTimeoutCheckAllowed:
+ if (SlotInactiveTimeoutCheckAllowed(s) &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
Is it OK to skip that call? e.g. can the slot fields possibly change
between assigning the 'now' and acquiring the mutex? If not, then the
current code is fine. The only reason for asking is because it is
slightly suspicious that it was not done this "easy" way in the first
place.
~~~
check_replication_slot_inactive_timeout:
2.
+/*
+ * GUC check_hook for replication_slot_inactive_timeout
+ *
+ * We don't allow the value of replication_slot_inactive_timeout other than 0
+ * during the binary upgrade.
+ */
The "We don't allow..." sentence seems like a backward way of saying:
The value of replication_slot_inactive_timeout must be set to 0 during
the binary upgrade.
======
src/test/recovery/t/050_invalidate_slots.pl
3.
+# Despite inactive timeout being set, the synced slot won't get invalidated on
+# its own on the standby.
What does "on its own" mean here? Do you mean it won't get invalidated
unless the invalidation state is propagated from the primary? Maybe
the comment can be clearer.
~
4.
+# Wait for slot to first become inactive and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout_1s) = @_;
+ my $node_name = $node->name;
+
It was OK to change the variable name to 'inactive_timeout_1s' outside
of here, but within the subroutine, I don't think it is appropriate
because this is a parameter that potentially could have any value.
~
5.
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout_1s) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
It was OK to change the variable name to 'inactive_timeout_1s' outside
of here, but within the subroutine, I don't think it is appropriate
because this is a parameter that potentially could have any value.
~
6.
+ # Give enough time to avoid multiple checkpoints
+ sleep($inactive_timeout_1s + 1);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
Since you are not doing multiple checkpoints anymore, it looks like
that "Give enough time..." comment needs updating.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Nov 27, 2024 at 8:39 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha,
Here are some review comments for the patch v50-0002.
======
src/backend/replication/slot.c
InvalidatePossiblyObsoleteSlot:
1.
+ if (now &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout_sec * 1000))
Previously this was using an additional call to SlotInactiveTimeoutCheckAllowed:
+ if (SlotInactiveTimeoutCheckAllowed(s) &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout * 1000))
Is it OK to skip that call? e.g. can the slot fields possibly change
between assigning the 'now' and acquiring the mutex? If not, then the
current code is fine. The only reason for asking is because it is
slightly suspicious that it was not done this "easy" way in the first
place.
Good catch! Although the mutex was acquired right after the 'now'
assignment, there was still a small window in which another process
could modify the slot in the meantime. So, I reverted that change in
v51. To avoid calling SlotInactiveTimeoutCheckAllowed() needlessly, it
is sufficient to perform that check here, at the point of the actual
invalidation check, rather than at the time of the 'now' assignment.
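To make the ordering concrete, here is a rough sketch of the v51 shape
(condensed and paraphrased, not the literal hunk; the explicit spinlock
calls merely stand in for the locked section of
InvalidatePossiblyObsoleteSlot): only GetCurrentTimestamp(), which reads
no slot state, is hoisted out of the lock, while every slot-dependent
condition is evaluated at the locked check itself.

/*
 * Rough sketch of the v51 ordering (illustrative only).  The timestamp is
 * taken before the spinlock because GetCurrentTimestamp() is a system call
 * and consults no slot fields; the slot-dependent conditions are checked
 * only while the slot's mutex is held, so a concurrent acquire or sync
 * cannot slip in between the check and the decision to invalidate.
 */
if (cause == RS_INVAL_INACTIVE_TIMEOUT)
    now = GetCurrentTimestamp();        /* no slot fields consulted here */

SpinLockAcquire(&s->mutex);

if (IsSlotInactiveTimeoutPossible(s) &&         /* checked under the lock */
    TimestampDifferenceExceeds(s->inactive_since, now,
                               replication_slot_inactive_timeout_sec * 1000))
{
    invalidation_cause = cause;
    inactive_since = s->inactive_since;
}

SpinLockRelease(&s->mutex);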
Attached v51 patch-set addressing all comments in [1] and [2].
[1]: /messages/by-id/CAHut+PtuiQj1hwm=73xJ8hWuw-9cXbN4dHJHpM6EXxubDJgmFA@mail.gmail.com
[2]: /messages/by-id/CAHut+Pvi-g+9+hjmjg44OzTN9L3YGQiCXBDAVaTVWvSn5SSwmw@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v51-0001-Add-error-handling-while-acquiring-a-replication.patch (application/octet-stream)
From 2319f1bf761fe2dea1aa84068f400b64f045e315 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:13:26 +0530
Subject: [PATCH v51 1/2] Add error handling while acquiring a replication slot
In ReplicationSlotAcquire(), raise an error for invalid slots if the
caller specifies error_if_invalid=true.
---
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 48 +++++++++++++++++--
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
8 files changed, 55 insertions(+), 12 deletions(-)
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f4f80b2312..83d6e3811e 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 6828100cf1..8479d03c63 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -535,9 +535,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +618,45 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
+ */
+ if (error_if_invalid &&
+ s->data.invalidated != RS_INVAL_NONE)
+ {
+ StringInfoData err_detail;
+
+ initStringInfo(&err_detail);
+
+ switch (s->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfo(&err_detail, _("This slot has been invalidated because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfo(&err_detail, _("This slot has been invalidated because the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ appendStringInfo(&err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
+ "wal_level");
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail_internal("%s", err_detail.data));
+
+ pfree(err_detail.data);
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +827,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +854,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 488a161b3e..578cff64c8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 371eef3ddd..b36ae90b2c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 8a45b5827e..3322848e03 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, false);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index d2cf786fd5..f5f2d22163 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -253,7 +253,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index efb4ba3af1..333e040e7f 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This slot has been invalidated because the required WAL has been removed",
$logstart))
{
$failed = 1;
--
2.34.1
v51-0002-Introduce-inactive_timeout-based-replication-slo.patch (application/octet-stream)
From 548b813ed7e2e7933e3f10a3938bb21d8a57a866 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Wed, 27 Nov 2024 10:52:14 +0530
Subject: [PATCH v51 2/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named replication_slot_inactive_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are inactive for longer than this amount of
time.
Note that the inactive timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 35 ++++
doc/src/sgml/system-views.sgml | 10 +-
src/backend/replication/logical/slotsync.c | 13 +-
src/backend/replication/slot.c | 159 +++++++++++++--
src/backend/utils/misc/guc_tables.c | 12 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/replication/slot.h | 22 ++
src/include/utils/guc_hooks.h | 2 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 190 ++++++++++++++++++
10 files changed, 419 insertions(+), 26 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 76ab72db96..b8d094447b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4585,6 +4585,41 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is default) disables
+ the inactive timeout invalidation mechanism. This parameter can only
+ be set in the <filename>postgresql.conf</filename> file or on the
+ server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to inactive timeout occurs during checkpoint.
+ The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index a586156614..b90c163eb2 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. Once the slot is invalidated, this
+ value will remain unchanged until we shutdown the server.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has remained
+ inactive beyond the duration specified by the
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 83d6e3811e..9777c6a9cc 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,13 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 8479d03c63..1e82b7fab6 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout_sec = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -644,6 +646,11 @@ retry:
"wal_level");
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ appendStringInfo(&err_detail, _("inactivity exceeded the time limit set by \"%s\"."),
+ "replication_slot_inactive_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -745,16 +752,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1550,7 +1553,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1580,6 +1584,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s, exceeding the time limit set by \"%s\"."),
+ timestamptz_to_str(inactive_since),
+ "replication_slot_inactive_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1596,6 +1609,30 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Is inactive timeout invalidation possible for this replication slot?
+ *
+ * Inactive timeout invalidation is allowed only when:
+ *
+ * 1. Inactive timeout is set
+ * 2. Slot is inactive
+ * 3. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the inactive timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+IsSlotInactiveTimeoutPossible(ReplicationSlot *s)
+{
+ return (replication_slot_inactive_timeout_sec > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1623,6 +1660,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1630,6 +1668,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1640,6 +1679,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1693,6 +1741,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (IsSlotInactiveTimeoutPossible(s) &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout_sec * 1000))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1718,11 +1780,13 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s &&
+ active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1777,7 +1841,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1823,7 +1888,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1846,6 +1912,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1898,7 +1965,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1956,6 +2024,38 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do inactive_timeout invalidation
+ * of thousands of replication slots here. If it is ever proven that
+ * this assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2250,6 +2350,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2410,6 +2511,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2440,9 +2544,11 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
@@ -2845,3 +2951,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for replication_slot_inactive_timeout
+ *
+ * The replication_slot_inactive_timeout must be disabled (set to 0)
+ * during the binary upgrade.
+ */
+bool
+check_replication_slot_inactive_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "replication_slot_inactive_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 9845abd693..264ebd59ad 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3038,6 +3038,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time a replication slot can remain inactive before "
+ "it will be invalidated."),
+ NULL,
+ GUC_UNIT_S
+ },
+ &replication_slot_inactive_timeout_sec,
+ 0, 0, INT_MAX,
+ check_replication_slot_inactive_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 407cd1e08c..18596686da 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -336,6 +336,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index f5f2d22163..7682f610ea 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int replication_slot_inactive_timeout_sec;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 5813dba0a2..e36b6cfe21 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_replication_slot_inactive_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..861509f050
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,190 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to inactive timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+primary_conninfo = '$connstr dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+my $logstart = -s $standby1->logfile;
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $inactive_timeout_1s = 1;
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout_1s}s';
+]);
+$standby1->reload;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep($inactive_timeout_1s + 1);
+
+# On standby, synced slots are not invalidated by the inactive timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+$logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout_1s}s';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# inactive timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $inactive_timeout_1s);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to inactive timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'inactive_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $inactive_timeout_1s);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become inactive and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $inactive_timeout);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ sleep($inactive_timeout + 1);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'inactive_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
Hi Nisha, here are my review comments for the patch v51-0001.
======
src/backend/replication/slot.c
ReplicationSlotAcquire:
1.
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail_internal("%s", err_detail.data));
+
+ pfree(err_detail.data);
+ }
+
Won't the 'pfree' be unreachable due to the prior ereport ERROR?
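(For context, a minimal annotated sketch of the point - the comments here are mine, not from the patch:)
```
ereport(ERROR,
        errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
        errmsg("can no longer get changes from replication slot \"%s\"",
               NameStr(s->data.name)),
        errdetail_internal("%s", err_detail.data));

/* Never reached: ereport(ERROR) exits via a non-local jump, and the
 * StringInfo's memory is reclaimed when the error's memory context is
 * reset, so an explicit pfree() here would be redundant anyway. */
pfree(err_detail.data);
```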
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Hi Nisha. Here are some review comments for patch v51-0002.
======
doc/src/sgml/system-views.sgml
1.
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. Once the slot is invalidated, this
+ value will remain unchanged until we shutdown the server.
.
I think "Once the ..." kind of makes it sound like invalidation is
inevitable. Also maybe it's better to remove the "we".
SUGGESTION:
If the slot becomes invalidated, this value will remain unchanged
until server shutdown.
======
src/backend/replication/slot.c
ReplicationSlotAcquire:
2.
GENERAL.
This is just a question/idea. It may not be feasible to change. It
seems like there is a lot of overlap between the error messages in
'ReplicationSlotAcquire' which are saying "This slot has been
invalidated because...", and with the other function
'ReportSlotInvalidation' which is kind of the same but called in
different circumstances and with slightly different message text. I
wondered if there is a way to use common code to unify these messages
instead of having a nearly duplicate set of messages for all the
invalidation causes?
~~~
3.
+ case RS_INVAL_INACTIVE_TIMEOUT:
+     appendStringInfo(&err_detail, _("inactivity exceeded the time limit set by \"%s\"."),
+                      "replication_slot_inactive_timeout");
+     break;
Should this err_detail also say "This slot has been invalidated
because ..." like all the others?
~~~
InvalidatePossiblyObsoleteSlot:
4.
+ case RS_INVAL_INACTIVE_TIMEOUT:
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (IsSlotInactiveTimeoutPossible(s) &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout_sec * 1000))
+ {
Maybe this code should have Assert(now > 0); before the condition, just
as a way to 'document' that 'now' is assumed to have been set already,
outside the mutex.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Dear Nisha,
Attached v51 patch-set addressing all comments in [1] and [2].
Thanks for working on the feature! I've started to review the patch.
Here are my comments - sorry if some of these have already been discussed.
The thread is too long to follow correctly.
Comments for 0001
=============
01. binary_upgrade_logical_slot_has_caught_up
ISTM that error_if_invalid is set to true when the slot can be moved forward, otherwise
it is set to false. Regarding binary_upgrade_logical_slot_has_caught_up, however,
only valid slots will be passed to the function (see pg_upgrade/info.c), so I feel
it is OK to set it to true. Thoughts?
02. ReplicationSlotAcquire
In other functions, we add a note for the translator when parameters
represent common nouns such as GUC names. I feel we should add such a
comment for the RS_INVAL_WAL_LEVEL part as well.
Comments for 0002
=============
03. check_replication_slot_inactive_timeout
Can we overwrite replication_slot_inactive_timeout to zero when pg_upgrade (and also
pg_createsubscriber?) starts a server process? Several parameters are already
specified via the -c option at that time. This would avoid an error during the upgrade.
Note that this check hook is still needed even if you accept the comment, because
users can manually boot the server in upgrade mode.
04. ReplicationSlotAcquire
Same comment as 02.
05. ReportSlotInvalidation
Same comment as 02.
06. found bug
While testing the patch, I found that slots can be invalidated too early when
the GUC is quite large. I think this is because an overflow occurs in InvalidatePossiblyObsoleteSlot().
- Reproducer
I set the replication_slot_inactive_timeout to INT_MAX and executed below commands,
and found that the slot is invalidated.
```
postgres=# SHOW replication_slot_inactive_timeout;
replication_slot_inactive_timeout
-----------------------------------
2147483647s
(1 row)
postgres=# SELECT * FROM pg_create_logical_replication_slot('test', 'test_decoding');
slot_name | lsn
-----------+-----------
test | 0/18B7F38
(1 row)
postgres=# CHECKPOINT ;
CHECKPOINT
postgres=# SELECT slot_name, inactive_since, invalidation_reason FROM pg_replication_slots ;
slot_name | inactive_since | invalidation_reason
-----------+-------------------------------+---------------------
test | 2024-11-28 07:50:25.927594+00 | inactive_timeout
(1 row)
```
- analysis
In InvalidatePossiblyObsoleteSlot(), replication_slot_inactive_timeout_sec * 1000
is passed as the third argument of TimestampDifferenceExceeds(), which is also of
integer type. This causes an overflow, and the parameter is effectively treated as
a small value.
- solution
I think there are two possible solutions. You can choose one of them:
a. Cap the maximum at INT_MAX/1000, or
b. Change the unit to milliseconds.
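For illustration, here is a tiny standalone C sketch (not PostgreSQL code; the
variable names are mine) of why the multiplication misbehaves in 32-bit int
arithmetic, and how widening it (or switching to milliseconds) would avoid that:
```
#include <limits.h>
#include <stdio.h>

int
main(void)
{
    /* Mirrors the reproducer: replication_slot_inactive_timeout = INT_MAX seconds */
    int         timeout_sec = INT_MAX;

    /* 32-bit multiplication overflows (undefined behavior; in practice it
     * typically wraps to -1000), so the "timeout" looks tiny or negative. */
    int         msec_narrow = timeout_sec * 1000;

    /* Widening before multiplying keeps the intended value. */
    long long   msec_wide = (long long) timeout_sec * 1000;

    printf("narrow: %d\n", msec_narrow);
    printf("wide:   %lld\n", msec_wide);
    return 0;
}
```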
Best regards,
Hayato Kuroda
FUJITSU LIMITED
On Fri, 22 Nov 2024 at 17:43, vignesh C <vignesh21@gmail.com> wrote:
On Thu, 21 Nov 2024 at 17:35, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Wed, Nov 20, 2024 at 1:29 PM vignesh C <vignesh21@gmail.com> wrote:
On Tue, 19 Nov 2024 at 12:43, Nisha Moond <nisha.moond412@gmail.com> wrote:
Attached is the v49 patch set:
- Fixed the bug reported in [1].
- Addressed comments in [2] and [3].

I've split the patch into two, implementing the suggested idea in
comment #5 of [2] separately in 001:

Patch-001: Adds additional error reports (for all invalidation types)
in ReplicationSlotAcquire() for invalid slots when error_if_invalid =
true.
Patch-002: The original patch with comments addressed.

This Assert can fail:

Attached v50 patch-set addressing review comments in [1] and [2].
We are setting inactive_since when the replication slot is released.
We are marking the slot as inactive only if it has been released.
However, there's a scenario where the network connection between the
publisher and subscriber is lost, in which case the replication slot is not
released, but no changes are replicated due to the network problem. In
this case, no updates would occur in the replication slot for a period
exceeding the replication_slot_inactive_timeout.
Should we invalidate these replication slots as well, or is it
intentionally left out?
On further thinking, I felt we can keep the current implementation as
is and simply add a brief comment in the code to address this.
Additionally, we can mention it in the commit message for clarity.
Regards,
Vignesh
On Wed, 27 Nov 2024 at 16:25, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Wed, Nov 27, 2024 at 8:39 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha,
Here are some review comments for the patch v50-0002.
======
src/backend/replication/slot.c

InvalidatePossiblyObsoleteSlot:
1.
+ if (now &&
+     TimestampDifferenceExceeds(s->inactive_since, now,
+                                replication_slot_inactive_timeout_sec * 1000))

Previously this was using an additional call to SlotInactiveTimeoutCheckAllowed:

+ if (SlotInactiveTimeoutCheckAllowed(s) &&
+     TimestampDifferenceExceeds(s->inactive_since, now,
+                                replication_slot_inactive_timeout * 1000))

Is it OK to skip that call? e.g. can the slot fields possibly change
between assigning 'now' and acquiring the mutex? If not, then the
current code is fine. The only reason for asking is that it is
slightly suspicious that it was not done this "easy" way in the first
place.

Good catch! While the mutex was being acquired right after the 'now'
assignment, there was a rare chance of another process modifying the
slot in the meantime. So, I reverted the change in v51. To optimize
the SlotInactiveTimeoutCheckAllowed() call, it's sufficient to check
it here instead of during the 'now' assignment.

Attached v51 patch-set addressing all comments in [1] and [2].
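(For illustration only: a simplified sketch of the resulting shape of the
check, with the surrounding loop and declarations elided; now, s, and
invalidation_cause are from the patch context, not defined here.)
```
/* Take the timestamp before the spinlock (no system call while holding it),
 * but evaluate the per-slot conditions only while the slot's mutex is held. */
now = GetCurrentTimestamp();

SpinLockAcquire(&s->mutex);
if (IsSlotInactiveTimeoutPossible(s) &&
    TimestampDifferenceExceeds(s->inactive_since, now,
                               replication_slot_inactive_timeout_ms))
    invalidation_cause = RS_INVAL_INACTIVE_TIMEOUT;
SpinLockRelease(&s->mutex);
```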
Few comments:
1) replication_slot_inactive_timeout could be mentioned in the logical
replication configuration documentation [1]; we could add something like:
Logical replication slot is also affected by replication_slot_inactive_timeout
2.a) Is this change applicable only to the inactive timeout, or does it also
apply to the others like wal removed, wal level, etc.? If it is
applicable to all of them, we could move this to the first patch and
update the commit message:
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
  */
- if (active_pid == 0)
+ if (active_pid == 0 ||
+     (MyReplicationSlot == s &&
+      active_pid == MyProcPid))
2.b) Also, this MyReplicationSlot and active_pid check can be on the same line:
+ (MyReplicationSlot == s &&
+ active_pid == MyProcPid))
3) The error detail should start in upper case here, similar to how the others are done:
+ case RS_INVAL_INACTIVE_TIMEOUT:
+     appendStringInfo(&err_detail, _("inactivity exceeded the time limit set by \"%s\"."),
+                      "replication_slot_inactive_timeout");
+     break;
4) Since this change is not related to this patch, we can move this to
the first patch and update the commit message:
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data,
size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting
standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
5) Since this change is not related to this patch, we can move this to
the first patch.
@@ -2250,6 +2350,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2410,6 +2511,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be
\"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2440,9 +2544,11 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive
after loading the
* slot from the disk into memory. Whoever acquires
the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will
not set the time
+ * for invalid slots.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
[1]: https://www.postgresql.org/docs/current/logical-replication-config.html
Regards,
Vignesh
On Thu, Nov 28, 2024 at 2:44 PM vignesh C <vignesh21@gmail.com> wrote:
We are setting inactive_since when the replication slot is released.
We are marking the slot as inactive only if it has been released.
However, there's a scenario where the network connection between the
publisher and subscriber is lost, in which case the replication slot is not
released, but no changes are replicated due to the network problem. In
this case, no updates would occur in the replication slot for a period
exceeding the replication_slot_inactive_timeout.
Should we invalidate these replication slots as well, or is it
intentionally left out?

On further thinking, I felt we can keep the current implementation as
is and simply add a brief comment in the code to address this.
Additionally, we can mention it in the commit message for clarity.
Thank you for the clarification. I’ve included the explanatory comment
in patch-002.
Attached the v52 patch-set addressing the above as well as all other
comments till now in [1], [2], [3], and [4].
[1]: /messages/by-id/CAHut+Pto1Yz9Fqp07LLP9uvx3sRHe5SOUKuFM1sUF9QA5aLfBA@mail.gmail.com
[2]: /messages/by-id/CAHut+Ps=H6EBO1ssGfykrJfUQQGh76L0eKuU5XkR9GMs96ZT3g@mail.gmail.com
[3]: /messages/by-id/TYAPR01MB56927564EEE26E5433198405F5292@TYAPR01MB5692.jpnprd01.prod.outlook.com
[4]: /messages/by-id/CALDaNm1F2YrswzM_WM37BYmiZ9Cf60UD_mgtm8HnMHRGA7tx4g@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v52-0001-Enhance-replication-slot-error-handling-slot-inv.patch
From 65e449aae21e77180935bf372ece0ff991f67388 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:13:26 +0530
Subject: [PATCH v52 1/2] Enhance replication slot error handling, slot
invalidation, and inactive_since setting logic
In ReplicationSlotAcquire(), raise an error for invalid slots if the
caller specifies error_if_invalid=true.
Add check if slot is already acquired, then mark it invalidate directly.
Ensure same inactive_since time for all slots in update_synced_slots_inactive_since()
and RestoreSlotFromDisk().
---
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 9 ++-
src/backend/replication/slot.c | 62 ++++++++++++++++---
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
8 files changed, 68 insertions(+), 18 deletions(-)
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f4f80b2312..0bbd702d32 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 887e38d56e..c6b15bf601 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -535,9 +535,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +618,44 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
+ */
+ if (error_if_invalid &&
+ s->data.invalidated != RS_INVAL_NONE)
+ {
+ StringInfoData err_detail;
+
+ initStringInfo(&err_detail);
+
+ switch (s->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfo(&err_detail, _("This slot has been invalidated because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfo(&err_detail, _("This slot has been invalidated because the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
+ "wal_level");
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail_internal("%s", err_detail.data));
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +826,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +853,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1676,11 +1717,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s && active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -2208,6 +2250,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2411,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2446,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 488a161b3e..578cff64c8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 371eef3ddd..b36ae90b2c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 8a45b5827e..e8bc986c07 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, true);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index d2cf786fd5..f5f2d22163 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -253,7 +253,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index efb4ba3af1..333e040e7f 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This slot has been invalidated because the required WAL has been removed",
$logstart))
{
$failed = 1;
--
2.34.1
v52-0002-Introduce-inactive_timeout-based-replication-slo.patch
From df97e1a5dc72adff0fd0933a16b1736353d8d288 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Fri, 29 Nov 2024 12:03:41 +0530
Subject: [PATCH v52 2/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named replication_slot_inactive_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are inactive for longer than this amount of
time.
Note that the inactive timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 35 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 10 +-
src/backend/replication/logical/slotsync.c | 8 +-
src/backend/replication/slot.c | 153 ++++++++++++--
src/backend/utils/misc/guc_tables.c | 12 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 ++
src/include/utils/guc_hooks.h | 2 +
src/test/recovery/meson.build | 1 +
src/test/recovery/t/050_invalidate_slots.pl | 190 ++++++++++++++++++
13 files changed, 430 insertions(+), 20 deletions(-)
create mode 100644 src/test/recovery/t/050_invalidate_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 76ab72db96..b8d094447b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4585,6 +4585,41 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time. If this value is specified without units,
+ it is taken as seconds. A value of zero (which is default) disables
+ the inactive timeout invalidation mechanism. This parameter can only
+ be set in the <filename>postgresql.conf</filename> file or on the
+ server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to inactive timeout occurs during checkpoint.
+ The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 8290cd1a08..8a92a422ef 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2163,6 +2163,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slot is also affected by
+ <link linkend="guc-replication-slot-inactive-timeout"><varname>replication_slot_inactive_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index a586156614..9e22064de8 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has remained
+ inactive beyond the duration specified by the
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 0bbd702d32..9777c6a9cc 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1540,13 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index c6b15bf601..e99d8037f9 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -140,6 +141,7 @@ ReplicationSlot *MyReplicationSlot = NULL;
/* GUC variables */
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+int replication_slot_inactive_timeout_ms = 0;
/*
* This GUC lists streaming replication standby server slot names that
@@ -645,6 +647,12 @@ retry:
"wal_level");
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("This slot has been invalidated because inactivity exceeded the time limit set by \"%s\"."),
+ "replication_slot_inactive_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -744,16 +752,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1549,7 +1553,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1579,6 +1584,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s, exceeding the time limit set by \"%s\"."),
+ timestamptz_to_str(inactive_since),
+ "replication_slot_inactive_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1595,6 +1610,30 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Is inactive timeout invalidation possible for this replication slot?
+ *
+ * Inactive timeout invalidation is allowed only when:
+ *
+ * 1. Inactive timeout is set
+ * 2. Slot is inactive
+ * 3. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the inactive timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+IsSlotInactiveTimeoutPossible(ReplicationSlot *s)
+{
+ return (replication_slot_inactive_timeout_ms > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1622,6 +1661,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1629,6 +1669,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1639,6 +1680,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1692,6 +1742,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (IsSlotInactiveTimeoutPossible(s) &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout_ms))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1777,7 +1842,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1823,7 +1889,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1846,6 +1913,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1898,7 +1966,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1956,6 +2025,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do inactive_timeout invalidation
+ * of thousands of replication slots here. If it is ever proven that
+ * this assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'inactive_timeout' occurs only for
+ * released slots, based on 'replication_slot_inactive_timeout'.
+ * Active slots in use for replication are excluded, preventing
+ * accidental invalidation. Slots where communication between the
+ * publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2444,7 +2552,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = now;
@@ -2849,3 +2959,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for replication_slot_inactive_timeout
+ *
+ * The replication_slot_inactive_timeout must be disabled (set to 0)
+ * during the binary upgrade.
+ */
+bool
+check_replication_slot_inactive_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "replication_slot_inactive_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 9845abd693..40dbf16c04 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3038,6 +3038,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time a replication slot can remain inactive before "
+ "it will be invalidated."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &replication_slot_inactive_timeout_ms,
+ 0, 0, INT_MAX,
+ check_replication_slot_inactive_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 407cd1e08c..18596686da 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -336,6 +336,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in seconds; 0 disables
# - Primary Server -
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index e96370a9ec..7a5567ceef 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c replication_slot_inactive_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index 91bcb4dbc7..203ab89706 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use replication_slot_inactive_timeout=0 to prevent slot invalidation
+ * due to inactive_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c replication_slot_inactive_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index f5f2d22163..1a79671bb4 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int replication_slot_inactive_timeout_ms;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 5813dba0a2..e36b6cfe21 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_replication_slot_inactive_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/050_invalidate_slots.pl b/src/test/recovery/t/050_invalidate_slots.pl
new file mode 100644
index 0000000000..861509f050
--- /dev/null
+++ b/src/test/recovery/t/050_invalidate_slots.pl
@@ -0,0 +1,190 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+use Time::HiRes qw(usleep);
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to inactive timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+primary_conninfo = '$connstr dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+my $logstart = -s $standby1->logfile;
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $inactive_timeout_1s = 1;
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout_1s}s';
+]);
+$standby1->reload;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep($inactive_timeout_1s + 1);
+
+# On standby, synced slots are not invalidated by the inactive timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+$logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout_1s}s';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# inactive timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $inactive_timeout_1s);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to inactive timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'inactive_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $inactive_timeout_1s);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become inactive and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $inactive_timeout);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ sleep($inactive_timeout + 1);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'inactive_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
On Tue, Nov 19, 2024 at 12:47 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
On Thu, Nov 14, 2024 at 5:29 AM Peter Smith <smithpb2250@gmail.com> wrote:
12.
  /*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
  */
- if (active_pid == 0)
+ if (active_pid == 0 ||
+     (MyReplicationSlot == s &&
+      active_pid == MyProcPid))

I wasn't sure how this change belongs to this patch, because the logic
of the previous review comment said, for the case of invalidation due
to inactivity, that active_pid must be 0, e.g. Assert(s->active_pid == 0);

I don't fully understand the purpose of this change yet. I'll look
into it further and get back.
This change applies to all types of invalidation, not just the
inactive_timeout case, so I moved the change to patch-001. It's a
general optimization for the case when the current process is the
active PID for the slot.
Also, the Assert(s->active_pid == 0); has been removed (in v50) as it
was unnecessary.
--
Thanks,
Nisha
On Thu, Nov 28, 2024 at 1:29 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Nisha,
Attached v51 patch-set addressing all comments in [1] and [2].
Thanks for working on the feature! I've started to review the patch.
Here are my comments - sorry if some of these have already been discussed.
The thread is too long to follow correctly.

Comments for 0001
=============
01. binary_upgrade_logical_slot_has_caught_up
ISTM that error_if_invalid is set to true when the slot can be moved forward, otherwise
it is set to false. Regarding binary_upgrade_logical_slot_has_caught_up, however,
only valid slots will be passed to the function (see pg_upgrade/info.c), so I feel
it is OK to set it to true. Thoughts?
Right, corrected the call with error_if_invalid as true.
Comments for 0002
=============
03. check_replication_slot_inactive_timeout
Can we overwrite replication_slot_inactive_timeout to zero when pg_upgrade (and also
pg_createsubscriber?) starts a server process? Several parameters are already
specified via the -c option at that time. This would avoid an error during the upgrade.
Note that this check hook is still needed even if you accept the comment, because
users can manually boot the server in upgrade mode.
Done.
06. found bug
While testing the patch, I found that slots can be invalidated too early when
the GUC is quite large. I think this is because an overflow occurs in InvalidatePossiblyObsoleteSlot().

- Reproducer
I set the replication_slot_inactive_timeout to INT_MAX and executed the below commands,
and found that the slot is invalidated.
```
postgres=# SHOW replication_slot_inactive_timeout;
replication_slot_inactive_timeout
-----------------------------------
2147483647s
(1 row)
postgres=# SELECT * FROM pg_create_logical_replication_slot('test', 'test_decoding');
slot_name | lsn
-----------+-----------
test | 0/18B7F38
(1 row)
postgres=# CHECKPOINT ;
CHECKPOINT
postgres=# SELECT slot_name, inactive_since, invalidation_reason FROM pg_replication_slots ;
slot_name | inactive_since | invalidation_reason
-----------+-------------------------------+---------------------
test | 2024-11-28 07:50:25.927594+00 | inactive_timeout
(1 row)
```

- analysis
In InvalidatePossiblyObsoleteSlot(), replication_slot_inactive_timeout_sec * 1000
is passed as the third argument of TimestampDifferenceExceeds(), which is also of
integer type. This causes an overflow, and the parameter is effectively treated as
a small value.

- solution
I think there are two possible solutions. You can choose one of them:
a. Cap the maximum at INT_MAX/1000, or
b. Change the unit to milliseconds.
Fixed. It is reasonable to align with other timeout parameters by
using milliseconds as the unit.
--
Thanks,
Nisha
On Thu, Nov 28, 2024 at 5:20 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha. Here are some review comments for patch v51-0002.
======
src/backend/replication/slot.c

ReplicationSlotAcquire:
2.
GENERAL.
This is just a question/idea. It may not be feasible to change. It
seems like there is a lot of overlap between the error messages in
'ReplicationSlotAcquire' which are saying "This slot has been
invalidated because...", and with the other function
'ReportSlotInvalidation' which is kind of the same but called in
different circumstances and with slightly different message text. I
wondered if there is a way to use common code to unify these messages
instead of having a nearly duplicate set of messages for all the
invalidation causes?
The error handling could be moved to a new function; however, as you
pointed out, the contexts in which these functions are called differ.
IMO, a single error message may not suit both cases. For example,
ReportSlotInvalidation provides additional details and a hint in its
message, which isn’t necessary for ReplicationSlotAcquire.
Thoughts?
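For reference, if we ever did go the common-code route, one possible shape would be a small
helper that only builds the cause-specific errdetail text and leaves the errmsg and any hint
to the caller. This is only a sketch; the function name is invented here, and the strings are
the ones the patch already uses:
```
/*
 * Hypothetical helper (not part of the patch): append the errdetail text
 * for a given invalidation cause to *buf. ReplicationSlotAcquire() and
 * ReportSlotInvalidation() could both call this and still build their own
 * errmsg/errhint, so the two call sites aren't forced into identical messages.
 */
static void
AppendSlotInvalidationCauseDetail(StringInfo buf,
								  ReplicationSlotInvalidationCause cause)
{
	switch (cause)
	{
		case RS_INVAL_WAL_REMOVED:
			appendStringInfoString(buf, _("This slot has been invalidated because the required WAL has been removed."));
			break;

		case RS_INVAL_HORIZON:
			appendStringInfoString(buf, _("This slot has been invalidated because the required rows have been removed."));
			break;

		case RS_INVAL_WAL_LEVEL:
			/* translator: %s is a GUC variable name */
			appendStringInfo(buf, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
							 "wal_level");
			break;

		case RS_INVAL_NONE:
			pg_unreachable();
	}
}
```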
--
Thanks,
Nisha
Hi Nisha, here are a couple of review comments for patch v52-0001.
======
Commit Message
Add check if slot is already acquired, then mark it invalidate directly.
~
/slot/the slot/
"mark it invalidate" ?
Maybe you meant:
"then invalidate it directly", or
"then mark it 'invalidated' directly", or
etc.
======
src/backend/replication/logical/slotsync.c
1.
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data,
size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
Something is broken with these changes.
AFAICT, the result after applying patch 0001 still has code:
/* Use the same inactive_since time for all the slots. */
if (now == 0)
now = GetCurrentTimestamp();
So the end result has multiple/competing assignments to variable 'now'.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Dear Nisha,
Thanks for updating the patch!
Fixed. It is reasonable to align with other timeout parameters by
using milliseconds as the unit.
It looks like you just replaced the unit with GUC_UNIT_MS, but the documentation and
postgresql.conf.sample have not been updated yet. They should follow the code.
Anyway, here are other comments, mostly cosmetic.
01. slot.c
```
+int replication_slot_inactive_timeout_ms = 0;
```
In line with the other declarations, we should add a short comment for the GUC.
02. 050_invalidate_slots.pl
Is there a reason why you used the number 050? I feel it could be 043.
03. 050_invalidate_slots.pl
Also, I'm not sure the file name is appropriate. This file only tests slot invalidation due to
replication_slot_inactive_timeout, so the current name feels too general.
04. 050_invalidate_slots.pl
```
+use Time::HiRes qw(usleep);
```
This line is not needed because usleep() is not used in this file.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
On Tue, Dec 3, 2024 at 1:09 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Nisha,
Thanks for updating the patch!
Fixed. It is reasonable to align with other timeout parameters by
using milliseconds as the unit.
It looks you just replaced to GUC_UNIT_MS, but the documentation and
postgresql.conf.sample has not been changed yet. They should follow codes.
Anyway, here are other comments, mostly cosmetic.
Here is the v53 patch set addressing all the comments in [1] and [2].
[1]: /messages/by-id/CAHut+PsQM79f34LLBGq4UeRuZ1URWP6JNZtdN2khYPrLc1YqrQ@mail.gmail.com
[2]: /messages/by-id/TYAPR01MB5692B7687EE7981AA91BA5B9F5362@TYAPR01MB5692.jpnprd01.prod.outlook.com
--
Thanks,
Nisha
Attachments:
v53-0001-Enhance-replication-slot-error-handling-slot-inv.patchapplication/x-patch; name=v53-0001-Enhance-replication-slot-error-handling-slot-inv.patchDownload
From b7e55950ae7577dfc8ae8893157d6b6124fdba01 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:13:26 +0530
Subject: [PATCH v53 1/2] Enhance replication slot error handling, slot
invalidation, and inactive_since setting logic
In ReplicationSlotAcquire(), raise an error for invalid slots if the
caller specifies error_if_invalid=true.
Add check if the slot is already acquired, then mark it invalidated directly.
Ensure same inactive_since time for all slots in update_synced_slots_inactive_since()
and RestoreSlotFromDisk().
---
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 13 ++--
src/backend/replication/slot.c | 62 ++++++++++++++++---
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
8 files changed, 68 insertions(+), 22 deletions(-)
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f4f80b2312..e3645aea53 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,10 +1540,6 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
SpinLockAcquire(&s->mutex);
s->inactive_since = now;
SpinLockRelease(&s->mutex);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 4a206f9527..7e84e46fef 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -535,9 +535,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +618,44 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
+ */
+ if (error_if_invalid &&
+ s->data.invalidated != RS_INVAL_NONE)
+ {
+ StringInfoData err_detail;
+
+ initStringInfo(&err_detail);
+
+ switch (s->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfo(&err_detail, _("This slot has been invalidated because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfo(&err_detail, _("This slot has been invalidated because the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
+ "wal_level");
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail_internal("%s", err_detail.data));
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +826,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +853,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1676,11 +1717,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s && active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -2208,6 +2250,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2411,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2446,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 488a161b3e..578cff64c8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 371eef3ddd..b36ae90b2c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 8a45b5827e..e8bc986c07 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, true);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index d2cf786fd5..f5f2d22163 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -253,7 +253,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index efb4ba3af1..333e040e7f 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This slot has been invalidated because the required WAL has been removed",
$logstart))
{
$failed = 1;
--
2.34.1
v53-0002-Introduce-inactive_timeout-based-replication-slo.patchapplication/x-patch; name=v53-0002-Introduce-inactive_timeout-based-replication-slo.patchDownload
From e4d9d988b6b131b3ff1ce9b3d2b9b625b07bc8a7 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Wed, 4 Dec 2024 12:51:22 +0530
Subject: [PATCH v53 2/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named replication_slot_inactive_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are inactive for longer than this amount of
time.
Note that the inactive timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 35 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 10 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 155 ++++++++++++--
src/backend/utils/misc/guc_tables.c | 12 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 ++
src/include/utils/guc_hooks.h | 2 +
src/test/recovery/meson.build | 1 +
.../t/043_invalidate_inactive_slots.pl | 189 ++++++++++++++++++
13 files changed, 431 insertions(+), 16 deletions(-)
create mode 100644 src/test/recovery/t/043_invalidate_inactive_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e0c8325a39..b59275fcad 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4593,6 +4593,41 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-replication-slot-inactive-timeout" xreflabel="replication_slot_inactive_timeout">
+ <term><varname>replication_slot_inactive_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>replication_slot_inactive_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are inactive for longer than this
+ amount of time. If this value is specified without units,
+ it is taken as milliseconds. A value of zero (which is default) disables
+ the inactive timeout invalidation mechanism. This parameter can only
+ be set in the <filename>postgresql.conf</filename> file or on the
+ server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to inactive timeout occurs during checkpoint.
+ The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the inactive timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 8290cd1a08..8a92a422ef 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2163,6 +2163,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slot is also affected by
+ <link linkend="guc-replication-slot-inactive-timeout"><varname>replication_slot_inactive_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index a586156614..9e22064de8 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>inactive_timeout</literal> means that the slot has remained
+ inactive beyond the duration specified by the
+ <xref linkend="guc-replication-slot-inactive-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index e3645aea53..9777c6a9cc 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1540,9 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 7e84e46fef..4816d28b62 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_INACTIVE_TIMEOUT] = "inactive_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_INACTIVE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,9 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/* Invalidate replication slots inactive beyond this time; '0' disables it */
+int replication_slot_inactive_timeout_ms = 0;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -645,6 +649,12 @@ retry:
"wal_level");
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("This slot has been invalidated because inactivity exceeded the time limit set by \"%s\"."),
+ "replication_slot_inactive_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -744,16 +754,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1549,7 +1555,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1579,6 +1586,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s, exceeding the time limit set by \"%s\"."),
+ timestamptz_to_str(inactive_since),
+ "replication_slot_inactive_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1595,6 +1612,30 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Is inactive timeout invalidation possible for this replication slot?
+ *
+ * Inactive timeout invalidation is allowed only when:
+ *
+ * 1. Inactive timeout is set
+ * 2. Slot is inactive
+ * 3. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the inactive timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+IsSlotInactiveTimeoutPossible(ReplicationSlot *s)
+{
+ return (replication_slot_inactive_timeout_ms > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1622,6 +1663,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1629,6 +1671,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1639,6 +1682,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_INACTIVE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1692,6 +1744,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_INACTIVE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (IsSlotInactiveTimeoutPossible(s) &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout_ms))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1777,7 +1844,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1823,7 +1891,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1846,6 +1915,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_INACTIVE_TIMEOUT: inactive slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1898,7 +1968,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1956,6 +2027,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do inactive_timeout invalidation
+ * of thousands of replication slots here. If it is ever proven that
+ * this assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move inactive_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'inactive_timeout' occurs only for
+ * released slots, based on 'replication_slot_inactive_timeout'.
+ * Active slots in use for replication are excluded, preventing
+ * accidental invalidation. Slots where communication between the
+ * publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_INACTIVE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2444,7 +2554,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = now;
@@ -2839,3 +2951,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for replication_slot_inactive_timeout
+ *
+ * The replication_slot_inactive_timeout must be disabled (set to 0)
+ * during the binary upgrade.
+ */
+bool
+check_replication_slot_inactive_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "replication_slot_inactive_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8cf1afbad2..0e00c07f91 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3047,6 +3047,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"replication_slot_inactive_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time a replication slot can remain inactive before "
+ "it will be invalidated."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &replication_slot_inactive_timeout_ms,
+ 0, 0, INT_MAX,
+ check_replication_slot_inactive_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a2ac7575ca..2f0fa84e6d 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -337,6 +337,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#replication_slot_inactive_timeout = 0 # in milliseconds; 0 disables
# - Primary Server -
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index e96370a9ec..7a5567ceef 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c replication_slot_inactive_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index 91bcb4dbc7..203ab89706 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use replication_slot_inactive_timeout=0 to prevent slot invalidation
+ * due to inactive_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c replication_slot_inactive_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index f5f2d22163..1a79671bb4 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* inactive slot timeout has occurred */
+ RS_INVAL_INACTIVE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int replication_slot_inactive_timeout_ms;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 5813dba0a2..e36b6cfe21 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_replication_slot_inactive_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/043_invalidate_inactive_slots.pl b/src/test/recovery/t/043_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..6aae73a548
--- /dev/null
+++ b/src/test/recovery/t/043_invalidate_inactive_slots.pl
@@ -0,0 +1,189 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to inactive timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+primary_conninfo = '$connstr dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+my $logstart = -s $standby1->logfile;
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $inactive_timeout_1s = 1;
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout_1s}s';
+]);
+$standby1->reload;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep($inactive_timeout_1s + 1);
+
+# On standby, synced slots are not invalidated by the inactive timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+$logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET replication_slot_inactive_timeout TO '${inactive_timeout_1s}s';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# inactive timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $inactive_timeout_1s);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to inactive timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'inactive_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $inactive_timeout_1s);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become inactive and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $inactive_timeout);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $inactive_timeout) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ sleep($inactive_timeout + 1);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'inactive_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
On Wed, 4 Dec 2024 at 15:01, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Tue, Dec 3, 2024 at 1:09 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Nisha,
Thanks for updating the patch!
Fixed. It is reasonable to align with other timeout parameters by
using milliseconds as the unit.
It looks you just replaced to GUC_UNIT_MS, but the documentation and
postgresql.conf.sample has not been changed yet. They should follow codes.
Anyway, here are other comments, mostly cosmetic.
Here is v53 patch-set addressing all the comments in [1] and [2].
Currently, replication slots are invalidated based on the
replication_slot_inactive_timeout only during a checkpoint. This means
that if the checkpoint_timeout is set to a higher value than the
replication_slot_inactive_timeout, slot invalidation will occur only
when a checkpoint is triggered. Identifying slots for invalidation
might be slightly delayed in this case. As an alternative, users can
forcefully invalidate inactive slots that have exceeded the
replication_slot_inactive_timeout by forcing a checkpoint. I was
thinking we could suggest this in the documentation.
+ <para>
+ Slot invalidation due to inactive timeout occurs during checkpoint.
+ The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
We could accurately invalidate the slots using the checkpointer
process by calculating the invalidation time based on the inactive_since
timestamp and the replication_slot_inactive_timeout, and then setting the
checkpointer's main wait-latch accordingly for triggering the next
checkpoint. Ideally, a different process handling this task would be
better, but there is currently no dedicated daemon capable of
identifying and managing slots across streaming replication, logical
replication, and other slots used by plugins. Additionally,
overloading the checkpointer with this responsibility may not be
ideal. As an alternative, we could document this delay in identifying
slots and mention that invalidation can be triggered by a forced
manual checkpoint.
Regards,
Vignesh
On Wed, 4 Dec 2024 at 15:01, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Tue, Dec 3, 2024 at 1:09 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Nisha,
Thanks for updating the patch!
Fixed. It is reasonable to align with other timeout parameters by
using milliseconds as the unit.
It looks you just replaced to GUC_UNIT_MS, but the documentation and
postgresql.conf.sample has not been changed yet. They should follow codes.
Anyway, here are other comments, mostly cosmetic.
Here is v53 patch-set addressing all the comments in [1] and [2].
CFBot is failing at [1] because the file name was changed to
043_invalidate_inactive_slots; the meson.build file should be updated
accordingly:
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..708a2a3798 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/050_invalidate_slots.pl',
],
},
}
[1]: https://cirrus-ci.com/task/6266479424831488
Regards,
Vignesh
Hi Nisha,
Here are my review comments for the v53* patch set
//////////
Patch v53-0001.
======
src/backend/replication/slot.c
1.
+ if (error_if_invalid &&
+ s->data.invalidated != RS_INVAL_NONE)
Looks like some unnecessary wrapping here. I think this condition can
be on one line.
//////////
Patch v53-0002.
======
GENERAL - How about using the term "idle"?
1.
I got to wondering why this new GUC was called
"replication_slot_inactive_timeout", with invalidation_reason =
"inactive_timeout". When I look at similar GUCs I don't see words like
"inactivity" or "inactive" anywhere; Instead, they are using the term
"idle" to refer to when something is inactive:
e.g.
#idle_in_transaction_session_timeout = 0 # in milliseconds, 0 is disabled
#idle_session_timeout = 0 # in milliseconds, 0 is disabled
I know the "inactive" term is used a bit in the slot code but that is
(mostly) not exposed to the user. Therefore, I am beginning to feel it
would be better (e.g. more consistent) to use "idle" for the
user-facing stuff. e.g.
New Slot GUC = "idle_replication_slot_timeout"
Slot invalidation_reason = "idle_timeout"
Of course, changing this will cascade to impact quite a lot of other
things in the patch -- comments, error messages, some function names
etc.
======
doc/src/sgml/logical-replication.sgml
2.
+ <para>
+ Logical replication slot is also affected by
+ <link linkend="guc-replication-slot-inactive-timeout"><varname>replication_slot_inactive_timeout</varname></link>.
+ </para>
+
/Logical replication slot is also affected by/Logical replication
slots are also affected by/
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Dec 4, 2024 at 9:27 PM vignesh C <vignesh21@gmail.com> wrote:
...
Currently, replication slots are invalidated based on the
replication_slot_inactive_timeout only during a checkpoint. This means
that if the checkpoint_timeout is set to a higher value than the
replication_slot_inactive_timeout, slot invalidation will occur only
when the checkpoint is triggered. Identifying the invalidation slots
might be slightly delayed in this case. As an alternative, users can
forcefully invalidate inactive slots that have exceeded the
replication_slot_inactive_timeout by forcing a checkpoint. I was
thinking we could suggest this in the documentation.
+ <para>
+ Slot invalidation due to inactive timeout occurs during checkpoint.
+ The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
We could accurately invalidate the slots using the checkpointer
process by calculating the invalidation time based on the active_since
timestamp and the replication_slot_inactive_timeout, and then set the
checkpointer's main wait-latch accordingly for triggering the next
checkpoint. Ideally, a different process handling this task would be
better, but there is currently no dedicated daemon capable of
identifying and managing slots across streaming replication, logical
replication, and other slots used by plugins. Additionally,
overloading the checkpointer with this responsibility may not be
ideal. As an alternative, we could document about this delay in
identifying and mention that it could be triggered by forceful manual
checkpoint.
Hi Vignesh.
I felt that manipulating the checkpoint timing behind the scenes
without the user's consent might be a bit of an overreach.
But there might still be something else we could do:
1. We can add the documentation note like you suggested ("we could
document about this delay in identifying and mention that it could be
triggered by forceful manual checkpoint").
2. We can also detect such delays in the code. When the invalidation
occurs (e.g. code fragment below) we could check if there was some
excessive lag between the slot becoming idle and it being invalidated.
If the lag is too much (whatever "too much" means) we can log a hint
for the user to increase the checkpoint frequency (or whatever else we
might advise them to do).
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (IsSlotInactiveTimeoutPossible(s) &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout_ms))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
pseudo-code:
if (slot invalidation occurred much later after the
replication_slot_inactive_timeout GUC elapsed)
{
elog(LOG, "This slot was inactive for a period of %s. Slot timeout
invalidation only occurs at a checkpoint so if you want inactive slots
to be invalidated in a more timely manner consider reducing the time
between checkpoints or executing a manual checkpoint.
(replication_slot_inactive_timeout = %s; checkpoint_timeout = %s,
....)"
}
+ }
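Fleshed out a little, that pseudo-code might look roughly like the sketch below. To be clear,
this is only an illustration: the "twice the timeout" threshold and the message wording are
invented here, and locking details are ignored.
```
/*
 * Hypothetical follow-up inside InvalidatePossiblyObsoleteSlot(), after the
 * inactive_timeout check above has fired. "Too much" lag is arbitrarily
 * taken as twice the configured timeout; locking is ignored for brevity.
 */
if (invalidation_cause == RS_INVAL_INACTIVE_TIMEOUT)
{
	long		secs;
	int			usecs;

	TimestampDifference(s->inactive_since, now, &secs, &usecs);

	if ((int64) secs * 1000 > (int64) 2 * replication_slot_inactive_timeout_ms)
		elog(LOG, "replication slot \"%s\" was idle for %ld seconds before being invalidated; "
			 "timeout invalidation only happens at checkpoints, so consider reducing "
			 "checkpoint_timeout or running a manual CHECKPOINT for more timely invalidation",
			 NameStr(s->data.name), secs);
}
```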
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thu, 5 Dec 2024 at 06:44, Peter Smith <smithpb2250@gmail.com> wrote:
On Wed, Dec 4, 2024 at 9:27 PM vignesh C <vignesh21@gmail.com> wrote:
...
Currently, replication slots are invalidated based on the
replication_slot_inactive_timeout only during a checkpoint. This means
that if the checkpoint_timeout is set to a higher value than the
replication_slot_inactive_timeout, slot invalidation will occur only
when the checkpoint is triggered. Identifying the invalidation slots
might be slightly delayed in this case. As an alternative, users can
forcefully invalidate inactive slots that have exceeded the
replication_slot_inactive_timeout by forcing a checkpoint. I was
thinking we could suggest this in the documentation.
+ <para>
+ Slot invalidation due to inactive timeout occurs during checkpoint.
+ The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
We could accurately invalidate the slots using the checkpointer
process by calculating the invalidation time based on the active_since
timestamp and the replication_slot_inactive_timeout, and then set the
checkpointer's main wait-latch accordingly for triggering the next
checkpoint. Ideally, a different process handling this task would be
better, but there is currently no dedicated daemon capable of
identifying and managing slots across streaming replication, logical
replication, and other slots used by plugins. Additionally,
overloading the checkpointer with this responsibility may not be
ideal. As an alternative, we could document about this delay in
identifying and mention that it could be triggered by forceful manual
checkpoint.Hi Vignesh.
I felt that manipulating the checkpoint timing behind the scenes
without the user's consent might be a bit of an overreach.
Agree
But there might still be something else we could do:
1. We can add the documentation note like you suggested ("we could
document about this delay in identifying and mention that it could be
triggered by forceful manual checkpoint").
Yes, that makes sense
2. We can also detect such delays in the code. When the invalidation
occurs (e.g. code fragment below) we could check if there was some
excessive lag between the slot becoming idle and it being invalidated.
If the lag is too much (whatever "too much" means) we can log a hint
for the user to increase the checkpoint frequency (or whatever else we
might advise them to do).
+ /*
+ * Check if the slot needs to be invalidated due to
+ * replication_slot_inactive_timeout GUC.
+ */
+ if (IsSlotInactiveTimeoutPossible(s) &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ replication_slot_inactive_timeout_ms))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
pseudo-code:
if (slot invalidation occurred much later after the
replication_slot_inactive_timeout GUC elapsed)
{
elog(LOG, "This slot was inactive for a period of %s. Slot timeout
invalidation only occurs at a checkpoint so if you want inactive slots
to be invalidated in a more timely manner consider reducing the time
between checkpoints or executing a manual checkpoint.
(replication_slot_inactive_timeout = %s; checkpoint_timeout = %s,
....)"
}
+ }
Determining the correct time may be challenging for users, as it
depends on when the inactive_since value is set, as well as when
checkpoint_timeout elapses and the subsequent checkpoint is triggered.
Even if the user sets it to an appropriate value, there is still a
possibility of delayed identification due to the timing of when the
slot's inactive_since is set. Including this information in the
documentation should be sufficient.
Regards,
Vignesh
On Fri, Dec 6, 2024 at 11:04 AM vignesh C <vignesh21@gmail.com> wrote:
Determining the correct time may be challenging for users, as it
depends on when the inactive_since value is set, as well as when
checkpoint_timeout elapses and the subsequent checkpoint is triggered.
Even if the user sets it to an appropriate value, there is still a
possibility of delayed identification due to the timing of when the
slot's inactive_since is set. Including this information in the
documentation should be sufficient.
+1
v54 documents this information as suggested.
Attached is the v54 patch set addressing all the comments so far in [1], [2] and [3].
[1]: /messages/by-id/CALDaNm0mTWwg0z4v-sorq08S2CdZmL2s+rh4nHpWeJaBQ2F+mg@mail.gmail.com
[2]: /messages/by-id/CALDaNm1STyk=S_EAihWP9SowBkS5dJ32JfEqmG5tTeC2Ct39yg@mail.gmail.com
[3]: /messages/by-id/CAHut+PtHbYNxPvtMfs7jARbsVcFXL1=C9SO3Q93NgVDgbKN7LQ@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v54-0001-Enhance-replication-slot-error-handling-slot-inv.patchapplication/x-patch; name=v54-0001-Enhance-replication-slot-error-handling-slot-inv.patchDownload
From 713871d8cda02f2b70c63983fc49dede3097f016 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:13:26 +0530
Subject: [PATCH v54 1/2] Enhance replication slot error handling, slot
invalidation, and inactive_since setting logic
In ReplicationSlotAcquire(), raise an error for invalid slots if the
caller specifies error_if_invalid=true.
Add check if the slot is already acquired, then mark it invalidated directly.
Ensure same inactive_since time for all slots in update_synced_slots_inactive_since()
and RestoreSlotFromDisk().
---
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 13 ++--
src/backend/replication/slot.c | 61 ++++++++++++++++---
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
8 files changed, 67 insertions(+), 22 deletions(-)
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f4f80b2312..e3645aea53 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,10 +1540,6 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
SpinLockAcquire(&s->mutex);
s->inactive_since = now;
SpinLockRelease(&s->mutex);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 4a206f9527..db94cec5c3 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -535,9 +535,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +618,43 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
+ */
+ if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE)
+ {
+ StringInfoData err_detail;
+
+ initStringInfo(&err_detail);
+
+ switch (s->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfo(&err_detail, _("This slot has been invalidated because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfo(&err_detail, _("This slot has been invalidated because the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
+ "wal_level");
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail_internal("%s", err_detail.data));
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +825,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +852,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1676,11 +1716,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s && active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -2208,6 +2249,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2410,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2445,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 488a161b3e..578cff64c8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 371eef3ddd..b36ae90b2c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 8a45b5827e..e8bc986c07 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, true);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index d2cf786fd5..f5f2d22163 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -253,7 +253,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index efb4ba3af1..333e040e7f 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This slot has been invalidated because the required WAL has been removed",
$logstart))
{
$failed = 1;
--
2.34.1
v54-0002-Introduce-inactive_timeout-based-replication-slo.patchapplication/x-patch; name=v54-0002-Introduce-inactive_timeout-based-replication-slo.patchDownload
From 7fc20252d8828083613e7948b9d5a48349af0b26 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Wed, 4 Dec 2024 12:51:22 +0530
Subject: [PATCH v54 2/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 39 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 10 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 155 +++++++++++++--
src/backend/utils/misc/guc_tables.c | 12 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 ++
src/include/utils/guc_hooks.h | 2 +
src/test/recovery/meson.build | 1 +
.../t/043_invalidate_inactive_slots.pl | 188 ++++++++++++++++++
13 files changed, 434 insertions(+), 16 deletions(-)
create mode 100644 src/test/recovery/t/043_invalidate_inactive_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e0c8325a39..a888c709fd 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4593,6 +4593,45 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are idle for longer than this
+ amount of time. If this value is specified without units,
+ it is taken as milliseconds. A value of zero (which is default) disables
+ the idle timeout invalidation mechanism. This parameter can only
+ be set in the <filename>postgresql.conf</filename> file or on the
+ server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ If the <varname>checkpoint_timeout</varname> exceeds
+ <varname>idle_replication_slot_timeout</varname>, the slot
+ invalidation will be delayed until the next checkpoint is triggered.
+ To avoid delays, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 8290cd1a08..158ec18211 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2163,6 +2163,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index a586156614..611db4e539 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle beyond the duration specified by the
+ <xref linkend="guc-idle-replication-slot-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index e3645aea53..9777c6a9cc 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1540,9 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index db94cec5c3..41ef8ecc89 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,9 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/* Invalidate replication slots idle beyond this time; '0' disables it */
+int idle_replication_slot_timeout_ms = 0;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -644,6 +648,12 @@ retry:
"wal_level");
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("This slot has been invalidated because inactivity exceeded the time limit set by \"%s\"."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -743,16 +753,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1548,7 +1554,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1578,6 +1585,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s, exceeding the time limit set by \"%s\"."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1594,6 +1611,30 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Is idle timeout invalidation possible for this replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot is inactive
+ * 3. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+IsSlotIdleTimeoutPossible(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_ms > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1621,6 +1662,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1628,6 +1670,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1638,6 +1681,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1691,6 +1743,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (IsSlotIdleTimeoutPossible(s) &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ idle_replication_slot_timeout_ms))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1776,7 +1843,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1822,7 +1890,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1845,6 +1914,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_IDLE_TIMEOUT: idle slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1897,7 +1967,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1955,6 +2026,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' occurs only for
+ * released slots, based on 'idle_replication_slot_timeout'. Active
+ * slots in use for replication are excluded, preventing accidental
+ * invalidation. Slots where communication between the publisher and
+ * subscriber is down are also excluded, as they are managed by the
+ * 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2443,7 +2553,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = now;
@@ -2838,3 +2950,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The idle_replication_slot_timeout must be disabled (set to 0)
+ * during the binary upgrade.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8cf1afbad2..6a4f15b832 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3047,6 +3047,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the amount of time a replication slot can remain idle before "
+ "it will be invalidated."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &idle_replication_slot_timeout_ms,
+ 0, 0, INT_MAX,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a2ac7575ca..6b5c246e8d 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -337,6 +337,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 0 # in milliseconds; 0 disables
# - Primary Server -
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index e96370a9ec..a9ac00f44f 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index 91bcb4dbc7..90e2b9f188 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * inactive_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index f5f2d22163..98625f9d13 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_ms;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 5813dba0a2..d7a7dffab5 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..3b8f45c93e 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/043_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/043_invalidate_inactive_slots.pl b/src/test/recovery/t/043_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..a947bdf2e5
--- /dev/null
+++ b/src/test/recovery/t/043_invalidate_inactive_slots.pl
@@ -0,0 +1,188 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to idle timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+autovacuum = off
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+primary_conninfo = '$connstr dbname=postgres'
+));
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+my $logstart = -s $standby1->logfile;
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $idle_timeout_1s = 1;
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1s}s';
+]);
+$standby1->reload;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep($idle_timeout_1s + 1);
+
+# On standby, synced slots are not invalidated by the idle timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+$logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1s}s';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $idle_timeout_1s);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to idle timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'idle_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart, $idle_timeout_1s);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $idle_timeout);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ sleep($idle_timeout + 1);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
Hi Nisha.
Here are some review comments for patch v54-0002.
(I had also checked patch v54-0001, but have no further review
comments for that one).
======
doc/src/sgml/config.sgml
1.
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ If the <varname>checkpoint_timeout</varname> exceeds
+ <varname>idle_replication_slot_timeout</varname>, the slot
+ invalidation will be delayed until the next checkpoint is triggered.
+ To avoid delays, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated
using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
The wording of "If the checkpoint_timeout exceeds
idle_replication_slot_timeout, the slot invalidation will be delayed
until the next checkpoint is triggered." seems slightly misleading,
because AFAIK it is not conditional on the GUC value differences like
that -- i.e. slot invalidation is *always* delayed until the next
checkpoint occurs.
SUGGESTION:
Slot invalidation due to idle timeout occurs during checkpoint.
Because checkpoints happen at checkpoint_timeout intervals, there can
be some lag between when the idle_replication_slot_timeout was
exceeded and when the slot invalidation is triggered at the next
checkpoint. To avoid such lags, users can force...
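To make the lag concrete, here is a rough, self-contained TAP-style
sketch of the behaviour being described (the node and slot names and
the timeout values are made up for illustration; the patch's 043 test
exercises the same flow in more detail):
use strict;
use warnings FATAL => 'all';
use PostgreSQL::Test::Cluster;
my $node = PostgreSQL::Test::Cluster->new('primary');
$node->init(allows_streaming => 'logical');
# Long checkpoint_timeout so that only an explicit CHECKPOINT can
# trigger the invalidation check.
$node->append_conf('postgresql.conf', qq{
idle_replication_slot_timeout = '1s'
checkpoint_timeout = 1h
});
$node->start;
$node->safe_psql('postgres',
	"SELECT pg_create_logical_replication_slot('idle_slot', 'test_decoding');");
sleep(2);    # the slot is now idle for longer than the timeout ...
# ... but it is only marked invalid once the next checkpoint runs
$node->safe_psql('postgres', 'CHECKPOINT');
my $reason = $node->safe_psql('postgres',
	"SELECT invalidation_reason FROM pg_replication_slots WHERE slot_name = 'idle_slot';");
print "invalidation_reason: $reason\n";    # expected: idle_timeout
$node->stop;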
=======
src/backend/replication/slot.c
2. GENERAL
+/* Invalidate replication slots idle beyond this time; '0' disables it */
+int idle_replication_slot_timeout_ms = 0;
I noticed this patch is using a variety of ways of describing the same thing:
* guc var: Invalidate replication slots idle beyond this time...
* guc_tables: ... the amount of time a replication slot can remain
idle before it will be invalidated.
* docs: means that the slot has remained idle beyond the duration
specified by the idle_replication_slot_timeout parameter
* errmsg: ... slot has been invalidated because inactivity exceeded
the time limit set by ...
* etc..
They are all the same, but they are all worded slightly differently:
* "idle" vs "inactivity" vs ...
* "time" vs "amount of time" vs "duration" vs "time limit" vs ...
There may not be a one-size-fits-all wording, but still, it might be
better to search for all the different phrasings and use common
wording as much as possible.
~~~
CheckPointReplicationSlots:
3.
+ * XXX: Slot invalidation due to 'idle_timeout' occurs only for
+ * released slots, based on 'idle_replication_slot_timeout'. Active
+ * slots in use for replication are excluded, preventing accidental
+ * invalidation. Slots where communication between the publisher and
+ * subscriber is down are also excluded, as they are managed by the
+ * 'wal_sender_timeout'.
Maybe a slight rewording like below is better. Maybe not. YMMV.
SUGGESTION:
XXX: Slot invalidation due to 'idle_timeout' applies only to released
slots, and is based on the 'idle_replication_slot_timeout' GUC. Active
slots currently in use for replication are excluded to prevent
accidental invalidation. Slots...
======
src/bin/pg_upgrade/server.c
4.
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * inactive_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/inactive_timeout/idle_timeout/
======
src/test/recovery/t/043_invalidate_inactive_slots.pl
5.
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout) = @_;
+ my $node_name = $node->name;
AFAICT this 'idle_timeout' parameter is passed in units of seconds, so
it would be better to call it something like 'idle_timeout_s' to make
the units clear.
~~~
6.
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
Ditto above review comment #5 -- better to call it something like
'idle_timeout_s' to make the units clear.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, 10 Dec 2024 at 17:21, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Fri, Dec 6, 2024 at 11:04 AM vignesh C <vignesh21@gmail.com> wrote:
Determining the correct time may be challenging for users, as it
depends on when the active_since value is set, as well as when the
checkpoint_timeout occurs and the subsequent checkpoint is triggered.
Even if the user sets it to an appropriate value, there is still a
possibility of delayed identification due to the timing of when the
slot's active_timeout is being set. Including this information in the
documentation should be sufficient.
+1
v54 documents this information as suggested.
Attached the v54 patch-set addressing all the comments till now in [1], [2] and [3].
Few comments on the test added:
1) Can we remove this and instead set idle_replication_slot_timeout
when the standby node is created, as part of its append_conf call (see
the sketch after these comments):
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $idle_timeout_1s = 1;
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1s}s';
+]);
+$standby1->reload;
2) You can move these statements before the standby node is created:
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1',
'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name :=
'sb_slot1', immediately_reserve := true);
+]);
3) Do we need autovacuum set to off for these tests? Is there any
probability of a test failure without it? I felt it should not
impact these tests; if not, we can remove this:
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+autovacuum = off
+});
4) Generally we mention a single char in single quotes; we can update "t" to 't':
+ ),
+ "t",
+ 'logical slot sync_slot1 is synced to standby');
+
5) Similarly here too:
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'check that synced slot sync_slot1 has not been invalidated on
standby');
6) This standby offset is not used anywhere; it can be removed:
+my $logstart = -s $standby1->logfile;
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
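Regarding comment 1 above, a minimal sketch of what I had in mind,
reusing the test's existing $standby1 and $connstr variables (the
timeout value is illustrative only):
$standby1->append_conf(
	'postgresql.conf', qq(
hot_standby_feedback = on
primary_slot_name = 'sb_slot1'
primary_conninfo = '$connstr dbname=postgres'
idle_replication_slot_timeout = '1s'
));
This folds the GUC into the existing append_conf call so that the
later ALTER SYSTEM plus reload is no longer needed.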
Regards,
Vignesh
On Tue, 10 Dec 2024 at 17:21, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Fri, Dec 6, 2024 at 11:04 AM vignesh C <vignesh21@gmail.com> wrote:
Determining the correct time may be challenging for users, as it
depends on when the active_since value is set, as well as when the
checkpoint_timeout occurs and the subsequent checkpoint is triggered.
Even if the user sets it to an appropriate value, there is still a
possibility of delayed identification due to the timing of when the
slot's active_timeout is being set. Including this information in the
documentation should be sufficient.
+1
v54 documents this information as suggested.
Attached the v54 patch-set addressing all the comments till now in
[1], [2] and [3].
Now that we support idle_replication_slot_timeout in milliseconds, we
can set this value from 1s to 1ms or 10 milliseconds and change sleep
to usleep; this will bring down the test execution time significantly:
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $idle_timeout_1s = 1;
+$standby1->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1s}s';
+]);
+$standby1->reload;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ "t",
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep($idle_timeout_1s + 1);
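For example, something roughly along these lines (a sketch against the
test's existing $standby1 variable; the 10ms value is illustrative):
use Time::HiRes qw(usleep);
my $idle_timeout_ms = 10;
$standby1->safe_psql(
	'postgres', qq[
	ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_ms}ms';
]);
$standby1->reload;
# Give enough time for inactive_since to exceed the timeout
usleep(($idle_timeout_ms + 10) * 1000);    # usleep() takes microseconds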
Regards,
Vignesh
On Thu, Dec 12, 2024 at 9:42 AM vignesh C <vignesh21@gmail.com> wrote:
Now that we support idle_replication_slot_timeout in milliseconds, we
can set this value from 1s to 1ms or 10 milliseconds and change sleep
to usleep; this will bring down the test execution time significantly:
+1
v55 implements the test using idle_replication_slot_timeout=1ms,
significantly reducing the test time.
Attached the v55 patch set which addresses all the comments in [1] and [2] as well.
[1]: /messages/by-id/CAHut+Pvx294U-XBB6-BvabesUNxbnuDQmk-VOFm=pbcNWSsHvQ@mail.gmail.com
[2]: /messages/by-id/CALDaNm2wHDnboo0FCj247HiBMHAHqy0se8NTH4fDCdscxdjhcg@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v55-0001-Enhance-replication-slot-error-handling-slot-inv.patchapplication/octet-stream; name=v55-0001-Enhance-replication-slot-error-handling-slot-inv.patchDownload
From 95acd8b64bb25fc61e3fbccb7c370b3293d8fda4 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:13:26 +0530
Subject: [PATCH v55 1/2] Enhance replication slot error handling, slot
invalidation, and inactive_since setting logic
In ReplicationSlotAcquire(), raise an error for invalid slots if the
caller specifies error_if_invalid=true.
Add check if the slot is already acquired, then mark it invalidated directly.
Ensure same inactive_since time for all slots in update_synced_slots_inactive_since()
and RestoreSlotFromDisk().
---
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 13 ++--
src/backend/replication/slot.c | 61 ++++++++++++++++---
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
8 files changed, 67 insertions(+), 22 deletions(-)
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f4f80b2312..e3645aea53 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,10 +1540,6 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
SpinLockAcquire(&s->mutex);
s->inactive_since = now;
SpinLockRelease(&s->mutex);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 4a206f9527..db94cec5c3 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -535,9 +535,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +618,43 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
+ */
+ if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE)
+ {
+ StringInfoData err_detail;
+
+ initStringInfo(&err_detail);
+
+ switch (s->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfo(&err_detail, _("This slot has been invalidated because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfo(&err_detail, _("This slot has been invalidated because the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
+ "wal_level");
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail_internal("%s", err_detail.data));
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +825,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +852,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1676,11 +1716,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s && active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -2208,6 +2249,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2410,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2445,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 488a161b3e..578cff64c8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 371eef3ddd..b36ae90b2c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 8a45b5827e..e8bc986c07 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, true);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index d2cf786fd5..f5f2d22163 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -253,7 +253,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index efb4ba3af1..333e040e7f 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This slot has been invalidated because the required WAL has been removed",
$logstart))
{
$failed = 1;
--
2.34.1
v55-0002-Introduce-inactive_timeout-based-replication-slo.patchapplication/octet-stream; name=v55-0002-Introduce-inactive_timeout-based-replication-slo.patchDownload
From 9b58e65ac1ec7d491b54e8ac5f655af95352211a Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Wed, 4 Dec 2024 12:51:22 +0530
Subject: [PATCH v55 2/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 40 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 10 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 155 +++++++++++++--
src/backend/utils/misc/guc_tables.c | 12 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 +++
src/include/utils/guc_hooks.h | 2 +
src/test/recovery/meson.build | 1 +
.../t/043_invalidate_inactive_slots.pl | 182 ++++++++++++++++++
13 files changed, 429 insertions(+), 16 deletions(-)
create mode 100644 src/test/recovery/t/043_invalidate_inactive_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e0c8325a39..2948176cbc 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4593,6 +4593,46 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that are idle for longer than this
+ amount of time. If this value is specified without units,
+ it is taken as milliseconds. A value of zero (which is default) disables
+ the idle timeout invalidation mechanism. This parameter can only
+ be set in the <filename>postgresql.conf</filename> file or on the
+ server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 8290cd1a08..158ec18211 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2163,6 +2163,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index a586156614..b83458d944 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the duration specified by the
+ <xref linkend="guc-idle-replication-slot-timeout"/> parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index e3645aea53..9777c6a9cc 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1540,9 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index db94cec5c3..82ca94d5ff 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,9 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/* Invalidate replication slots idle longer than this time; '0' disables it */
+int idle_replication_slot_timeout_ms = 0;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -644,6 +648,12 @@ retry:
"wal_level");
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("This slot has been invalidated because it has remained idle longer than the configured \"%s\" time."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -743,16 +753,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1548,7 +1554,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1578,6 +1585,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has been inactive since %s and has remained idle longer than the configured \"%s\" time."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1594,6 +1611,30 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Is idle timeout invalidation possible for this replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot is inactive
+ * 3. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+IsSlotIdleTimeoutPossible(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_ms > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1621,6 +1662,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1628,6 +1670,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1638,6 +1681,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1691,6 +1743,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (IsSlotIdleTimeoutPossible(s) &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ idle_replication_slot_timeout_ms))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1776,7 +1843,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1822,7 +1890,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1845,6 +1914,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_IDLE_TIMEOUT: idle slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1897,7 +1967,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1955,6 +2026,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2443,7 +2553,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = now;
@@ -2838,3 +2950,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The idle_replication_slot_timeout must be disabled (set to 0)
+ * during the binary upgrade.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8cf1afbad2..c69b456719 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3047,6 +3047,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the time limit for how long a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &idle_replication_slot_timeout_ms,
+ 0, 0, INT_MAX,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a2ac7575ca..6b5c246e8d 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -337,6 +337,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 0 # in milliseconds; 0 disables
# - Primary Server -
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index e96370a9ec..a9ac00f44f 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index 91bcb4dbc7..93940825d5 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index f5f2d22163..98625f9d13 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_ms;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 5813dba0a2..d7a7dffab5 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..3b8f45c93e 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/043_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/043_invalidate_inactive_slots.pl b/src/test/recovery/t/043_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..6e51024e78
--- /dev/null
+++ b/src/test/recovery/t/043_invalidate_inactive_slots.pl
@@ -0,0 +1,182 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Time::HiRes qw(usleep);
+use Test::More;
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to idle timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $idle_timeout_1ms = 1;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+usleep($idle_timeout_1ms + 10);
+
+# On standby, synced slots are not invalidated by the idle timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1ms}ms';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $idle_timeout_1ms);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to idle timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'idle_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $idle_timeout_1ms);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $idle_timeout);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ usleep($idle_timeout + 10);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
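To make the intended workflow of the patch above concrete, here is a minimal sketch (illustrative only: the two-day value is arbitrary, and the query assumes some slot has already stayed idle past the timeout; since the GUC is PGC_SIGHUP, a reload is sufficient):

ALTER SYSTEM SET idle_replication_slot_timeout = '2d';
SELECT pg_reload_conf();

-- After the next checkpoint, slots that have remained idle longer than
-- two days are invalidated and can be inspected with:
SELECT slot_name, inactive_since, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason = 'idle_timeout';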
On Wed, Dec 11, 2024 at 8:14 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha.
Here are some review comments for patch v54-0002.
======
src/test/recovery/t/043_invalidate_inactive_slots.pl

5.
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout) = @_;
+ my $node_name = $node->name;

AFAICT this 'idle_timeout' parameter is passed units of "seconds", so
it would be better to call it something like 'idle_timeout_s' to make
the units clear.
As per the suggestion in [1], the test has been updated to use
idle_timeout=1ms. Since the parameter uses the default unit of
"milliseconds," keeping it as 'idle_timeout' seems reasonable to me.
~~~
6.
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;

Ditto above review comment #5 -- better to call it something like
'idle_timeout_s' to make the units clear.
The 'idle_timeout' parameter name remains unchanged as explained above.
[1]: /messages/by-id/CALDaNm1FQS04aG0C0gCRpvi-o-OTdq91y6Az34YKN-dVc9r5Ng@mail.gmail.com
--
Thanks,
Nisha
Hi Nisha.
Thanks for the v55* patches.
I have no comments for patch v55-0001.
I have only 1 comment for patch v55-0002 regarding some remaining
nitpicks (below) about the consistency of phrases.
======
I scanned again over all the phrases for consistency:
CURRENT PATCH:
Docs (idle_replication_slot_timeout): Invalidate replication slots
that are idle for longer than this amount of time
Docs (idle_timeout): means that the slot has remained idle longer than
the duration specified by the idle_replication_slot_timeout parameter.
Code (guc var comment): Invalidate replication slots idle longer than this time
Code (guc_tables): Sets the time limit for how long a replication slot
can remain idle before it is invalidated.
Msg (errdetail): This slot has been invalidated because it has
remained idle longer than the configured \"%s\" time.
Msg (errdetail): The slot has been inactive since %s and has remained
idle longer than the configured \"%s\" time.
~
NITPICKS:
nit -- There are still some variations "amount of time" versus "time"
versus "duration". I think the term "duration" best describe the
maing so we can use that everywhere.
nit - Should consistently say "remained idle" instead of just "idle"
or "are idle",
nit - The last errdetail is also rearranged a bit because IMO we don't
need to say inactive and idle in the same sentence.
nit - Just say "longer than" instead of sometimes saying "for longer than"
~
SUGGESTIONS:
Docs (idle_replication_slot_timeout): Invalidate replication slots
that have remained idle longer than this duration.
Docs (idle_timeout): means that the slot has remained idle longer than
the configured idle_replication_slot_timeout duration.
Code (guc var comment): Invalidate replication slots that have
remained idle longer than this duration.
Code (guc_tables): Sets the duration a replication slot can remain
idle before it is invalidated.
Msg (errdetail): This slot has been invalidated because it has
remained idle longer than the configured \"%s\" duration.
Msg (errdetail): The slot has remained idle since %s, which is longer
than the configured \"%s\" duration.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Dec 16, 2024 at 9:58 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha.
Thanks for the v55* patches.
I have no comments for patch v55-0001.
I have only 1 comment for patch v55-0002 regarding some remaining
nitpicks (below) about the consistency of phrases.
======
SUGGESTIONS:
Docs (idle_replication_slot_timeout): Invalidate replication slots
that have remained idle longer than this duration.
Docs (idle_timeout): means that the slot has remained idle longer than
the configured idle_replication_slot_timeout duration.
Code (guc var comment): Invalidate replication slots that have
remained idle longer than this duration.
Code (guc_tables): Sets the duration a replication slot can remain
idle before it is invalidated.
Msg (errdetail): This slot has been invalidated because it has
remained idle longer than the configured \"%s\" duration.
Msg (errdetail): The slot has remained idle since %s, which is longer
than the configured \"%s\" duration.
Here is the v56 patch set with the above comments incorporated.
--
Thanks
Nisha
Attachments:
v56-0001-Enhance-replication-slot-error-handling-slot-inv.patch
From f898b32f068c862d78a9e05702e577e8ee7cd913 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:13:26 +0530
Subject: [PATCH v56 1/2] Enhance replication slot error handling, slot
invalidation, and inactive_since setting logic
In ReplicationSlotAcquire(), raise an error for invalid slots if the
caller specifies error_if_invalid=true.
Add check if the slot is already acquired, then mark it invalidated directly.
Ensure same inactive_since time for all slots in update_synced_slots_inactive_since()
and RestoreSlotFromDisk().
---
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 13 ++--
src/backend/replication/slot.c | 61 ++++++++++++++++---
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
8 files changed, 67 insertions(+), 22 deletions(-)
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f4f80b2312..e3645aea53 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,10 +1540,6 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
SpinLockAcquire(&s->mutex);
s->inactive_since = now;
SpinLockRelease(&s->mutex);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 4a206f9527..db94cec5c3 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -535,9 +535,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +618,43 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
+ */
+ if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE)
+ {
+ StringInfoData err_detail;
+
+ initStringInfo(&err_detail);
+
+ switch (s->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfo(&err_detail, _("This slot has been invalidated because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfo(&err_detail, _("This slot has been invalidated because the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
+ "wal_level");
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail_internal("%s", err_detail.data));
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +825,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +852,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1676,11 +1716,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s && active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -2208,6 +2249,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2410,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2445,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 488a161b3e..578cff64c8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 371eef3ddd..b36ae90b2c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 8a45b5827e..e8bc986c07 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, true);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index d2cf786fd5..f5f2d22163 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -253,7 +253,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index efb4ba3af1..333e040e7f 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This slot has been invalidated because the required WAL has been removed",
$logstart))
{
$failed = 1;
--
2.34.1
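To make the new error_if_invalid behaviour of the patch above concrete, here is a small sketch of what a caller now sees (the slot name is hypothetical, and the example assumes the slot was previously invalidated because the required WAL had been removed):

SELECT pg_replication_slot_advance('my_slot', pg_current_wal_lsn());
-- ERROR:  can no longer get changes from replication slot "my_slot"
-- DETAIL:  This slot has been invalidated because the required WAL has been removed.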
v56-0002-Introduce-inactive_timeout-based-replication-slo.patch
From f0a862dbc22199f53df8a7b31728e08c2b9757c4 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Wed, 4 Dec 2024 12:51:22 +0530
Subject: [PATCH v56 2/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 40 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 10 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 158 +++++++++++++--
src/backend/utils/misc/guc_tables.c | 12 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 +++
src/include/utils/guc_hooks.h | 2 +
src/test/recovery/meson.build | 1 +
.../t/043_invalidate_inactive_slots.pl | 182 ++++++++++++++++++
13 files changed, 432 insertions(+), 16 deletions(-)
create mode 100644 src/test/recovery/t/043_invalidate_inactive_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e0c8325a39..a9b5b24d50 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4593,6 +4593,46 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units,
+ it is taken as milliseconds. A value of zero (which is default) disables
+ the idle timeout invalidation mechanism. This parameter can only
+ be set in the <filename>postgresql.conf</filename> file or on the
+ server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 8290cd1a08..158ec18211 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2163,6 +2163,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index a586156614..199d7248ee 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index e3645aea53..9777c6a9cc 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1540,9 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index db94cec5c3..f4c9c6689a 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_ms = 0;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -644,6 +651,12 @@ retry:
"wal_level");
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("This slot has been invalidated because it has remained idle longer than the configured \"%s\" duration."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -743,16 +756,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1548,7 +1557,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1578,6 +1588,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1594,6 +1614,30 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Is idle timeout invalidation possible for this replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot is inactive
+ * 3. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+IsSlotIdleTimeoutPossible(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_ms > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1621,6 +1665,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1628,6 +1673,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1638,6 +1684,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1691,6 +1746,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (IsSlotIdleTimeoutPossible(s) &&
+ TimestampDifferenceExceeds(s->inactive_since, now,
+ idle_replication_slot_timeout_ms))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1776,7 +1846,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1822,7 +1893,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1845,6 +1917,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_IDLE_TIMEOUT: idle slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1897,7 +1970,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1955,6 +2029,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2443,7 +2556,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = now;
@@ -2838,3 +2953,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The idle_replication_slot_timeout must be disabled (set to 0)
+ * during the binary upgrade.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8cf1afbad2..27fbbc8418 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3047,6 +3047,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &idle_replication_slot_timeout_ms,
+ 0, 0, INT_MAX,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a2ac7575ca..6b5c246e8d 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -337,6 +337,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 0 # in milliseconds; 0 disables
# - Primary Server -
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index e96370a9ec..a9ac00f44f 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index 91bcb4dbc7..93940825d5 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index f5f2d22163..98625f9d13 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_ms;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 5813dba0a2..d7a7dffab5 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..3b8f45c93e 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/043_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/043_invalidate_inactive_slots.pl b/src/test/recovery/t/043_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..6e51024e78
--- /dev/null
+++ b/src/test/recovery/t/043_invalidate_inactive_slots.pl
@@ -0,0 +1,182 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Time::HiRes qw(usleep);
+use Test::More;
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to idle timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $idle_timeout_1ms = 1;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+usleep($idle_timeout_1ms + 10);
+
+# On standby, synced slots are not invalidated by the idle timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1ms}ms';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $idle_timeout_1ms);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to idle timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'idle_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $idle_timeout_1ms);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $idle_timeout);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ usleep($idle_timeout + 10);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
On Mon, Dec 16, 2024 at 9:40 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
On Mon, Dec 16, 2024 at 9:58 AM Peter Smith <smithpb2250@gmail.com> wrote:
...
SUGGESTIONS:
Docs (idle_replication_slot_timeout): Invalidate replication slots
that have remained idle longer than this duration.
Docs (idle_timeout): means that the slot has remained idle longer than
the configured idle_replication_slot_timeout duration.
Code (guc var comment): Invalidate replication slots that have
remained idle longer than this duration.
Code (guc_tables): Sets the duration a replication slot can remain
idle before it is invalidated.
Msg (errdetail): This slot has been invalidated because it has
remained idle longer than the configured \"%s\" duration.
Msg (errdetail): The slot has remained idle since %s, which is longer
than the configured \"%s\" duration.
Here is the v56 patch set with the above comments incorporated.
Hi Nisha.
Thanks for the updates.
- Both patches could be applied cleanly.
- Tests (make check, TAP subscriber, TAP recovery) are all passing.
- The rendering of the documentation changes from patch 0002 looked good.
- I have no more review comments.
So, the v56* patchset LGTM.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Dec 16, 2024 at 4:10 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
Here is the v56 patch set with the above comments incorporated.
Review comments:
===============
1.
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &idle_replication_slot_timeout_ms,
I think users are going to keep the idle slot timeout at least in hours.
So, millisecond seems the wrong choice to me. I suggest keeping the
units in minutes. I understand that writing a test would be
challenging, as spending a minute or more on one test is not advisable.
But I don't see any tests for the other GUCs that are in minutes
(wal_summary_keep_time and log_rotation_age). The default value should
be one day.
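To make the minute-based unit concrete, here is a rough sketch of how a user
might set and then observe such a timeout, assuming the
idle_replication_slot_timeout GUC and the pg_replication_slots columns
discussed in this thread (the two-day value is purely illustrative):

    -- Invalidate slots that stay idle for more than two days
    -- (the GUC is PGC_SIGHUP, so a reload is enough):
    ALTER SYSTEM SET idle_replication_slot_timeout = '2d';
    SELECT pg_reload_conf();

    -- After the next checkpoint, inspect which slots were invalidated and why:
    SELECT slot_name, active, inactive_since, invalidation_reason
    FROM pg_replication_slots;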
2.
+ /*
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
+ */
+ if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE)
+ {
+ StringInfoData err_detail;
+
+ initStringInfo(&err_detail);
+
+ switch (s->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfo(&err_detail, _("This slot has been invalidated
because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfo(&err_detail, _("This slot has been invalidated
because the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("This slot has been invalidated
because \"%s\" is insufficient for slot."),
+ "wal_level");
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail_internal("%s", err_detail.data));
+ }
+
This should be moved to a separate function.
3.
+static inline bool
+IsSlotIdleTimeoutPossible(ReplicationSlot *s)
Would it be better to name this function CanInvalidateIdleSlot()?
The current name doesn't seem to match other similar
functions.
--
With Regards,
Amit Kapila.
On Fri, Dec 20, 2024 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 16, 2024 at 4:10 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
Here is the v56 patch set with the above comments incorporated.
Review comments:
===============
1.
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &idle_replication_slot_timeout_ms,
I think users are going to keep the idle slot timeout at least in hours.
So, millisecond seems the wrong choice to me. I suggest keeping the
units in minutes. I understand that writing a test would be
challenging, as spending a minute or more on one test is not advisable.
But I don't see any tests for the other GUCs that are in minutes
(wal_summary_keep_time and log_rotation_age). The default value should
be one day.
+1
- Changed the GUC unit to "minute".
Regarding the tests, we have two potential options:
1) Introduce an additional "debug_xx" GUC parameter with units of seconds
or milliseconds, only for testing purposes.
2) Skip writing tests for this, similar to other GUCs with units in
minutes.
IMO, adding an additional GUC just for testing may not be worthwhile. It's
reasonable to proceed without the test.
Thoughts?
The attached v57 patch-set addresses all the comments. I have kept the test
case in the patch for now; it takes 2-3 minutes to complete.
--
Thanks,
Nisha
Attachments:
v57-0001-Enhance-replication-slot-error-handling-slot-inv.patch (application/octet-stream)
From c95d010d878de6a3433722269759ffc08bcd7aef Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:13:26 +0530
Subject: [PATCH v57 1/2] Enhance replication slot error handling, slot
invalidation, and inactive_since setting logic
In ReplicationSlotAcquire(), raise an error for invalid slots if the
caller specifies error_if_invalid=true.
Add check if the slot is already acquired, then mark it invalidated directly.
Ensure same inactive_since time for all slots in update_synced_slots_inactive_since()
and RestoreSlotFromDisk().
---
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 13 ++--
src/backend/replication/slot.c | 69 ++++++++++++++++---
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
8 files changed, 75 insertions(+), 22 deletions(-)
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f4f80b2312..e3645aea53 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,10 +1540,6 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
SpinLockAcquire(&s->mutex);
s->inactive_since = now;
SpinLockRelease(&s->mutex);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 4a206f9527..2a99c1f053 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -163,6 +163,7 @@ static void ReplicationSlotDropPtr(ReplicationSlot *slot);
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
+static void RaiseSlotInvalidationError(ReplicationSlot *slot);
/*
* Report shared-memory space needed by ReplicationSlotsShmemInit.
@@ -535,9 +536,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +619,13 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
+ */
+ if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE)
+ RaiseSlotInvalidationError(s);
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +796,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +823,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1676,11 +1687,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s && active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -2208,6 +2220,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2381,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2416,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
@@ -2793,3 +2809,40 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * Raise an error based on the invalidation cause of the slot.
+ */
+static void
+RaiseSlotInvalidationError(ReplicationSlot *slot)
+{
+ StringInfoData err_detail;
+
+ initStringInfo(&err_detail);
+
+ switch (slot->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfo(&err_detail, _("This slot has been invalidated because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfo(&err_detail, _("This slot has been invalidated because the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
+ "wal_level");
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(slot->data.name)),
+ errdetail_internal("%s", err_detail.data));
+}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 488a161b3e..578cff64c8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 371eef3ddd..b36ae90b2c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 8a45b5827e..e8bc986c07 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, true);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index d2cf786fd5..f5f2d22163 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -253,7 +253,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index efb4ba3af1..333e040e7f 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This slot has been invalidated because the required WAL has been removed",
$logstart))
{
$failed = 1;
--
2.34.1
v57-0002-Introduce-inactive_timeout-based-replication-slo.patch (application/octet-stream)
From 04c688ab4902bdacf4503b01800172eb9a638466 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 23 Dec 2024 14:58:34 +0530
Subject: [PATCH v57 2/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 39 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 10 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 158 +++++++++++++--
src/backend/utils/adt/timestamp.c | 19 ++
src/backend/utils/misc/guc_tables.c | 12 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 +++
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
src/test/recovery/meson.build | 1 +
.../t/043_invalidate_inactive_slots.pl | 181 ++++++++++++++++++
15 files changed, 452 insertions(+), 16 deletions(-)
create mode 100644 src/test/recovery/t/043_invalidate_inactive_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index fbdd6ce574..8953c8fd4b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4593,6 +4593,45 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero disables the idle timeout invalidation mechanism.
+ The default is 24 hours. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 8290cd1a08..158ec18211 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2163,6 +2163,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index a586156614..199d7248ee 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index e3645aea53..9777c6a9cc 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1540,9 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 2a99c1f053..601b71819d 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_min = HOURS_PER_DAY * MINS_PER_HOUR;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -714,16 +721,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1519,7 +1522,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1549,6 +1553,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1565,6 +1579,30 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Can invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot is inactive
+ * 3. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_min > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1592,6 +1630,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1599,6 +1638,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1609,6 +1649,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1662,6 +1711,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_min * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1747,7 +1811,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1793,7 +1858,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1816,6 +1882,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_IDLE_TIMEOUT: idle slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1868,7 +1935,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1926,6 +1994,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2414,7 +2521,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = now;
@@ -2836,6 +2945,12 @@ RaiseSlotInvalidationError(ReplicationSlot *slot)
"wal_level");
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("This slot has been invalidated because it has remained idle longer than the configured \"%s\" duration."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -2846,3 +2961,22 @@ RaiseSlotInvalidationError(ReplicationSlot *slot)
NameStr(slot->data.name)),
errdetail_internal("%s", err_detail.data));
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The idle_replication_slot_timeout must be disabled (set to 0)
+ * during the binary upgrade.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 18d7d8a108..e1d96b0e29 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,25 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ /* Return if the difference meets or exceeds the threshold */
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8cf1afbad2..fe13a2c646 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3047,6 +3047,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_min,
+ HOURS_PER_DAY * MINS_PER_HOUR, 0, INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a2ac7575ca..720c50e966 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -337,6 +337,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 1d # in minutes; 0 disables
# - Primary Server -
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index e96370a9ec..a9ac00f44f 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index 91bcb4dbc7..93940825d5 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index f5f2d22163..b54970f22f 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_min;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 5813dba0a2..d7a7dffab5 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index a6ce03ed46..7f3d0001ef 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -143,5 +143,8 @@ extern int date2isoyear(int year, int mon, int mday);
extern int date2isoyearday(int year, int mon, int mday);
extern bool TimestampTimestampTzRequiresRewrite(void);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
#endif /* TIMESTAMP_H */
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..3b8f45c93e 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/043_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/043_invalidate_inactive_slots.pl b/src/test/recovery/t/043_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..fd227bfc65
--- /dev/null
+++ b/src/test/recovery/t/043_invalidate_inactive_slots.pl
@@ -0,0 +1,181 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to idle timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $idle_timeout_1min = 1;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep($idle_timeout_1min * 60 + 10);
+
+# On standby, synced slots are not invalidated by the idle timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1min}min';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to idle timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'idle_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_min) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $idle_timeout_min);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_min) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ sleep($idle_timeout_min * 60 + 10);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
Hello everyone!
Yesterday I got a strange set of test errors, probably somehow related to
that patch.
It happened on a changed master branch (based
on d96d1d5152f30d15678e08e75b42756101b7cab6) but I don't think my changes
were affecting it.
My setup is a little bit tricky: Windows 11 running WSL2 with Ubuntu, meson.
So, the `recovery` suite started failing on:
1) at /src/test/recovery/t/019_replslot_limit.pl line 530.
2) at /src/test/recovery/t/040_standby_failover_slots_sync.pl line 198.
It was failing almost every run, one test or another. I was lurking around
for about 10 min, and..... it just stopped failing. And I can't reproduce
it anymore.
But I have logs of two fails. I am not sure if it is helpful, but decided
to mail them here just in case.
Best regards,
Mikhail.
On Tuesday, December 24, 2024 8:57 PM Michail Nikolaev <michail.nikolaev@gmail.com> wrote:
Hi,
Yesterday I got a strange set of test errors, probably somehow related to
that patch. It happened on a changed master branch (based on
d96d1d5152f30d15678e08e75b42756101b7cab6) but I don't think my changes were
affecting it. My setup is a little bit tricky: Windows 11 running WSL2 with
Ubuntu, meson.
So, the `recovery` suite started failing on:
1) at /src/test/recovery/t/019_replslot_limit.pl line 530.
2) at /src/test/recovery/t/040_standby_failover_slots_sync.pl line 198.
It was failing almost every run, one test or another. I was lurking around
for about 10 min, and..... it just stopped failing. And I can't reproduce it
anymore. But I have logs of two fails. I am not sure if it is helpful, but
decided to
mail them here just in case.
Thanks for reporting the issue.
After checking the log, I think the failure is caused by the unexpected
behavior of the local system clock.
It's clear from '019_replslot_limit_primary4.log' [1] that the clock went
backwards, which makes the slot's inactive_since go backwards as well. That's
why the last testcase didn't pass.
And for 040_standby_failover_slots_sync, we can see that the clock of the
standby lags behind that of the primary, which caused the inactive_since of
the newly synced slot on the standby to be earlier than the one on the primary.
So, I think it's not a bug in the committed patch but an issue in the testing
environment. Besides, since we have not seen such failures on BF, I think it
may not be necessary to improve the testcases.
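For anyone wanting to spot such clock problems directly, a minimal check of
the following sort can help (a sketch only; 'lsub4_slot' is simply the slot
that appears in the log excerpt below). It compares a slot's inactive_since
with the node's own clock:

-- run on both the primary and the standby; if the standby's clock lags the
-- primary's, a freshly synced slot can report an inactive_since newer than
-- (or implausibly close to) the standby's own now()
SELECT slot_name,
       inactive_since,
       now() AS node_clock,
       now() - inactive_since AS apparent_idle_time
FROM pg_replication_slots
WHERE slot_name = 'lsub4_slot';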
[1]:
2024-12-24 01:37:19.967 CET [161409] sub STATEMENT: START_REPLICATION SLOT "lsub4_slot" LOGICAL 0/0 (proto_version '4', streaming 'parallel', origin 'any', publication_names '"pub"')
...
2024-12-24 01:37:20.025 CET [161447] 019_replslot_limit.pl LOG: statement: SELECT '0/30003D8' <= replay_lsn AND state = 'streaming'
...
2024-12-24 01:37:19.388 CET [161097] LOG: received fast shutdown request
Best Regards,
Hou zj
Hello, Hou!
So, I think it's not a bug in the committed patch but an issue in the
testing environment. Besides, since we have not seen such failures on BF,
I think it may not be necessary to improve the testcases.
Thanks for your analysis!
Yes, the WSL2/Windows interaction probably causes strange system clock
movement. It looks like it is a common issue with WSL2 [0].
[0]: https://github.com/microsoft/WSL/issues/10006
Best regards,
Mikhail.
On Tue, 24 Dec 2024 at 17:07, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Fri, Dec 20, 2024 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 16, 2024 at 4:10 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
Here is the v56 patch set with the above comments incorporated.
Review comments:
===============
1.
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &idle_replication_slot_timeout_ms,
I think users are going to keep the idle slot timeout at least in hours.
So, millisecond seems the wrong choice to me. I suggest keeping the
units in minutes. I understand that writing a test would be
challenging, as spending a minute or more on one test is not advisable.
But I don't see any tests for the other GUCs that are in minutes
(wal_summary_keep_time and log_rotation_age). The default value should
be one day.
+1 - Changed the GUC unit to "minute".
Regarding the tests, we have two potential options:
1) Introduce an additional "debug_xx" GUC parameter with units of seconds or milliseconds, only for testing purposes.
2) Skip writing tests for this, similar to other GUCs with units in minutes.
IMO, adding an additional GUC just for testing may not be worthwhile. It's reasonable to proceed without the test.
Thoughts?
The attached v57 patch set addresses all the comments. I have kept the test case in the patch for now; it takes 2-3 minutes to complete.
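As a quick usage sketch (assuming the GUC ends up with minute units and a
one-day default, as discussed above; the particular value here is purely
illustrative), the timeout can still be given with any time unit, since it is
simply stored in minutes:

-- idle_replication_slot_timeout is PGC_SIGHUP, so a configuration reload is enough
ALTER SYSTEM SET idle_replication_slot_timeout = '2d';  -- stored internally as 2880 minutes
SELECT pg_reload_conf();
SHOW idle_replication_slot_timeout;                     -- expected to display 2d
-- setting it to 0 disables idle-timeout invalidation entirely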
Few comments:
1) We have disabled the similar configuration max_slot_wal_keep_size
by setting it to -1; since this GUC is along similar lines, should we
disable it by default and let the user configure it?
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_min =
HOURS_PER_DAY * MINS_PER_HOUR;
+
2) I felt this is existing behavior, so it can also be moved to the
0001 patch:
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index a586156614..199d7248ee 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
3) Can we change the comment below to "We don't allow the value of
idle_replication_slot_timeout other than 0 during the binary upgrade.
See start_postmaster() in pg_upgrade for more details.":
+ * The idle_replication_slot_timeout must be disabled (set to 0)
+ * during the binary upgrade.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra,
GucSource source)
Regards,
Vignesh
Hi Nisha.
Here are some review comments for patch v57-0001.
======
src/backend/replication/slot.c
1.
+
+/*
+ * Raise an error based on the invalidation cause of the slot.
+ */
+static void
+RaiseSlotInvalidationError(ReplicationSlot *slot)
+{
+ StringInfoData err_detail;
+
+ initStringInfo(&err_detail);
+
+ switch (slot->data.invalidated)
1a.
/invalidation cause of the slot./slot's invalidation cause./
~
1b.
This function does not expect to be called with slot->data.invalidated
== RS_INVAL_NONE, so I think it will be better to assert that
up-front.
~
1c.
This code could be simplified if you declare/initialize the variable
together, like:
StringInfo err_detail = makeStringInfo();
~~~
2.
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfo(&err_detail, _("This slot has been invalidated
because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfo(&err_detail, _("This slot has been invalidated
because the required rows have been removed."));
+ break;
Since there are no format strings here, appendStringInfoString can be
used directly in some places.
======
FYI. I've attached a diffs patch that implements some of the
above-suggested changes.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
PS_diffs_v570001.txttext/plain; charset=US-ASCII; name=PS_diffs_v570001.txtDownload
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 2a99c1f..71c6ae2 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -2816,23 +2816,23 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
static void
RaiseSlotInvalidationError(ReplicationSlot *slot)
{
- StringInfoData err_detail;
+ StringInfo err_detail = makeStringInfo();
- initStringInfo(&err_detail);
+ Assert(slot->data.invalidated != RS_INVAL_NONE);
switch (slot->data.invalidated)
{
case RS_INVAL_WAL_REMOVED:
- appendStringInfo(&err_detail, _("This slot has been invalidated because the required WAL has been removed."));
+ appendStringInfoString(err_detail, _("This slot has been invalidated because the required WAL has been removed."));
break;
case RS_INVAL_HORIZON:
- appendStringInfo(&err_detail, _("This slot has been invalidated because the required rows have been removed."));
+ appendStringInfoString(err_detail, _("This slot has been invalidated because the required rows have been removed."));
break;
case RS_INVAL_WAL_LEVEL:
/* translator: %s is a GUC variable name */
- appendStringInfo(&err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
+ appendStringInfo(err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
"wal_level");
break;
@@ -2844,5 +2844,5 @@ RaiseSlotInvalidationError(ReplicationSlot *slot)
errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("can no longer get changes from replication slot \"%s\"",
NameStr(slot->data.name)),
- errdetail_internal("%s", err_detail.data));
+ errdetail_internal("%s", err_detail->data));
}
Hi Nisha,
Here are some review comments for the patch v57-0001.
======
src/backend/replication/slot.c
1.
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_min = HOURS_PER_DAY * MINS_PER_HOUR;
IMO it would be better to have the suffix "_mins" instead of "_min"
here to avoid any confusion with "minimum".
~~~
2.
+/*
+ * Can invalidate an idle replication slot?
+ *
Not an English sentence.
======
src/backend/utils/adt/timestamp.c
3.
+ /* Return if the difference meets or exceeds the threshold */
+ return (secs >= threshold_sec);
That comment may not be necessary; it is saying just the same as the code.
======
src/backend/utils/misc/guc_tables.c
4.
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_min,
+ HOURS_PER_DAY * MINS_PER_HOUR, 0, INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
Maybe it's better to include a comment that says "24 hours".
(e.g. like wal_summary_keep_time does)
======
src/backend/utils/misc/postgresql.conf.sample
5.
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 1d # in minutes; 0 disables
I felt it might be better to say 24h here instead of 1d. And, that
would also be consistent with the docs, which said the default was 24
hours.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Dec 24, 2024 at 10:37 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
On Fri, Dec 20, 2024 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 16, 2024 at 4:10 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
Here is the v56 patch set with the above comments incorporated.
Review comments:
===============
1.
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &idle_replication_slot_timeout_ms,
I think users are going to keep the idle slot timeout at least in hours.
So, millisecond seems the wrong choice to me. I suggest keeping the
units in minutes. I understand that writing a test would be
challenging, as spending a minute or more on one test is not advisable.
But I don't see any tests for the other GUCs that are in minutes
(wal_summary_keep_time and log_rotation_age). The default value should
be one day.
+1 - Changed the GUC unit to "minute".
Regarding the tests, we have two potential options:
1) Introduce an additional "debug_xx" GUC parameter with units of seconds or milliseconds, only for testing purposes.
2) Skip writing tests for this, similar to other GUCs with units in minutes.
IMO, adding an additional GUC just for testing may not be worthwhile. It's reasonable to proceed without the test.
Thoughts?
The attached v57 patch set addresses all the comments. I have kept the test case in the patch for now; it takes 2-3 minutes to complete.
Hi Nisha.
I think we are often too quick to throw out perfectly good tests.
Citing that some similar GUCs don't do testing as a reason to skip
them just seems to me like an example of "two wrongs don't make a
right".
There is a third option.
Keep the tests. Because they take excessive time to run, that simply
means you should run them *conditionally* based on the PG_TEST_EXTRA
environment variable so they don't impact the normal BF execution. The
documentation [1] says this env var is for "resource intensive" tests
-- AFAIK this is exactly the scenario we find ourselves in, so is
exactly what this env var was meant for.
Search other *.pl tests for PG_TEST_EXTRA to see some examples.
======
[1]: https://www.postgresql.org/docs/17/regress-run.html
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Dec 30, 2024 at 11:05 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Tue, Dec 24, 2024 at 10:37 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
On Fri, Dec 20, 2024 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 16, 2024 at 4:10 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
Here is the v56 patch set with the above comments incorporated.
Review comments:
===============
1.
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MS
+ },
+ &idle_replication_slot_timeout_ms,
I think users are going to keep the idle slot timeout at least in hours.
So, millisecond seems the wrong choice to me. I suggest keeping the
units in minutes. I understand that writing a test would be
challenging, as spending a minute or more on one test is not advisable.
But I don't see any tests for the other GUCs that are in minutes
(wal_summary_keep_time and log_rotation_age). The default value should
be one day.
+1 - Changed the GUC unit to "minute".
Regarding the tests, we have two potential options:
1) Introduce an additional "debug_xx" GUC parameter with units of seconds or milliseconds, only for testing purposes.
2) Skip writing tests for this, similar to other GUCs with units in minutes.
IMO, adding an additional GUC just for testing may not be worthwhile. It's reasonable to proceed without the test.
Thoughts?
The attached v57 patch set addresses all the comments. I have kept the test case in the patch for now; it takes 2-3 minutes to complete.
Hi Nisha.
I think we are often too quick to throw out perfectly good tests.
Citing that some similar GUCs don't do testing as a reason to skip
them just seems to me like an example of "two wrongs don't make a
right".There is a third option.
Keep the tests. Because they take excessive time to run, that simply
means you should run them *conditionally* based on the PG_TEST_EXTRA
environment variable so they don't impact the normal BF execution. The
documentation [1] says this env var is for "resource intensive" tests
-- AFAIK this is exactly the scenario we find ourselves in, so is
exactly what this env var was meant for.
Search other *.pl tests for PG_TEST_EXTRA to see some examples.
Thank you for the suggestion! I’ve added the tests under the
PG_TEST_EXTRA condition. Now, the '043_invalidate_inactive_slots.pl'
tests will only be executed when
PG_TEST_EXTRA=idle_replication_slot_timeout is set.
Attached the v58 patch set, addressing the above and the comments in
[1], [2], and [3].
[1]: /messages/by-id/CALDaNm14QrW5j6su+EAqjwnHbiwXJwO+yk73_=7yvc5TVY-43g@mail.gmail.com
[2]: /messages/by-id/CAHut+PvDsM=+vTbM-xX6DD-PavONs2kGn03MZbCPGGL2t60TRA@mail.gmail.com
[3]: /messages/by-id/CAHut+Ps2ecNTfG3vsGb91CYpEzWtffyvkOzk1jqwhqTCwH8HQA@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v58-0001-Enhance-replication-slot-error-handling-slot-inv.patchapplication/octet-stream; name=v58-0001-Enhance-replication-slot-error-handling-slot-inv.patchDownload
From a1bd59e300181c508242062896b4d439e6b9fb2f Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:13:26 +0530
Subject: [PATCH v58 1/2] Enhance replication slot error handling, slot
invalidation, and inactive_since setting logic
In ReplicationSlotAcquire(), raise an error for invalid slots if the
caller specifies error_if_invalid=true.
Add check if the slot is already acquired, then mark it invalidated directly.
Ensure same inactive_since time for all slots in update_synced_slots_inactive_since()
and RestoreSlotFromDisk().
---
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 13 ++--
src/backend/replication/slot.c | 69 ++++++++++++++++---
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
8 files changed, 75 insertions(+), 22 deletions(-)
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index b4dd5cce75..56fc1a45a9 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f4f80b2312..e3645aea53 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,10 +1540,6 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
SpinLockAcquire(&s->mutex);
s->inactive_since = now;
SpinLockRelease(&s->mutex);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 4a206f9527..0244191653 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -163,6 +163,7 @@ static void ReplicationSlotDropPtr(ReplicationSlot *slot);
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
+static void RaiseSlotInvalidationError(ReplicationSlot *slot);
/*
* Report shared-memory space needed by ReplicationSlotsShmemInit.
@@ -535,9 +536,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +619,13 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
+ */
+ if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE)
+ RaiseSlotInvalidationError(s);
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +796,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +823,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1676,11 +1687,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s && active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -2208,6 +2220,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2381,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2416,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
@@ -2793,3 +2809,40 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * Raise an error based on the slot's invalidation cause.
+ */
+static void
+RaiseSlotInvalidationError(ReplicationSlot *slot)
+{
+ StringInfo err_detail = makeStringInfo();
+
+ Assert(slot->data.invalidated != RS_INVAL_NONE);
+
+ switch (slot->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfoString(err_detail, _("This slot has been invalidated because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfoString(err_detail, _("This slot has been invalidated because the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
+ "wal_level");
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(slot->data.name)),
+ errdetail_internal("%s", err_detail->data));
+}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 488a161b3e..578cff64c8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index dc25dd6af9..2fe2c591ab 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 8a45b5827e..e8bc986c07 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, true);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index d2cf786fd5..f5f2d22163 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -253,7 +253,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index efb4ba3af1..333e040e7f 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This slot has been invalidated because the required WAL has been removed",
$logstart))
{
$failed = 1;
--
2.34.1
v58-0002-Introduce-inactive_timeout-based-replication-slo.patchapplication/octet-stream; name=v58-0002-Introduce-inactive_timeout-based-replication-slo.patchDownload
From 1c2256d7c827b641e31a196aef3f06afb04ef655 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 30 Dec 2024 14:57:24 +0530
Subject: [PATCH v58 2/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 39 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/regress.sgml | 10 +
doc/src/sgml/system-views.sgml | 10 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 159 +++++++++++++--
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 14 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 ++
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
src/test/recovery/README | 5 +
src/test/recovery/meson.build | 1 +
.../t/043_invalidate_inactive_slots.pl | 190 ++++++++++++++++++
17 files changed, 478 insertions(+), 16 deletions(-)
create mode 100644 src/test/recovery/t/043_invalidate_inactive_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index fbdd6ce574..8953c8fd4b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4593,6 +4593,45 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero disables the idle timeout invalidation mechanism.
+ The default is 24 hours. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 8290cd1a08..158ec18211 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2163,6 +2163,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index f4cef9e80f..f2387de6b5 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -336,6 +336,16 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>idle_replication_slot_timeout</literal></term>
+ <listitem>
+ <para>
+ Runs the test <filename>src/test/recovery/t/043_invalidate_inactive_slots.pl</filename>.
+ Not enabled by default because it is time consuming.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
Tests for features that are not supported by the current build
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index a586156614..199d7248ee 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index e3645aea53..9777c6a9cc 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1540,9 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 0244191653..d343447c19 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = HOURS_PER_DAY * MINS_PER_HOUR;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -714,16 +721,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1519,7 +1522,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1549,6 +1553,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1565,6 +1579,30 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot is inactive
+ * 3. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1592,6 +1630,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1599,6 +1638,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1609,6 +1649,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1662,6 +1711,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1747,7 +1811,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1793,7 +1858,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1816,6 +1882,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_IDLE_TIMEOUT: idle slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1868,7 +1935,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1926,6 +1994,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2414,7 +2521,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = now;
@@ -2836,6 +2945,12 @@ RaiseSlotInvalidationError(ReplicationSlot *slot)
"wal_level");
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(err_detail, _("This slot has been invalidated because it has remained idle longer than the configured \"%s\" duration."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -2846,3 +2961,23 @@ RaiseSlotInvalidationError(ReplicationSlot *slot)
NameStr(slot->data.name)),
errdetail_internal("%s", err_detail->data));
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * We don't allow the value of idle_replication_slot_timeout other
+ * than 0 during the binary upgrade.
+ * See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 1b33eb6ea8..782c274944 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8cf1afbad2..902fe333e3 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3047,6 +3047,20 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ HOURS_PER_DAY * MINS_PER_HOUR, /* 24 hours */
+ 0,
+ INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a2ac7575ca..7284edfbc1 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -337,6 +337,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 24h # in minutes; 0 disables
# - Primary Server -
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index e96370a9ec..a9ac00f44f 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index 91bcb4dbc7..93940825d5 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index f5f2d22163..04d9cd73db 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 5813dba0a2..d7a7dffab5 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index a6ce03ed46..7f3d0001ef 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -143,5 +143,8 @@ extern int date2isoyear(int year, int mon, int mday);
extern int date2isoyearday(int year, int mon, int mday);
extern bool TimestampTimestampTzRequiresRewrite(void);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
#endif /* TIMESTAMP_H */
diff --git a/src/test/recovery/README b/src/test/recovery/README
index 896df0ad05..b4577a8539 100644
--- a/src/test/recovery/README
+++ b/src/test/recovery/README
@@ -30,4 +30,9 @@ PG_TEST_EXTRA=wal_consistency_checking
to the "make" command. This is resource-intensive, so it's not done
by default.
+If you want to test idle_replication_slot_timeout, add
+PG_TEST_EXTRA=idle_replication_slot_timeout
+to the "make" command. The test takes over 2 minutes, so not done
+by default.
+
See src/test/perl/README for more info about running these tests.
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index b1eb77b1ec..3b8f45c93e 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/043_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/043_invalidate_inactive_slots.pl b/src/test/recovery/t/043_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..ab02d1f68e
--- /dev/null
+++ b/src/test/recovery/t/043_invalidate_inactive_slots.pl
@@ -0,0 +1,190 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to idle timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# The test takes over two minutes to complete. Run it only if
+# idle_replication_slot_timeout is specified in PG_TEST_EXTRA.
+if ( !$ENV{PG_TEST_EXTRA}
+ || $ENV{PG_TEST_EXTRA} !~ /\bidle_replication_slot_timeout\b/)
+{
+ plan skip_all =>
+ 'A time consuming test, idle_replication_slot_timeout is not enabled in PG_TEST_EXTRA';
+}
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $idle_timeout_1min = 1;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep($idle_timeout_1min * 60 + 10);
+
+# On standby, synced slots are not invalidated by the idle timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1min}min';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to idle timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'idle_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_mins) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $idle_timeout_mins);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_mins) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ sleep($idle_timeout_mins * 60 + 10);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
On Fri, Dec 27, 2024 at 9:22 AM vignesh C <vignesh21@gmail.com> wrote:
Few comments:
1) We have disabled the similar configuration max_slot_wal_keep_size by
default by setting it to -1. Since this GUC is along similar lines,
should we disable it by default as well and let the user configure it?

+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_min = HOURS_PER_DAY * MINS_PER_HOUR;
+
I’m okay with setting the default to either '1-day' or 'Off'. Let’s
wait for feedback from others.
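For context, a minimal sketch of how a user would override whichever
default we pick (only the GUC and the 'idle_timeout' invalidation_reason
from this patch set are assumed):

-- idle_replication_slot_timeout is PGC_SIGHUP, so set it cluster-wide
ALTER SYSTEM SET idle_replication_slot_timeout = '2d';
SELECT pg_reload_conf();

-- After the next checkpoint, slots idle longer than the timeout show up as
SELECT slot_name, inactive_since, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason = 'idle_timeout';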
2) I felt this behavior is an existing behavior, so this can also be
moved to 0001 patch:

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index a586156614..199d7248ee 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
       </para>
       <para>
        The time when the slot became inactive. <literal>NULL</literal> if the
-       slot is currently being streamed.
+       slot is currently being streamed. If the slot becomes invalidated,
+       this value will remain unchanged until server shutdown.
You are correct that the 'inactive_since' value getting reset on
server restart has been the existing behavior.
However, earlier, there was no guarantee that it would remain
unchanged for invalid slots. The new function
"ReplicationSlotSetInactiveSince()" in patch 002, ensures that the
value does not change for invalidated slots until the server is shut
down. Therefore, I feel the doc addition in patch 002 is appropriate.
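To make the intended behavior concrete, here is a sketch (assuming only
the inactive_since and invalidation_reason columns from this patch set)
of how it can be observed from SQL:

SELECT slot_name, inactive_since, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason IS NOT NULL;
-- Re-running this later keeps returning the same inactive_since for the
-- invalidated slots; the value is only reset when the server restarts.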
3) Can we change the comment below to "We don't allow the value of
idle_replication_slot_timeout other than 0 during the binary upgrade.
See start_postmaster() in pg_upgrade for more details.":

+ * The idle_replication_slot_timeout must be disabled (set to 0)
+ * during the binary upgrade.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
Done.
--
Thanks,
Nisha
Hi Nisha.
My review comments for patch v58-0001.
======
src/backend/replication/slot.c
InvalidatePossiblyObsoleteSlot:
1.
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s && active_pid == MyProcPid))
{
As you previously explained [1] "This change applies to all types of
invalidation, not just inactive_timeout case [...] It's a general
optimization for the case when the current process is the active PID
for the slot."
In that case, should this be in a separate patch that can be pushed to
master by itself, i.e. independent of anything else in this thread
that is being done for the purpose of implementing the timeout
feature?
======
[1]: /messages/by-id/CABdArM5tcYTQ2zeAPWTciTnea4jj6sPUjVY9M1O-4wWoTBjFgw@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Hi Nisha,
Here are some minor review comments for patch v58-0002.
======
src/backend/replication/slot.c
check_replication_slot_inactive_timeout:
1.
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * We don't allow the value of idle_replication_slot_timeout other
+ * than 0 during the binary upgrade.
+ * See start_postmaster() in pg_upgrade for more details.
+ */
If you want to express it this way, then it seems there are some
wrong/missing words:
SUGGESTION #1.
We don't allow any value of idle_replication_slot_timeout other than 0
during a binary upgrade.
SUGGESTION #2.
We don't allow the value of idle_replication_slot_timeout to be other
than 0 during a binary upgrade.
~
(But, I prefer more terse comments which are not negative-sounding. YMMV).
SUGGESTION #3 (nearly identical text to the actual error message)
The value of idle_replication_slot_timeout must be set to 0 during a
binary upgrade.
======
src/test/recovery/README
2.
+If you want to test idle_replication_slot_timeout, add
+PG_TEST_EXTRA=idle_replication_slot_timeout
+to the "make" command. The test takes over 2 minutes, so not done
+by default.
+
Maybe it's better to use consistent wording with the other tests like
this one already in the README:
/The test/This test/
/so not done by default./so it's not done by default./
======
.../t/043_invalidate_inactive_slots.pl
3.
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
Happy New Year.
/2024/2025/
~~~
4.
+
+# The test takes over two minutes to complete. Run it only if
+# idle_replication_slot_timeout is specified in PG_TEST_EXTRA.
+if ( !$ENV{PG_TEST_EXTRA}
+ || $ENV{PG_TEST_EXTRA} !~ /\bidle_replication_slot_timeout\b/)
+{
+ plan skip_all =>
+ 'A time consuming test, idle_replication_slot_timeout is not enabled in PG_TEST_EXTRA';
+}
4a.
I noticed the other skipping TAP tests like this have a simpler
message without giving a reason, so maybe it's better to be consistent
with those:
SUGGESTION:
plan skip_all => "test idle_replication_slot_timeout not enabled in
PG_TEST_EXTRA";
~
4b.
Should the check be done right at the top of the file (e.g. even
before the "# Testcase start" comment)?
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thu, Jan 2, 2025 at 5:44 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha.
My review comments for patch v58-0001.
======
src/backend/replication/slot.c
InvalidatePossiblyObsoleteSlot:
1.
 /*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
  */
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s && active_pid == MyProcPid))
 {

As you previously explained [1] "This change applies to all types of
invalidation, not just inactive_timeout case [...] It's a general
optimization for the case when the current process is the active PID
for the slot."In that case, should this be in a separate patch that can be pushed to
master by itself, i.e. independent of anything else in this thread
that is being done for the purpose of implementing the timeout
feature?
Patch-0001 has additional general optimizations similar to the one
you mentioned, which are not strictly required for this feature.
Let’s wait for input from others on splitting the patches or
addressing it in a separate thread.
--
Thanks,
Nisha
On Thu, Jan 2, 2025 at 8:16 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha,
Here are some minor review comments for patch v58-0002.
Thank you for your feedback! Please find the v59 patch set addressing
all the comments.
Note: There are no new changes in patch-0001.
--
Thanks,
Nisha
Attachments:
v59-0001-Enhance-replication-slot-error-handling-slot-inv.patchapplication/x-patch; name=v59-0001-Enhance-replication-slot-error-handling-slot-inv.patchDownload
From 8154e2baee6fcf348524899c1f8b643a1e3564fc Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:13:26 +0530
Subject: [PATCH v59 1/2] Enhance replication slot error handling, slot
invalidation, and inactive_since setting logic
In ReplicationSlotAcquire(), raise an error for invalid slots if the
caller specifies error_if_invalid=true.
Add check if the slot is already acquired, then mark it invalidated directly.
Ensure same inactive_since time for all slots in update_synced_slots_inactive_since()
and RestoreSlotFromDisk().
---
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 13 ++--
src/backend/replication/slot.c | 69 ++++++++++++++++---
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
8 files changed, 75 insertions(+), 22 deletions(-)
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index 0148ec3678..ca53caac2f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f6945af1d4..692527b984 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,10 +1540,6 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
SpinLockAcquire(&s->mutex);
s->inactive_since = now;
SpinLockRelease(&s->mutex);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index b30e0473e1..47e474a548 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -163,6 +163,7 @@ static void ReplicationSlotDropPtr(ReplicationSlot *slot);
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
+static void RaiseSlotInvalidationError(ReplicationSlot *slot);
/*
* Report shared-memory space needed by ReplicationSlotsShmemInit.
@@ -535,9 +536,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +619,13 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
+ */
+ if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE)
+ RaiseSlotInvalidationError(s);
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +796,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +823,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1676,11 +1687,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s && active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -2208,6 +2220,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2381,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2416,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
@@ -2793,3 +2809,40 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * Raise an error based on the slot's invalidation cause.
+ */
+static void
+RaiseSlotInvalidationError(ReplicationSlot *slot)
+{
+ StringInfo err_detail = makeStringInfo();
+
+ Assert(slot->data.invalidated != RS_INVAL_NONE);
+
+ switch (slot->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfoString(err_detail, _("This slot has been invalidated because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfoString(err_detail, _("This slot has been invalidated because the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
+ "wal_level");
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(slot->data.name)),
+ errdetail_internal("%s", err_detail->data));
+}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 977146789f..8be4b8c65b 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a0782b1bbf..3df5bd7b2a 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 9a10907d05..d44f8c262b 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, true);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index bf62b36ad0..47ebdaecb6 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -253,7 +253,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index ae2ad5c933..66ac7c40f1 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This slot has been invalidated because the required WAL has been removed",
$logstart))
{
$failed = 1;
--
2.34.1
v59-0002-Introduce-inactive_timeout-based-replication-slo.patchapplication/x-patch; name=v59-0002-Introduce-inactive_timeout-based-replication-slo.patchDownload
From a027fa8ac26973cc3adecb91a662002e0fe12765 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 30 Dec 2024 14:57:24 +0530
Subject: [PATCH v59 2/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 39 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/regress.sgml | 10 +
doc/src/sgml/system-views.sgml | 10 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 158 +++++++++++++--
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 14 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 ++
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
src/test/recovery/README | 5 +
src/test/recovery/meson.build | 1 +
.../t/043_invalidate_inactive_slots.pl | 190 ++++++++++++++++++
17 files changed, 477 insertions(+), 16 deletions(-)
create mode 100644 src/test/recovery/t/043_invalidate_inactive_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index fbdd6ce574..8953c8fd4b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4593,6 +4593,45 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero disables the idle timeout invalidation mechanism.
+ The default is 24 hours. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 8290cd1a08..158ec18211 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2163,6 +2163,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index f4cef9e80f..f2387de6b5 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -336,6 +336,16 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>idle_replication_slot_timeout</literal></term>
+ <listitem>
+ <para>
+ Runs the test <filename>src/test/recovery/t/043_invalidate_inactive_slots.pl</filename>.
+ Not enabled by default because it is time consuming.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
Tests for features that are not supported by the current build
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index a586156614..199d7248ee 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 692527b984..7df6892824 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1540,9 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 47e474a548..2eec51478f 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = HOURS_PER_DAY * MINS_PER_HOUR;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -714,16 +721,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1519,7 +1522,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1549,6 +1553,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1565,6 +1579,30 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot is inactive
+ * 3. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1592,6 +1630,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1599,6 +1638,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1609,6 +1649,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1662,6 +1711,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1747,7 +1811,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1793,7 +1858,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1816,6 +1882,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_IDLE_TIMEOUT: idle slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1868,7 +1935,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1926,6 +1994,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2414,7 +2521,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = now;
@@ -2836,6 +2945,12 @@ RaiseSlotInvalidationError(ReplicationSlot *slot)
"wal_level");
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(err_detail, _("This slot has been invalidated because it has remained idle longer than the configured \"%s\" duration."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -2846,3 +2961,22 @@ RaiseSlotInvalidationError(ReplicationSlot *slot)
NameStr(slot->data.name)),
errdetail_internal("%s", err_detail->data));
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 22f16a3b46..ef8a0b5fce 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3047,6 +3047,20 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ HOURS_PER_DAY * MINS_PER_HOUR, /* 24 hours */
+ 0,
+ INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a2ac7575ca..7284edfbc1 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -337,6 +337,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 24h # in minutes; 0 disables
# - Primary Server -
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 47ebdaecb6..f3994ab000 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..e1d05d6779 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -143,5 +143,8 @@ extern int date2isoyear(int year, int mon, int mday);
extern int date2isoyearday(int year, int mon, int mday);
extern bool TimestampTimestampTzRequiresRewrite(void);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
#endif /* TIMESTAMP_H */
diff --git a/src/test/recovery/README b/src/test/recovery/README
index 896df0ad05..2941776780 100644
--- a/src/test/recovery/README
+++ b/src/test/recovery/README
@@ -30,4 +30,9 @@ PG_TEST_EXTRA=wal_consistency_checking
to the "make" command. This is resource-intensive, so it's not done
by default.
+If you want to test idle_replication_slot_timeout, add
+PG_TEST_EXTRA=idle_replication_slot_timeout
+to the "make" command. This test takes over 2 minutes, so it's not done
+by default.
+
See src/test/perl/README for more info about running these tests.
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 56c464abb7..415d45d58a 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/043_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/043_invalidate_inactive_slots.pl b/src/test/recovery/t/043_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..efd2b050f0
--- /dev/null
+++ b/src/test/recovery/t/043_invalidate_inactive_slots.pl
@@ -0,0 +1,190 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# The test takes over two minutes to complete. Run it only if
+# idle_replication_slot_timeout is specified in PG_TEST_EXTRA.
+if ( !$ENV{PG_TEST_EXTRA}
+ || $ENV{PG_TEST_EXTRA} !~ /\bidle_replication_slot_timeout\b/)
+{
+ plan skip_all =>
+ 'test idle_replication_slot_timeout not enabled in PG_TEST_EXTRA';
+}
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to idle timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $idle_timeout_1min = 1;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep($idle_timeout_1min * 60 + 10);
+
+# On standby, synced slots are not invalidated by the idle timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1min}min';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to idle timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'idle_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_mins) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $idle_timeout_mins);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_mins) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ sleep($idle_timeout_mins * 60 + 10);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
On Thu, 2 Jan 2025 at 15:57, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Thu, Jan 2, 2025 at 8:16 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha,
Here are some minor review comments for patch v58-0002.
Thank you for your feedback! Please find the v59 patch set addressing
all the comments.
Note: There are no new changes in patch-0001.
Few comments:
1) I felt we should not invalidate slots which have no effective xmin
set, as they are not holding back any WAL files from deletion. This can
happen when a user creates a slot with immediately_reserve set to false;
the LSN is actually reserved only after the first connection from a
streaming replication client (a sketch of such a slot follows the
quoted code below):
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins > 0 &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
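For illustration, a sketch of the kind of slot being described (the slot
name is hypothetical; nothing is reserved until a standby connects):

SELECT pg_create_physical_replication_slot('sb_slot_lazy', false);

SELECT slot_name, restart_lsn, xmin, catalog_xmin
FROM pg_replication_slots
WHERE slot_name = 'sb_slot_lazy';
-- restart_lsn, xmin and catalog_xmin are all NULL at this point, so the
-- slot retains neither WAL nor rows until a standby first connects.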
2) We can mention this as 1d instead of 24h as we want to represent 1
day similar to how we have mentioned for log_rotation_age:
index a2ac7575ca..7284edfbc1 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -337,6 +337,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 24h # in minutes; 0 disables
Regards,
Vignesh
On Mon, Jan 13, 2025 at 5:52 PM vignesh C <vignesh21@gmail.com> wrote:
On Thu, 2 Jan 2025 at 15:57, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Thu, Jan 2, 2025 at 8:16 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha,
Here are some minor review comments for patch v58-0002.
...
2) We can mention this as 1d instead of 24h as we want to represent 1
day similar to how we have mentioned for log_rotation_age:

index a2ac7575ca..7284edfbc1 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -337,6 +337,7 @@
 #wal_sender_timeout = 60s # in milliseconds; 0 disables
 #track_commit_timestamp = off # collect timestamp of transaction commit
 # (change requires restart)
+#idle_replication_slot_timeout = 24h # in minutes; 0 disables
Hi Vignesh. AFAIK that is due to a previous review comment of mine
where I suggested we should use 24h format here, because this GUC
default is described as "24 hours" in the config.sgml, and I felt the
sample should match its own documentation.
======
Kind Regards,
Peter Smith.
Fujitsu Australia.
On Mon, 13 Jan 2025 at 12:48, Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Jan 13, 2025 at 5:52 PM vignesh C <vignesh21@gmail.com> wrote:
On Thu, 2 Jan 2025 at 15:57, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Thu, Jan 2, 2025 at 8:16 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha,
Here are some minor review comments for patch v58-0002.
...
2) We can mention this as 1d instead of 24h as we want to represent 1
day similar to how we have mentioned for log_rotation_age:

index a2ac7575ca..7284edfbc1 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -337,6 +337,7 @@
 #wal_sender_timeout = 60s # in milliseconds; 0 disables
 #track_commit_timestamp = off # collect timestamp of transaction commit
 # (change requires restart)
+#idle_replication_slot_timeout = 24h # in minutes; 0 disables

Hi Vignesh. AFAIK that is due to a previous review comment of mine
where I suggested we should use 24h format here, because this GUC
default is described as "24 hours" in the config.sgml, and I felt the
sample should match its own documentation.
I suggest we reverse the current approach: change the default
configuration value to 1d and update the documentation accordingly. I
prefer expressing default values as 1h instead of 60min, 1d instead
of 24h, etc.
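Either spelling parses to the same stored value, since the GUC's base
unit is minutes in the patch; a quick sketch of what the server would
report back (superuser assumed):

ALTER SYSTEM SET idle_replication_slot_timeout = '24h';
SELECT pg_reload_conf();
SHOW idle_replication_slot_timeout;
-- With GUC_UNIT_MIN this should be displayed in the largest exact
-- unit, i.e. as 1d.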
Regards,
Vignesh
On Thu, 2 Jan 2025 at 15:57, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Thu, Jan 2, 2025 at 8:16 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha,
Here are some minor review comments for patch v58-0002.
Thank you for your feedback! Please find the v59 patch set addressing
all the comments.
Note: There are no new changes in patch-0001.
Hi Nisha,
I reviewed the v59-0001 patch. I have a few comments:

1. I think we should update the comment for the function
'InvalidatePossiblyObsoleteSlot'.
Currently the comment is like:
/*
* Helper for InvalidateObsoleteReplicationSlots
*
* Acquires the given slot and mark it invalid, if necessary and possible.
*
* Returns whether ReplicationSlotControlLock was released in the interim (and
* in that case we're not holding the lock at return, otherwise we are).
*
* Sets *invalidated true if the slot was invalidated. (Untouched otherwise.)
*
* This is inherently racy, because we release the LWLock
* for syscalls, so caller must restart if we return true.
*/
I think we should add a comment for the case 'when slot is already ours'.
2. Similarly, we should update the comment here:
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
* as having been invalidated. We do this with the spinlock held to
* avoid race conditions -- for example the restart_lsn could move
* forward, or the slot could be dropped.
*/
SpinLockAcquire(&s->mutex);
Before we release the lock, we are marking the slot as invalidated for
the case when the slot is already acquired by our process. So we
should update it in the comment.
3. I think we should also update the following 'if condition':
if (active_pid != 0)
{
/*
* Prepare the sleep on the slot's condition variable before
* releasing the lock, to close a possible race condition if the
* slot is released before the sleep below.
*/
We should not enter the if condition for the case when the slot was
already acquired by our process.
Thanks and Regards,
Shlok Kyal
On Wed, Jan 15, 2025 at 11:37 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Thu, 2 Jan 2025 at 15:57, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Thu, Jan 2, 2025 at 8:16 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha,
Here are some minor review comments for patch v58-0002.
Thank you for your feedback! Please find the v59 patch set addressing
all the comments.
Note: There are no new changes in patch-0001.

Hi Nisha,
I reviewed the v59-0001 patch. I have a few comments:

1. I think we should update the comment for the function
'InvalidatePossiblyObsoleteSlot'.
Currently the comment is like:

/*
* Helper for InvalidateObsoleteReplicationSlots
*
* Acquires the given slot and mark it invalid, if necessary and possible.
*
* Returns whether ReplicationSlotControlLock was released in the interim (and
* in that case we're not holding the lock at return, otherwise we are).
*
* Sets *invalidated true if the slot was invalidated. (Untouched otherwise.)
*
* This is inherently racy, because we release the LWLock
* for syscalls, so caller must restart if we return true.
 */

I think we should add a comment for the case 'when slot is already ours'.
Done.
2. Similarly, we should update the comment here:
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
* as having been invalidated. We do this with the spinlock held to
* avoid race conditions -- for example the restart_lsn could move
* forward, or the slot could be dropped.
*/
SpinLockAcquire(&s->mutex);
Before we release the spinlock, we now mark the slot as invalidated even in
the case where the slot is already acquired by our own process, so the
comment should be updated to reflect that.
Clarified the comments as per the mentioned case.
3. I think we should also update the following 'if condition':
if (active_pid != 0)
{
/*
* Prepare the sleep on the slot's condition variable before
* releasing the lock, to close a possible race condition if the
* slot is released before the sleep below.
*/
We should not enter this branch when the slot was already acquired by our
own process.
Thank you for pointing that out. I've included the fix and also
reorganized this section of the code in patch-0001 to improve the
readability of the logic.
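To make the intent concrete, here is a minimal, self-contained sketch of the
acquire-or-already-ours test that the attached 0001 now applies before marking
a slot invalidated. The stub struct and helper name are hypothetical and used
only for illustration; the real code operates on ReplicationSlot,
MyReplicationSlot, and MyProcPid directly:

#include <stdbool.h>
#include <sys/types.h>

/* illustrative stand-in for the only field the check looks at */
typedef struct SlotStub
{
	pid_t		active_pid;		/* 0 when no backend owns the slot */
} SlotStub;

/*
 * Return true when the invalidating backend may proceed directly: either the
 * slot is free, or this very backend has already acquired it.
 */
static bool
slot_free_or_already_ours(const SlotStub *s, const SlotStub *my_slot,
						  pid_t my_pid)
{
	return s->active_pid == 0 ||
		(s == my_slot && s->active_pid == my_pid);
}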
Attached the v60 patch set addressing the above comments and all other
comments at [1].
[1]: /messages/by-id/CALDaNm2r969ZZPDaAZQEtxcfL-sGUW8AGdbdwC8AcMn1V8w+hw@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v60-0001-Enhance-replication-slot-error-handling-slot-inv.patchapplication/octet-stream; name=v60-0001-Enhance-replication-slot-error-handling-slot-inv.patchDownload
From 7515615726b9ecb2f58ac75a4cf7e03d9ce43bab Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:13:26 +0530
Subject: [PATCH v60 1/2] Enhance replication slot error handling, slot
invalidation, and inactive_since setting logic
In ReplicationSlotAcquire(), raise an error for invalid slots if the
caller specifies error_if_invalid=true.
Add check if the slot is already acquired, then mark it invalidated directly.
Ensure same inactive_since time for all slots in update_synced_slots_inactive_since()
and RestoreSlotFromDisk().
---
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 13 +-
src/backend/replication/slot.c | 159 ++++++++++++------
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
8 files changed, 125 insertions(+), 62 deletions(-)
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index 0148ec3678..ca53caac2f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f6945af1d4..692527b984 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,10 +1540,6 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
SpinLockAcquire(&s->mutex);
s->inactive_since = now;
SpinLockRelease(&s->mutex);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index b30e0473e1..21eb9bdfd7 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -163,6 +163,7 @@ static void ReplicationSlotDropPtr(ReplicationSlot *slot);
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
+static void RaiseSlotInvalidationError(ReplicationSlot *slot);
/*
* Report shared-memory space needed by ReplicationSlotsShmemInit.
@@ -535,9 +536,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +619,13 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
+ */
+ if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE)
+ RaiseSlotInvalidationError(s);
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +796,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +823,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1557,7 +1568,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
/*
* Helper for InvalidateObsoleteReplicationSlots
*
- * Acquires the given slot and mark it invalid, if necessary and possible.
+ * Acquires the given slot unless already owned, and mark it invalid
+ * if necessary and possible.
*
* Returns whether ReplicationSlotControlLock was released in the interim (and
* in that case we're not holding the lock at return, otherwise we are).
@@ -1600,10 +1612,11 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Check if the slot needs to be invalidated. If it needs to be
- * invalidated, and is not currently acquired, acquire it and mark it
- * as having been invalidated. We do this with the spinlock held to
- * avoid race conditions -- for example the restart_lsn could move
- * forward, or the slot could be dropped.
+ * invalidated, and is already ours, mark it as having been
+ * invalidated; otherwise, acquire it first and then mark it as having
+ * been invalidated. We do this with the spinlock held to avoid race
+ * conditions -- for example the restart_lsn could move forward, or
+ * the slot could be dropped.
*/
SpinLockAcquire(&s->mutex);
@@ -1676,11 +1689,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s && active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1695,20 +1709,53 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/* Let caller know */
*invalidated = true;
- }
- SpinLockRelease(&s->mutex);
+ SpinLockRelease(&s->mutex);
- /*
- * The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
- */
- Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
+ /*
+ * The logical replication slots shouldn't be invalidated as GUC
+ * max_slot_wal_keep_size is set to -1 during the binary upgrade.
+ * See check_old_cluster_for_valid_slots() where we ensure that no
+ * invalidated before the upgrade.
+ */
+ Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
+
+ /*
+ * We hold the slot now and have already invalidated it; flush it
+ * to ensure that state persists.
+ *
+ * Don't want to hold ReplicationSlotControlLock across file
+ * system operations, so release it now but be sure to tell caller
+ * to restart from scratch.
+ */
+ LWLockRelease(ReplicationSlotControlLock);
+ released_lock = true;
- if (active_pid != 0)
+ /* Make sure the invalidated state persists across server restart */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ ReplicationSlotRelease();
+
+ ReportSlotInvalidation(invalidation_cause, false, active_pid,
+ slotname, restart_lsn,
+ oldestLSN, snapshotConflictHorizon);
+
+ /* done with this slot for now */
+ break;
+ }
+ else /* Some other process owns the slot */
{
+
+ SpinLockRelease(&s->mutex);
+
+ /*
+ * The logical replication slots shouldn't be invalidated as GUC
+ * max_slot_wal_keep_size is set to -1 during the binary upgrade.
+ * See check_old_cluster_for_valid_slots() where we ensure that no
+ * invalidated before the upgrade.
+ */
+ Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
+
/*
* Prepare the sleep on the slot's condition variable before
* releasing the lock, to close a possible race condition if the
@@ -1761,31 +1808,6 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
continue;
}
- else
- {
- /*
- * We hold the slot now and have already invalidated it; flush it
- * to ensure that state persists.
- *
- * Don't want to hold ReplicationSlotControlLock across file
- * system operations, so release it now but be sure to tell caller
- * to restart from scratch.
- */
- LWLockRelease(ReplicationSlotControlLock);
- released_lock = true;
-
- /* Make sure the invalidated state persists across server restart */
- ReplicationSlotMarkDirty();
- ReplicationSlotSave();
- ReplicationSlotRelease();
-
- ReportSlotInvalidation(invalidation_cause, false, active_pid,
- slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
-
- /* done with this slot for now */
- break;
- }
}
Assert(released_lock == !LWLockHeldByMe(ReplicationSlotControlLock));
@@ -2208,6 +2230,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2391,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2426,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
@@ -2793,3 +2819,40 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * Raise an error based on the slot's invalidation cause.
+ */
+static void
+RaiseSlotInvalidationError(ReplicationSlot *slot)
+{
+ StringInfo err_detail = makeStringInfo();
+
+ Assert(slot->data.invalidated != RS_INVAL_NONE);
+
+ switch (slot->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfoString(err_detail, _("This slot has been invalidated because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfoString(err_detail, _("This slot has been invalidated because the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
+ "wal_level");
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(slot->data.name)),
+ errdetail_internal("%s", err_detail->data));
+}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 977146789f..8be4b8c65b 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a0782b1bbf..3df5bd7b2a 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 9a10907d05..d44f8c262b 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, true);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index bf62b36ad0..47ebdaecb6 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -253,7 +253,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index ae2ad5c933..66ac7c40f1 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This slot has been invalidated because the required WAL has been removed",
$logstart))
{
$failed = 1;
--
2.34.1
v60-0002-Introduce-inactive_timeout-based-replication-slo.patchapplication/octet-stream; name=v60-0002-Introduce-inactive_timeout-based-replication-slo.patchDownload
From 1180546bf30d68bad2b83ddcab8db985ec118e35 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Wed, 15 Jan 2025 17:28:41 +0530
Subject: [PATCH v60 2/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 39 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/regress.sgml | 10 +
doc/src/sgml/system-views.sgml | 10 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 160 +++++++++++++--
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 14 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 ++
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
src/test/recovery/README | 5 +
src/test/recovery/meson.build | 1 +
.../t/043_invalidate_inactive_slots.pl | 190 ++++++++++++++++++
17 files changed, 479 insertions(+), 16 deletions(-)
create mode 100644 src/test/recovery/t/043_invalidate_inactive_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3f41a17b1f..5425050752 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4450,6 +4450,45 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero disables the idle timeout invalidation mechanism.
+ The default is one day. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 7cc5f4b18d..540dd27f91 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2187,6 +2187,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index f4cef9e80f..f2387de6b5 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -336,6 +336,16 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>idle_replication_slot_timeout</literal></term>
+ <listitem>
+ <para>
+ Runs the test <filename>src/test/recovery/t/043_invalidate_inactive_slots.pl</filename>.
+ Not enabled by default because it is time consuming.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
Tests for features that are not supported by the current build
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index a586156614..199d7248ee 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 692527b984..7df6892824 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1540,9 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 21eb9bdfd7..17dcda111c 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = HOURS_PER_DAY * MINS_PER_HOUR;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -714,16 +721,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1519,7 +1522,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1549,6 +1553,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1565,6 +1579,32 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has WAL reserved
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins > 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1593,6 +1633,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1600,6 +1641,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1610,6 +1652,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is already ours, mark it as having been
@@ -1664,6 +1715,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1738,7 +1804,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1782,7 +1849,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1826,6 +1894,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_IDLE_TIMEOUT: idle slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1878,7 +1947,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1936,6 +2006,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2424,7 +2533,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = now;
@@ -2846,6 +2957,12 @@ RaiseSlotInvalidationError(ReplicationSlot *slot)
"wal_level");
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(err_detail, _("This slot has been invalidated because it has remained idle longer than the configured \"%s\" duration."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -2856,3 +2973,22 @@ RaiseSlotInvalidationError(ReplicationSlot *slot)
NameStr(slot->data.name)),
errdetail_internal("%s", err_detail->data));
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 38cb9e970d..7cbba03bc1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,20 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ HOURS_PER_DAY * MINS_PER_HOUR, /* 1 day */
+ 0,
+ INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 079efa1baa..0ed9eb057e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -329,6 +329,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 1d # in minutes; 0 disables
# - Primary Server -
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 47ebdaecb6..f3994ab000 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..e1d05d6779 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -143,5 +143,8 @@ extern int date2isoyear(int year, int mon, int mday);
extern int date2isoyearday(int year, int mon, int mday);
extern bool TimestampTimestampTzRequiresRewrite(void);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
#endif /* TIMESTAMP_H */
diff --git a/src/test/recovery/README b/src/test/recovery/README
index 896df0ad05..2941776780 100644
--- a/src/test/recovery/README
+++ b/src/test/recovery/README
@@ -30,4 +30,9 @@ PG_TEST_EXTRA=wal_consistency_checking
to the "make" command. This is resource-intensive, so it's not done
by default.
+If you want to test idle_replication_slot_timeout, add
+PG_TEST_EXTRA=idle_replication_slot_timeout
+to the "make" command. This test takes over 2 minutes, so it's not done
+by default.
+
See src/test/perl/README for more info about running these tests.
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 56c464abb7..415d45d58a 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -51,6 +51,7 @@ tests += {
't/040_standby_failover_slots_sync.pl',
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
+ 't/043_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/043_invalidate_inactive_slots.pl b/src/test/recovery/t/043_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..efd2b050f0
--- /dev/null
+++ b/src/test/recovery/t/043_invalidate_inactive_slots.pl
@@ -0,0 +1,190 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# The test takes over two minutes to complete. Run it only if
+# idle_replication_slot_timeout is specified in PG_TEST_EXTRA.
+if ( !$ENV{PG_TEST_EXTRA}
+ || $ENV{PG_TEST_EXTRA} !~ /\bidle_replication_slot_timeout\b/)
+{
+ plan skip_all =>
+ 'test idle_replication_slot_timeout not enabled in PG_TEST_EXTRA';
+}
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to idle timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $idle_timeout_1min = 1;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep($idle_timeout_1min * 60 + 10);
+
+# On standby, synced slots are not invalidated by the idle timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1min}min';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to idle timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'idle_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_mins) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $idle_timeout_mins);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_mins) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ sleep($idle_timeout_mins * 60 + 10);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
On Mon, Jan 13, 2025 at 12:22 PM vignesh C <vignesh21@gmail.com> wrote:
On Thu, 2 Jan 2025 at 15:57, Nisha Moond <nisha.moond412@gmail.com> wrote:
Thank you for your feedback! Please find the v59 patch set addressing
all the comments.
Note: There are no new changes in patch-0001.
Few comments:
1) I felt we should not invalidate slots that have no effective xmin set, as
they will not be holding back WAL files from deletion. This can happen when a
user creates a slot with immediately_reserve as false; the LSN is actually
reserved only after the first connection from a streaming replication client:
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+	return (idle_replication_slot_timeout_mins > 0 &&
+			s->inactive_since > 0 &&
+			!(RecoveryInProgress() && s->data.synced));
+}
IIUC, for both logical and physical replication slots, the effective
xmin remains NULL until the respective nodes establish their first
connection. However, logical slots always reserve WAL immediately,
whereas physical slots do not unless immediately_reserve=true is set.
To avoid unnecessary slot invalidation for slots that are not
reserving WAL when created, it might be better to check the
restart_lsn instead of effective_xmin. If restart_lsn is invalid for a
slot, it indicates that WAL is not reserved for the slot, and we can
safely skip invalidation due to idle_timeout for such slots. This
logic has been implemented in v60.
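For reference, here is a compilable sketch of that gate, mirroring the
CanInvalidateIdleSlot() check in the attached v60 patch. The stub types and
the standalone parameters are illustrative only; the real function reads the
GUC and calls RecoveryInProgress() itself:

#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtrStub;	/* stand-in for XLogRecPtr */
typedef int64_t TimestampTzStub;	/* stand-in for TimestampTz */

#define InvalidXLogRecPtrStub ((XLogRecPtrStub) 0)

typedef struct IdleSlotStub
{
	XLogRecPtrStub	restart_lsn;	/* invalid => no WAL reserved yet */
	TimestampTzStub inactive_since; /* 0 => slot has never been released */
	bool		synced;				/* slot synced from the primary */
} IdleSlotStub;

/*
 * Skip idle-timeout invalidation when the timeout is disabled, when the slot
 * has no WAL reserved (invalid restart_lsn), when it was never marked
 * inactive, or when it is a synced slot while the server is in recovery.
 */
static bool
can_invalidate_idle_slot(const IdleSlotStub *s, int idle_timeout_mins,
						 bool in_recovery)
{
	return idle_timeout_mins > 0 &&
		s->restart_lsn != InvalidXLogRecPtrStub &&
		s->inactive_since > 0 &&
		!(in_recovery && s->synced);
}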
--
Thanks,
Nisha
On Thu, 16 Jan 2025 at 12:35, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Wed, Jan 15, 2025 at 11:37 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Thu, 2 Jan 2025 at 15:57, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Thu, Jan 2, 2025 at 8:16 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha,
Here are some minor review comments for patch v58-0002.
Thank you for your feedback! Please find the v59 patch set addressing
all the comments.
Note: There are no new changes in patch-0001.
Hi Nisha,
I reviewed the v59-0001 patch. I have a few comments:
1. I think we should update the comment for function
'InvalidatePossiblyObsoleteSlot'
Currently the comment is like:
/*
* Helper for InvalidateObsoleteReplicationSlots
*
* Acquires the given slot and mark it invalid, if necessary and possible.
*
* Returns whether ReplicationSlotControlLock was released in the interim (and
* in that case we're not holding the lock at return, otherwise we are).
*
* Sets *invalidated true if the slot was invalidated. (Untouched otherwise.)
*
* This is inherently racy, because we release the LWLock
* for syscalls, so caller must restart if we return true.
*/
I think we should add a comment for the case 'when slot is already ours'.
Done.
2. Similarly we should update comment here:
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
* as having been invalidated. We do this with the spinlock held to
* avoid race conditions -- for example the restart_lsn could move
* forward, or the slot could be dropped.
*/
SpinLockAcquire(&s->mutex);
Before we release the spinlock, we now mark the slot as invalidated even in
the case where the slot is already acquired by our own process, so the
comment should be updated to reflect that.
Clarified the comments as per the mentioned case.
3. I think we should also update the following 'if condition':
if (active_pid != 0)
{
/*
* Prepare the sleep on the slot's condition variable before
* releasing the lock, to close a possible race condition if the
* slot is released before the sleep below.
*/
We should not enter this branch when the slot was already acquired by our
own process.
Thank you for pointing that out. I've included the fix and also
reorganized this section of the code in patch-0001 to improve the
readability of the logic.
Attached the v60 patch set addressing the above comments and all other
comments at [1].
[1] /messages/by-id/CALDaNm2r969ZZPDaAZQEtxcfL-sGUW8AGdbdwC8AcMn1V8w+hw@mail.gmail.com
Hi Nisha,
Thanks for providing an updated patch. I have tested the patch and run
some tests; it works fine. I have a few comments:
v60-0001 patch:
1) There is an extra blank line before 'SpinLockRelease(&s->mutex)':
+ else /* Some other process owns the slot */
{
+
+ SpinLockRelease(&s->mutex);
v60-0002 patch:
1) In the comment:
/*
* Invalidate slots that require resources about to be removed.
*
* Returns true when any slot have got invalidated.
*
* Whether a slot needs to be invalidated depends on the cause. A slot is
* removed if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
* - RS_INVAL_IDLE_TIMEOUT: idle slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
It is mentioned that 'A slot is removed if it:'. I think instead of
'removed' it should be 'invalidated'.
Thanks and regards,
Shlok Kyal
On Fri, Jan 17, 2025 at 6:50 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
Hi Nisha,
Thanks for providing an updated patch. I have tested the patch and run
some tests; it works fine. I have a few comments:
Thanks for your review. Attached are the v61 patches.
I've addressed the comments and rebased the patches as needed due to the
latest changes on pgHead. The patch-0002 test file name has been
updated from "043_invalidate_inactive_slots.pl" to
"044_invalidate_inactive_slots.pl".
--
Thanks,
Nisha
Attachments:
v61-0001-Enhance-replication-slot-error-handling-slot-inv.patchapplication/octet-stream; name=v61-0001-Enhance-replication-slot-error-handling-slot-inv.patchDownload
From e3652b3ed884679aa2514b0f1e47fe9d91b214aa Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:13:26 +0530
Subject: [PATCH v61 1/2] Enhance replication slot error handling, slot
invalidation, and inactive_since setting logic
In ReplicationSlotAcquire(), raise an error for invalid slots if the
caller specifies error_if_invalid=true.
Add check if the slot is already acquired, then mark it invalidated directly.
Ensure same inactive_since time for all slots in update_synced_slots_inactive_since()
and RestoreSlotFromDisk().
---
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 13 +-
src/backend/replication/slot.c | 158 ++++++++++++------
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
8 files changed, 124 insertions(+), 62 deletions(-)
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index 0148ec3678..ca53caac2f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f6945af1d4..692527b984 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,10 +1540,6 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
SpinLockAcquire(&s->mutex);
s->inactive_since = now;
SpinLockRelease(&s->mutex);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index b30e0473e1..4a78160c6c 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -163,6 +163,7 @@ static void ReplicationSlotDropPtr(ReplicationSlot *slot);
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
+static void RaiseSlotInvalidationError(ReplicationSlot *slot);
/*
* Report shared-memory space needed by ReplicationSlotsShmemInit.
@@ -535,9 +536,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +619,13 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
+ */
+ if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE)
+ RaiseSlotInvalidationError(s);
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +796,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +823,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1557,7 +1568,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
/*
* Helper for InvalidateObsoleteReplicationSlots
*
- * Acquires the given slot and mark it invalid, if necessary and possible.
+ * Acquires the given slot unless already owned, and mark it invalid
+ * if necessary and possible.
*
* Returns whether ReplicationSlotControlLock was released in the interim (and
* in that case we're not holding the lock at return, otherwise we are).
@@ -1600,10 +1612,11 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Check if the slot needs to be invalidated. If it needs to be
- * invalidated, and is not currently acquired, acquire it and mark it
- * as having been invalidated. We do this with the spinlock held to
- * avoid race conditions -- for example the restart_lsn could move
- * forward, or the slot could be dropped.
+ * invalidated, and is already ours, mark it as having been
+ * invalidated; otherwise, acquire it first and then mark it as having
+ * been invalidated. We do this with the spinlock held to avoid race
+ * conditions -- for example the restart_lsn could move forward, or
+ * the slot could be dropped.
*/
SpinLockAcquire(&s->mutex);
@@ -1676,11 +1689,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s && active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1695,20 +1709,52 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/* Let caller know */
*invalidated = true;
- }
- SpinLockRelease(&s->mutex);
+ SpinLockRelease(&s->mutex);
- /*
- * The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
- */
- Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
+ /*
+ * The logical replication slots shouldn't be invalidated as GUC
+ * max_slot_wal_keep_size is set to -1 during the binary upgrade.
+ * See check_old_cluster_for_valid_slots() where we ensure that no
+ * invalidated before the upgrade.
+ */
+ Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
+
+ /*
+ * We hold the slot now and have already invalidated it; flush it
+ * to ensure that state persists.
+ *
+ * Don't want to hold ReplicationSlotControlLock across file
+ * system operations, so release it now but be sure to tell caller
+ * to restart from scratch.
+ */
+ LWLockRelease(ReplicationSlotControlLock);
+ released_lock = true;
- if (active_pid != 0)
+ /* Make sure the invalidated state persists across server restart */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ ReplicationSlotRelease();
+
+ ReportSlotInvalidation(invalidation_cause, false, active_pid,
+ slotname, restart_lsn,
+ oldestLSN, snapshotConflictHorizon);
+
+ /* done with this slot for now */
+ break;
+ }
+ else /* Some other process owns the slot */
{
+ SpinLockRelease(&s->mutex);
+
+ /*
+ * The logical replication slots shouldn't be invalidated as GUC
+ * max_slot_wal_keep_size is set to -1 during the binary upgrade.
+ * See check_old_cluster_for_valid_slots() where we ensure that no
+ * invalidated before the upgrade.
+ */
+ Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
+
/*
* Prepare the sleep on the slot's condition variable before
* releasing the lock, to close a possible race condition if the
@@ -1761,31 +1807,6 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
continue;
}
- else
- {
- /*
- * We hold the slot now and have already invalidated it; flush it
- * to ensure that state persists.
- *
- * Don't want to hold ReplicationSlotControlLock across file
- * system operations, so release it now but be sure to tell caller
- * to restart from scratch.
- */
- LWLockRelease(ReplicationSlotControlLock);
- released_lock = true;
-
- /* Make sure the invalidated state persists across server restart */
- ReplicationSlotMarkDirty();
- ReplicationSlotSave();
- ReplicationSlotRelease();
-
- ReportSlotInvalidation(invalidation_cause, false, active_pid,
- slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
-
- /* done with this slot for now */
- break;
- }
}
Assert(released_lock == !LWLockHeldByMe(ReplicationSlotControlLock));
@@ -2208,6 +2229,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2390,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2425,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
@@ -2793,3 +2818,40 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * Raise an error based on the slot's invalidation cause.
+ */
+static void
+RaiseSlotInvalidationError(ReplicationSlot *slot)
+{
+ StringInfo err_detail = makeStringInfo();
+
+ Assert(slot->data.invalidated != RS_INVAL_NONE);
+
+ switch (slot->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfoString(err_detail, _("This slot has been invalidated because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfoString(err_detail, _("This slot has been invalidated because the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
+ "wal_level");
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(slot->data.name)),
+ errdetail_internal("%s", err_detail->data));
+}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 977146789f..8be4b8c65b 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a0782b1bbf..3df5bd7b2a 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 9a10907d05..d44f8c262b 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, true);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index bf62b36ad0..47ebdaecb6 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -253,7 +253,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index ae2ad5c933..66ac7c40f1 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This slot has been invalidated because the required WAL has been removed",
$logstart))
{
$failed = 1;
--
2.34.1
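For a concrete sense of what 0001 changes for SQL-level callers, here is a
minimal psql sketch (the slot name is hypothetical, and the DETAIL line
depends on which invalidation cause is recorded for the slot):

-- Acquiring an invalidated slot via a SQL function that passes
-- error_if_invalid=true, such as pg_replication_slot_advance(), now fails
-- up front with a clear error.
SELECT pg_replication_slot_advance('my_slot', pg_current_wal_lsn());
-- ERROR:  can no longer get changes from replication slot "my_slot"
-- DETAIL:  This slot has been invalidated because the required WAL has been removed.

Callers that pass error_if_invalid=false, such as slot drop, are unaffected.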
v61-0002-Introduce-inactive_timeout-based-replication-slo.patch
From d374a1a486feeaa87fbf2788fa4de7249ae7cc99 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 20 Jan 2025 10:50:09 +0530
Subject: [PATCH v61 2/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky, because the amount of WAL a database
generates and the storage allocated per instance vary greatly
in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 39 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/regress.sgml | 10 +
doc/src/sgml/system-views.sgml | 10 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 162 +++++++++++++--
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 14 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 ++
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
src/test/recovery/README | 5 +
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 190 ++++++++++++++++++
17 files changed, 480 insertions(+), 17 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a8866292d4..2e2935b24f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4450,6 +4450,45 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero disables the idle timeout invalidation mechanism.
+ The default is one day. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 7cc5f4b18d..540dd27f91 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2187,6 +2187,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index f4cef9e80f..f2387de6b5 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -336,6 +336,16 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>idle_replication_slot_timeout</literal></term>
+ <listitem>
+ <para>
+ Runs the test <filename>src/test/recovery/t/044_invalidate_inactive_slots.pl</filename>.
+ Not enabled by default because it is time consuming.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
Tests for features that are not supported by the current build
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 8e2b0a7927..7d3a0aa709 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 692527b984..7df6892824 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1540,9 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 4a78160c6c..2f9f2aa5c3 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = HOURS_PER_DAY * MINS_PER_HOUR;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -714,16 +721,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1519,7 +1522,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1549,6 +1553,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1565,6 +1579,32 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has WAL reserved
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins > 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1593,6 +1633,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1600,6 +1641,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1610,6 +1652,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is already ours, mark it as having been
@@ -1664,6 +1715,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1738,7 +1804,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1781,7 +1848,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1820,11 +1888,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* Returns true when any slot have got invalidated.
*
* Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_IDLE_TIMEOUT: idle slot timeout has occurred
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1877,7 +1946,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1935,6 +2005,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2423,7 +2532,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = now;
@@ -2845,6 +2956,12 @@ RaiseSlotInvalidationError(ReplicationSlot *slot)
"wal_level");
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(err_detail, _("This slot has been invalidated because it has remained idle longer than the configured \"%s\" duration."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -2855,3 +2972,22 @@ RaiseSlotInvalidationError(ReplicationSlot *slot)
NameStr(slot->data.name)),
errdetail_internal("%s", err_detail->data));
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 38cb9e970d..7cbba03bc1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,20 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ HOURS_PER_DAY * MINS_PER_HOUR, /* 1 day */
+ 0,
+ INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 079efa1baa..0ed9eb057e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -329,6 +329,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 1d # in minutes; 0 disables
# - Primary Server -
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 47ebdaecb6..f3994ab000 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..e1d05d6779 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -143,5 +143,8 @@ extern int date2isoyear(int year, int mon, int mday);
extern int date2isoyearday(int year, int mon, int mday);
extern bool TimestampTimestampTzRequiresRewrite(void);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
#endif /* TIMESTAMP_H */
diff --git a/src/test/recovery/README b/src/test/recovery/README
index 896df0ad05..2941776780 100644
--- a/src/test/recovery/README
+++ b/src/test/recovery/README
@@ -30,4 +30,9 @@ PG_TEST_EXTRA=wal_consistency_checking
to the "make" command. This is resource-intensive, so it's not done
by default.
+If you want to test idle_replication_slot_timeout, add
+PG_TEST_EXTRA=idle_replication_slot_timeout
+to the "make" command. This test takes over 2 minutes, so it's not done
+by default.
+
See src/test/perl/README for more info about running these tests.
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..efd2b050f0
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,190 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# The test takes over two minutes to complete. Run it only if
+# idle_replication_slot_timeout is specified in PG_TEST_EXTRA.
+if ( !$ENV{PG_TEST_EXTRA}
+ || $ENV{PG_TEST_EXTRA} !~ /\bidle_replication_slot_timeout\b/)
+{
+ plan skip_all =>
+ 'test idle_replication_slot_timeout not enabled in PG_TEST_EXTRA';
+}
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to idle timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $idle_timeout_1min = 1;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep($idle_timeout_1min * 60 + 10);
+
+# On standby, synced slots are not invalidated by the idle timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1min}min';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to idle timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'idle_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_mins) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $idle_timeout_mins);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_mins) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ sleep($idle_timeout_mins * 60 + 10);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
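For a concrete sense of how 0002 is meant to be used, here is a minimal
sketch (the two-hour value is only illustrative; the GUC name, its
minute-based units, and the 'idle_timeout' invalidation_reason value are as
defined in this patch and may change in later versions):

-- Invalidate slots that stay idle for longer than two hours.
ALTER SYSTEM SET idle_replication_slot_timeout = '2h';
SELECT pg_reload_conf();

-- Invalidation happens during a non-shutdown checkpoint; forcing one avoids
-- waiting for checkpoint_timeout.
CHECKPOINT;

-- Inspect the outcome; idle-timeout-invalidated slots show
-- invalidation_reason = 'idle_timeout'.
SELECT slot_name, inactive_since, invalidation_reason
FROM pg_replication_slots;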
Some review comments for patch v61-0001.
======
src/backend/replication/slot.c
InvalidatePossiblyObsoleteSlot:
1.
/*
* Check if the slot needs to be invalidated. If it needs to be
- * invalidated, and is not currently acquired, acquire it and mark it
- * as having been invalidated. We do this with the spinlock held to
- * avoid race conditions -- for example the restart_lsn could move
- * forward, or the slot could be dropped.
+ * invalidated, and is already ours, mark it as having been
+ * invalidated; otherwise, acquire it first and then mark it as having
+ * been invalidated. We do this with the spinlock held to avoid race
+ * conditions -- for example the restart_lsn could move forward, or
+ * the slot could be dropped.
*/
Can't you just word this as "mark it as invalidated" (which you do
later anyway) instead of "mark it as having been invalidated"? (this
is in two places).
~~~
2.
+ /*
+ * The logical replication slots shouldn't be invalidated as GUC
+ * max_slot_wal_keep_size is set to -1 during the binary upgrade.
+ * See check_old_cluster_for_valid_slots() where we ensure that no
+ * invalidated before the upgrade.
+ */
+ Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
2a.
I know this sentence was the same before you moved it, but "ensure
that no invalidated" seems like there are some words missing.
2b.
TBH, this part confused me (the code is repeated in a couple of
places). AFAICT the code and the comment do not quite match. The
comment refers to a setting that applies "during binary upgrade", yet the
Assert can only pass if IsBinaryUpgrade is false. (??)
I'm unsure of the intent; perhaps it should be like:
if (IsBinaryUpgrade)
Assert(!(*invalidated && SlotIsLogical(s)));
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Some review comments for patch v61-0002
======
src/backend/replication/slot.c
1.
* Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_IDLE_TIMEOUT: idle slot timeout has occurred
1a.
Firstly the wording seems odd. "Is invalidated if it:" (missing words?)
~
1b.
Secondly, is this comment strictly correct? IIUC it's not *always*
going to be invalidated just because the cause is one of those listed.
e.g. the code calls InvalidatePossiblyObsoleteSlot but it might not
end up invalidating the slot having a cause RS_INVAL_IDLE_TIMEOUT.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Jan 21, 2025 at 8:26 AM Peter Smith <smithpb2250@gmail.com> wrote:
Some review comments for patch v61-0002
======
src/backend/replication/slot.c
1.
* Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
* - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_IDLE_TIMEOUT: idle slot timeout has occurred
1a.
Firstly the wording seems odd. "Is invalidated if it:" (missing words?)
~
1b.
Secondly, is this comment strictly correct? IIUC it's not *always*
going to be invalidated just because the cause is one of those listed.
e.g. the code calls InvalidatePossiblyObsoleteSlot but it might not
end up invalidating the slot having a cause RS_INVAL_IDLE_TIMEOUT.
I feel the phrase "A slot is invalidated if it:" is supposed to be
read alongside the respective cause description, such as:
"A slot is invalidated if it requires an LSN older than…"
"A slot is invalidated if it requires a snapshot <= the…"
"A slot is invalidated if it is logical"
IIUC, each listed cause specifies a clear condition under which the
slot should *always* be invalidated for that cause. To maintain
consistency with the header line "A slot is invalidated if it:", I’ve
modified the description/condition for RS_INVAL_IDLE_TIMEOUT
accordingly. Also, corrected the RS_INVAL_WAL_LEVEL description.
Attached is the v62 patch set with the above-mentioned changes; it also
addresses the comments at [1].
[1]: /messages/by-id/CAHut+PvC4uPabeGMvDuTQ4S+5eX66Y6+tU5QMRmB2jDw-Cj2Cw@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v62-0001-Enhance-replication-slot-error-handling-slot-inv.patch
From f33aadb5ea689d6ac4d2c6b1698d7eaea11604cb Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:13:26 +0530
Subject: [PATCH v62 1/2] Enhance replication slot error handling, slot
invalidation, and inactive_since setting logic
In ReplicationSlotAcquire(), raise an error for invalid slots if the
caller specifies error_if_invalid=true.
Add a check for whether the slot is already acquired by the current process;
if so, mark it invalidated directly.
Ensure the same inactive_since time is used for all slots in
update_synced_slots_inactive_since() and RestoreSlotFromDisk().
---
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 13 +-
src/backend/replication/slot.c | 157 ++++++++++++------
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
8 files changed, 123 insertions(+), 62 deletions(-)
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index 0148ec3678..ca53caac2f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f6945af1d4..692527b984 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,10 +1540,6 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
SpinLockAcquire(&s->mutex);
s->inactive_since = now;
SpinLockRelease(&s->mutex);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index b30e0473e1..cdd50046b8 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -163,6 +163,7 @@ static void ReplicationSlotDropPtr(ReplicationSlot *slot);
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
+static void RaiseSlotInvalidationError(ReplicationSlot *slot);
/*
* Report shared-memory space needed by ReplicationSlotsShmemInit.
@@ -535,9 +536,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +619,13 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
+ */
+ if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE)
+ RaiseSlotInvalidationError(s);
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +796,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +823,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1557,7 +1568,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
/*
* Helper for InvalidateObsoleteReplicationSlots
*
- * Acquires the given slot and mark it invalid, if necessary and possible.
+ * Acquires the given slot unless already owned, and mark it invalid
+ * if necessary and possible.
*
* Returns whether ReplicationSlotControlLock was released in the interim (and
* in that case we're not holding the lock at return, otherwise we are).
@@ -1600,10 +1612,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Check if the slot needs to be invalidated. If it needs to be
- * invalidated, and is not currently acquired, acquire it and mark it
- * as having been invalidated. We do this with the spinlock held to
- * avoid race conditions -- for example the restart_lsn could move
- * forward, or the slot could be dropped.
+ * invalidated, and is already ours, mark it as invalidated;
+ * otherwise, acquire it first and then mark it as invalidated. We do
+ * this with the spinlock held to avoid race conditions -- for example
+ * the restart_lsn could move forward, or the slot could be dropped.
*/
SpinLockAcquire(&s->mutex);
@@ -1676,11 +1688,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s && active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1695,20 +1708,52 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/* Let caller know */
*invalidated = true;
- }
- SpinLockRelease(&s->mutex);
+ SpinLockRelease(&s->mutex);
- /*
- * The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
- */
- Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
+ /*
+ * The logical replication slots shouldn't be invalidated as GUC
+ * max_slot_wal_keep_size is set to -1 during the binary upgrade.
+ * See check_old_cluster_for_valid_slots() where we ensure that
+ * there are no invalidated slots before the upgrade.
+ */
+ Assert(!(IsBinaryUpgrade && *invalidated && SlotIsLogical(s)));
+
+ /*
+ * We hold the slot now and have already invalidated it; flush it
+ * to ensure that state persists.
+ *
+ * Don't want to hold ReplicationSlotControlLock across file
+ * system operations, so release it now but be sure to tell caller
+ * to restart from scratch.
+ */
+ LWLockRelease(ReplicationSlotControlLock);
+ released_lock = true;
- if (active_pid != 0)
+ /* Make sure the invalidated state persists across server restart */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ ReplicationSlotRelease();
+
+ ReportSlotInvalidation(invalidation_cause, false, active_pid,
+ slotname, restart_lsn,
+ oldestLSN, snapshotConflictHorizon);
+
+ /* done with this slot for now */
+ break;
+ }
+ else /* Some other process owns the slot */
{
+ SpinLockRelease(&s->mutex);
+
+ /*
+ * The logical replication slots shouldn't be invalidated as GUC
+ * max_slot_wal_keep_size is set to -1 during the binary upgrade.
+ * See check_old_cluster_for_valid_slots() where we ensure that
+ * there are no invalidated slots before the upgrade.
+ */
+ Assert(!(IsBinaryUpgrade && *invalidated && SlotIsLogical(s)));
+
/*
* Prepare the sleep on the slot's condition variable before
* releasing the lock, to close a possible race condition if the
@@ -1761,31 +1806,6 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
continue;
}
- else
- {
- /*
- * We hold the slot now and have already invalidated it; flush it
- * to ensure that state persists.
- *
- * Don't want to hold ReplicationSlotControlLock across file
- * system operations, so release it now but be sure to tell caller
- * to restart from scratch.
- */
- LWLockRelease(ReplicationSlotControlLock);
- released_lock = true;
-
- /* Make sure the invalidated state persists across server restart */
- ReplicationSlotMarkDirty();
- ReplicationSlotSave();
- ReplicationSlotRelease();
-
- ReportSlotInvalidation(invalidation_cause, false, active_pid,
- slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
-
- /* done with this slot for now */
- break;
- }
}
Assert(released_lock == !LWLockHeldByMe(ReplicationSlotControlLock));
@@ -2208,6 +2228,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2389,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2424,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
@@ -2793,3 +2817,40 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * Raise an error based on the slot's invalidation cause.
+ */
+static void
+RaiseSlotInvalidationError(ReplicationSlot *slot)
+{
+ StringInfo err_detail = makeStringInfo();
+
+ Assert(slot->data.invalidated != RS_INVAL_NONE);
+
+ switch (slot->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfoString(err_detail, _("This slot has been invalidated because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfoString(err_detail, _("This slot has been invalidated because the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
+ "wal_level");
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(slot->data.name)),
+ errdetail_internal("%s", err_detail->data));
+}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 977146789f..8be4b8c65b 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a0782b1bbf..3df5bd7b2a 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 9a10907d05..d44f8c262b 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, true);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index bf62b36ad0..47ebdaecb6 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -253,7 +253,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index ae2ad5c933..66ac7c40f1 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This slot has been invalidated because the required WAL has been removed",
$logstart))
{
$failed = 1;
--
2.34.1
v62-0002-Introduce-inactive_timeout-based-replication-slo.patch (application/octet-stream)
From 827ba3daa1003bcc850f75e609fc994143ea90dd Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Tue, 21 Jan 2025 13:27:51 +0530
Subject: [PATCH v62 2/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky, because the amount of WAL a database
generates and the allocated storage per instance vary greatly in
production, making it difficult to pin down a one-size-fits-all
value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 39 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/regress.sgml | 10 +
doc/src/sgml/system-views.sgml | 10 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 167 +++++++++++++--
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 14 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 ++
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
src/test/recovery/README | 5 +
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 190 ++++++++++++++++++
17 files changed, 483 insertions(+), 19 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a8866292d4..2e2935b24f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4450,6 +4450,45 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero disables the idle timeout invalidation mechanism.
+ The default is one day. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 7cc5f4b18d..540dd27f91 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2187,6 +2187,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index f4cef9e80f..f2387de6b5 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -336,6 +336,16 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>idle_replication_slot_timeout</literal></term>
+ <listitem>
+ <para>
+ Runs the test <filename>src/test/recovery/t/044_invalidate_inactive_slots.pl</filename>.
+ Not enabled by default because it is time consuming.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
Tests for features that are not supported by the current build
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 8e2b0a7927..7d3a0aa709 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 692527b984..7df6892824 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1540,9 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index cdd50046b8..ebde7590ba 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = HOURS_PER_DAY * MINS_PER_HOUR;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -714,16 +721,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1519,7 +1522,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1549,6 +1553,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1565,6 +1579,32 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has WAL reserved
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins > 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1593,6 +1633,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1600,6 +1641,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1610,6 +1652,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is already ours, mark it as invalidated;
@@ -1663,6 +1714,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1737,7 +1803,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1780,7 +1847,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1818,12 +1886,14 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
*
* Returns true when any slot have got invalidated.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1876,7 +1946,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1934,6 +2005,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2422,7 +2532,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = now;
@@ -2844,6 +2956,12 @@ RaiseSlotInvalidationError(ReplicationSlot *slot)
"wal_level");
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(err_detail, _("This slot has been invalidated because it has remained idle longer than the configured \"%s\" duration."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -2854,3 +2972,22 @@ RaiseSlotInvalidationError(ReplicationSlot *slot)
NameStr(slot->data.name)),
errdetail_internal("%s", err_detail->data));
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 38cb9e970d..7cbba03bc1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,20 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ HOURS_PER_DAY * MINS_PER_HOUR, /* 1 day */
+ 0,
+ INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 079efa1baa..0ed9eb057e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -329,6 +329,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 1d # in minutes; 0 disables
# - Primary Server -
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 47ebdaecb6..f3994ab000 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..e1d05d6779 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -143,5 +143,8 @@ extern int date2isoyear(int year, int mon, int mday);
extern int date2isoyearday(int year, int mon, int mday);
extern bool TimestampTimestampTzRequiresRewrite(void);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
#endif /* TIMESTAMP_H */
diff --git a/src/test/recovery/README b/src/test/recovery/README
index 896df0ad05..2941776780 100644
--- a/src/test/recovery/README
+++ b/src/test/recovery/README
@@ -30,4 +30,9 @@ PG_TEST_EXTRA=wal_consistency_checking
to the "make" command. This is resource-intensive, so it's not done
by default.
+If you want to test idle_replication_slot_timeout, add
+PG_TEST_EXTRA=idle_replication_slot_timeout
+to the "make" command. This test takes over 2 minutes, so it's not done
+by default.
+
See src/test/perl/README for more info about running these tests.
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..efd2b050f0
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,190 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# The test takes over two minutes to complete. Run it only if
+# idle_replication_slot_timeout is specified in PG_TEST_EXTRA.
+if ( !$ENV{PG_TEST_EXTRA}
+ || $ENV{PG_TEST_EXTRA} !~ /\bidle_replication_slot_timeout\b/)
+{
+ plan skip_all =>
+ 'test idle_replication_slot_timeout not enabled in PG_TEST_EXTRA';
+}
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to idle timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $idle_timeout_1min = 1;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep($idle_timeout_1min * 60 + 10);
+
+# On standby, synced slots are not invalidated by the idle timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1min}min';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to idle timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'idle_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_mins) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $idle_timeout_mins);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_mins) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ sleep($idle_timeout_mins * 60 + 10);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
On Tue, Jan 21, 2025 at 8:22 AM Peter Smith <smithpb2250@gmail.com> wrote:
Some review comments for patch v61-0001.
======
src/backend/replication/slot.c

InvalidatePossiblyObsoleteSlot:
1.
/*
* Check if the slot needs to be invalidated. If it needs to be
- * invalidated, and is not currently acquired, acquire it and mark it
- * as having been invalidated. We do this with the spinlock held to
- * avoid race conditions -- for example the restart_lsn could move
- * forward, or the slot could be dropped.
+ * invalidated, and is already ours, mark it as having been
+ * invalidated; otherwise, acquire it first and then mark it as having
+ * been invalidated. We do this with the spinlock held to avoid race
+ * conditions -- for example the restart_lsn could move forward, or
+ * the slot could be dropped.
*/

Can't you just word this as "mark it as invalidated" (which you do
later anyway) instead of "mark it as having been invalidated"? (this
is in two places).
Thanks for pointing it out. I had considered it, but I was trying to
keep the language consistent with the previous style. I've now made
the change in v62.
~~~
2.
+ /*
+ * The logical replication slots shouldn't be invalidated as GUC
+ * max_slot_wal_keep_size is set to -1 during the binary upgrade.
+ * See check_old_cluster_for_valid_slots() where we ensure that no
+ * invalidated before the upgrade.
+ */
+ Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));

2a.
I know this sentence was the same before you moved it, but "ensure
that no invalidated" seems like there are some words missing.
Corrected.
2b.
TBH, this part confused me (the code is repeated in a couple of
places). AFAICT the code/comment does not match quite right. The
comment refers to a setting that is "during binary upgrade", yet the
Assert can only pass if IsBinaryUpgrade is false. (??)

I'm unsure of the intent; perhaps it should be like:

if (IsBinaryUpgrade)
    Assert(!(*invalidated && SlotIsLogical(s)));
This Assert condition is correct: we don't want to invalidate slots
during a binary upgrade, and the assertion only triggers (raises an
error) when all three conditions are true, i.e., when
'IsBinaryUpgrade' is also true.
That said, I agree with your point that we are unnecessarily checking
"(*invalidated && SlotIsLogical(s))" even when not in binary upgrade
mode.
To optimize this, we can first check 'IsBinaryUpgrade' before
evaluating the other conditions:
Assert(!(IsBinaryUpgrade && *invalidated && SlotIsLogical(s)));
- Since 'IsBinaryUpgrade' is 'false' most of the time, this approach
short-circuits evaluation, making it more efficient.
However, if you feel that this condition isn't as clear to read or
understand, we can move the 'IsBinaryUpgrade' check outside the
'Assert', as you suggested. Let me know what you think!
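
For reference, a side-by-side sketch of the two options being discussed
(variable names as in InvalidatePossiblyObsoleteSlot(); this is only an
illustration, not a final choice):

/* Option (a): a single assertion; IsBinaryUpgrade is evaluated first,
 * so the remaining checks are short-circuited outside upgrade mode. */
Assert(!(IsBinaryUpgrade && *invalidated && SlotIsLogical(s)));

/* Option (b): scope the assertion to binary upgrade mode explicitly. */
if (IsBinaryUpgrade)
    Assert(!(*invalidated && SlotIsLogical(s)));

Both check exactly the same condition; the difference is purely one of
readability.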
--
Thanks,
Nisha
On Wed, Jan 22, 2025 at 10:49 AM Nisha Moond <nisha.moond412@gmail.com> wrote:
I discussed the above comments further with Peter off-list, and here
are the v63 patches with the following changes:
patch-001: The Assert and related comments have been updated for clarity.
patch-002: Comments have been updated at the top of
InvalidateObsoleteReplicationSlots().
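
For anyone trying these out, here is a minimal usage sketch (it assumes
the v63 patches are applied; the two-day value is just an example):

-- Invalidate slots that stay idle for more than two days
ALTER SYSTEM SET idle_replication_slot_timeout = '2d';
SELECT pg_reload_conf();

-- Invalidation runs during non-shutdown checkpoints; force one instead
-- of waiting for checkpoint_timeout
CHECKPOINT;

-- Inspect which slots were invalidated and why
SELECT slot_name, inactive_since, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason = 'idle_timeout';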
--
Thanks,
Nisha
Attachments:
v63-0001-Enhance-replication-slot-error-handling-slot-inv.patch (application/octet-stream)
From 1e379feaa715626c7e2744bdb764bedb0fc3cab7 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 18 Nov 2024 16:13:26 +0530
Subject: [PATCH v63 1/2] Enhance replication slot error handling, slot
invalidation, and inactive_since setting logic
In ReplicationSlotAcquire(), raise an error for invalid slots if the
caller specifies error_if_invalid=true.
Add a check for whether the slot is already acquired by the current
process; if so, mark it invalidated directly.
Ensure the same inactive_since time is used for all slots in
update_synced_slots_inactive_since() and RestoreSlotFromDisk().
---
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 13 +-
src/backend/replication/slot.c | 161 ++++++++++++------
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
8 files changed, 127 insertions(+), 62 deletions(-)
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index 0148ec3678..ca53caac2f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f6945af1d4..692527b984 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1508,7 +1508,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
static void
update_synced_slots_inactive_since(void)
{
- TimestampTz now = 0;
+ TimestampTz now;
/*
* We need to update inactive_since only when we are promoting standby to
@@ -1523,6 +1523,9 @@ update_synced_slots_inactive_since(void)
/* The slot sync worker or SQL function mustn't be running by now */
Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
for (int i = 0; i < max_replication_slots; i++)
@@ -1537,10 +1540,6 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- /* Use the same inactive_since time for all the slots. */
- if (now == 0)
- now = GetCurrentTimestamp();
-
SpinLockAcquire(&s->mutex);
s->inactive_since = now;
SpinLockRelease(&s->mutex);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index b30e0473e1..0330dcf57d 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -163,6 +163,7 @@ static void ReplicationSlotDropPtr(ReplicationSlot *slot);
static void RestoreSlotFromDisk(const char *name);
static void CreateSlotOnDisk(ReplicationSlot *slot);
static void SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel);
+static void RaiseSlotInvalidationError(ReplicationSlot *slot);
/*
* Report shared-memory space needed by ReplicationSlotsShmemInit.
@@ -535,9 +536,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +619,13 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
+ */
+ if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE)
+ RaiseSlotInvalidationError(s);
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +796,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +823,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1557,7 +1568,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
/*
* Helper for InvalidateObsoleteReplicationSlots
*
- * Acquires the given slot and mark it invalid, if necessary and possible.
+ * Acquires the given slot unless already owned, and mark it invalid
+ * if necessary and possible.
*
* Returns whether ReplicationSlotControlLock was released in the interim (and
* in that case we're not holding the lock at return, otherwise we are).
@@ -1600,10 +1612,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Check if the slot needs to be invalidated. If it needs to be
- * invalidated, and is not currently acquired, acquire it and mark it
- * as having been invalidated. We do this with the spinlock held to
- * avoid race conditions -- for example the restart_lsn could move
- * forward, or the slot could be dropped.
+ * invalidated, and is already ours, mark it as invalidated;
+ * otherwise, acquire it first and then mark it as invalidated. We do
+ * this with the spinlock held to avoid race conditions -- for example
+ * the restart_lsn could move forward, or the slot could be dropped.
*/
SpinLockAcquire(&s->mutex);
@@ -1676,11 +1688,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
active_pid = s->active_pid;
/*
- * If the slot can be acquired, do so and mark it invalidated
- * immediately. Otherwise we'll signal the owning process, below, and
- * retry.
+ * If the slot can be acquired, do so and mark it as invalidated. If
+ * the slot is already ours, mark it as invalidated. Otherwise, we'll
+ * signal the owning process below and retry.
*/
- if (active_pid == 0)
+ if (active_pid == 0 ||
+ (MyReplicationSlot == s && active_pid == MyProcPid))
{
MyReplicationSlot = s;
s->active_pid = MyProcPid;
@@ -1695,20 +1708,56 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/* Let caller know */
*invalidated = true;
- }
- SpinLockRelease(&s->mutex);
+ SpinLockRelease(&s->mutex);
- /*
- * The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
- */
- Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
+ /*
+ * Ensure that no logical slots are invalidated during binary
+ * upgrade mode. This is guaranteed because, in binary upgrade
+ * mode, the GUC max_slot_wal_keep_size is set to -1, and
+ * check_old_cluster_for_valid_slots() verifies that no slots are
+ * invalidated before the upgrade begins.
+ */
+ if (IsBinaryUpgrade)
+ Assert(!(*invalidated && SlotIsLogical(s)));
+
+ /*
+ * We hold the slot now and have already invalidated it; flush it
+ * to ensure that state persists.
+ *
+ * Don't want to hold ReplicationSlotControlLock across file
+ * system operations, so release it now but be sure to tell caller
+ * to restart from scratch.
+ */
+ LWLockRelease(ReplicationSlotControlLock);
+ released_lock = true;
- if (active_pid != 0)
+ /* Make sure the invalidated state persists across server restart */
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ ReplicationSlotRelease();
+
+ ReportSlotInvalidation(invalidation_cause, false, active_pid,
+ slotname, restart_lsn,
+ oldestLSN, snapshotConflictHorizon);
+
+ /* done with this slot for now */
+ break;
+ }
+ else /* Some other process owns the slot */
{
+ SpinLockRelease(&s->mutex);
+
+ /*
+ * Ensure that no logical slots are invalidated during binary
+ * upgrade mode. This is guaranteed because, in binary upgrade
+ * mode, the GUC max_slot_wal_keep_size is set to -1, and
+ * check_old_cluster_for_valid_slots() verifies that no slots are
+ * invalidated before the upgrade begins.
+ */
+ if (IsBinaryUpgrade)
+ Assert(!(*invalidated && SlotIsLogical(s)));
+
/*
* Prepare the sleep on the slot's condition variable before
* releasing the lock, to close a possible race condition if the
@@ -1761,31 +1810,6 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
continue;
}
- else
- {
- /*
- * We hold the slot now and have already invalidated it; flush it
- * to ensure that state persists.
- *
- * Don't want to hold ReplicationSlotControlLock across file
- * system operations, so release it now but be sure to tell caller
- * to restart from scratch.
- */
- LWLockRelease(ReplicationSlotControlLock);
- released_lock = true;
-
- /* Make sure the invalidated state persists across server restart */
- ReplicationSlotMarkDirty();
- ReplicationSlotSave();
- ReplicationSlotRelease();
-
- ReportSlotInvalidation(invalidation_cause, false, active_pid,
- slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
-
- /* done with this slot for now */
- break;
- }
}
Assert(released_lock == !LWLockHeldByMe(ReplicationSlotControlLock));
@@ -2208,6 +2232,7 @@ RestoreSlotFromDisk(const char *name)
bool restored = false;
int readBytes;
pg_crc32c checksum;
+ TimestampTz now;
/* no need to lock here, no concurrent access allowed yet */
@@ -2368,6 +2393,9 @@ RestoreSlotFromDisk(const char *name)
NameStr(cp.slotdata.name)),
errhint("Change \"wal_level\" to be \"replica\" or higher.")));
+ /* Use same inactive_since time for all slots */
+ now = GetCurrentTimestamp();
+
/* nothing can be active yet, don't lock anything */
for (i = 0; i < max_replication_slots; i++)
{
@@ -2400,7 +2428,7 @@ RestoreSlotFromDisk(const char *name)
* slot from the disk into memory. Whoever acquires the slot i.e.
* makes the slot active will reset it.
*/
- slot->inactive_since = GetCurrentTimestamp();
+ slot->inactive_since = now;
restored = true;
break;
@@ -2793,3 +2821,40 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * Raise an error based on the slot's invalidation cause.
+ */
+static void
+RaiseSlotInvalidationError(ReplicationSlot *slot)
+{
+ StringInfo err_detail = makeStringInfo();
+
+ Assert(slot->data.invalidated != RS_INVAL_NONE);
+
+ switch (slot->data.invalidated)
+ {
+ case RS_INVAL_WAL_REMOVED:
+ appendStringInfoString(err_detail, _("This slot has been invalidated because the required WAL has been removed."));
+ break;
+
+ case RS_INVAL_HORIZON:
+ appendStringInfoString(err_detail, _("This slot has been invalidated because the required rows have been removed."));
+ break;
+
+ case RS_INVAL_WAL_LEVEL:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."),
+ "wal_level");
+ break;
+
+ case RS_INVAL_NONE:
+ pg_unreachable();
+ }
+
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(slot->data.name)),
+ errdetail_internal("%s", err_detail->data));
+}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 977146789f..8be4b8c65b 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a0782b1bbf..3df5bd7b2a 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 9a10907d05..d44f8c262b 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, true);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index bf62b36ad0..47ebdaecb6 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -253,7 +253,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index ae2ad5c933..66ac7c40f1 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This slot has been invalidated because the required WAL has been removed",
$logstart))
{
$failed = 1;
--
2.34.1
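For illustration, with the error_if_invalid change above, an attempt to use an
invalidated slot now fails up front at acquire time. Roughly (the slot name is
made up and the DETAIL shown is just one of the possible causes):

    -- hypothetical invalidated slot "myslot"
    SELECT pg_replication_slot_advance('myslot', pg_current_wal_lsn());
    ERROR:  can no longer get changes from replication slot "myslot"
    DETAIL:  This slot has been invalidated because the required rows have been removed.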
v63-0002-Introduce-inactive_timeout-based-replication-slo.patch (application/octet-stream)
From 2b72525cc5092ad80639a5208668efde4a93721a Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Tue, 21 Jan 2025 13:27:51 +0530
Subject: [PATCH v63 2/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is tricky: the amount of WAL a database generates and
the storage allocated per instance vary greatly in production,
making it difficult to pin down a one-size-fits-all value.
It is often easier for users to set a timeout of, say, 1, 2, or
n days, after which all inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 39 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/regress.sgml | 10 +
doc/src/sgml/system-views.sgml | 10 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 171 ++++++++++++++--
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 14 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 ++
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
src/test/recovery/README | 5 +
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 190 ++++++++++++++++++
17 files changed, 486 insertions(+), 20 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a782f10998..342be29112 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4450,6 +4450,45 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero disables the idle timeout invalidation mechanism.
+ The default is one day. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 07a07dfe0b..2fb8bcd736 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2198,6 +2198,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index f4cef9e80f..f2387de6b5 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -336,6 +336,16 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>idle_replication_slot_timeout</literal></term>
+ <listitem>
+ <para>
+ Runs the test <filename>src/test/recovery/t/044_invalidate_inactive_slots.pl</filename>.
+ Not enabled by default because it is time-consuming.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
Tests for features that are not supported by the current build
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 8e2b0a7927..7d3a0aa709 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 692527b984..7df6892824 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1540,9 +1540,7 @@ update_synced_slots_inactive_since(void)
/* The slot must not be acquired by any process */
Assert(s->active_pid == 0);
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 0330dcf57d..b4659f65f0 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = HOURS_PER_DAY * MINS_PER_HOUR;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -714,16 +721,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1519,7 +1522,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1549,6 +1553,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1565,6 +1579,32 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has WAL reserved
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins > 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1593,6 +1633,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1600,6 +1641,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1610,6 +1652,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is already ours, mark it as invalidated;
@@ -1663,6 +1714,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1715,6 +1781,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* Ensure that no logical slots are invalidated during binary
* upgrade mode. This is guaranteed because, in binary upgrade
* mode, the GUC max_slot_wal_keep_size is set to -1, and
+ * idle_replication_slot_timeout is set to 0. Additionally,
* check_old_cluster_for_valid_slots() verifies that no slots are
* invalidated before the upgrade begins.
*/
@@ -1739,7 +1806,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1752,6 +1820,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* Ensure that no logical slots are invalidated during binary
* upgrade mode. This is guaranteed because, in binary upgrade
* mode, the GUC max_slot_wal_keep_size is set to -1, and
+ * idle_replication_slot_timeout is set to 0. Additionally,
* check_old_cluster_for_valid_slots() verifies that no slots are
* invalidated before the upgrade begins.
*/
@@ -1784,7 +1853,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1820,14 +1890,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Invalidate slots that require resources about to be removed.
*
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1880,7 +1952,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1938,6 +2011,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2426,7 +2538,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = now;
@@ -2848,6 +2962,12 @@ RaiseSlotInvalidationError(ReplicationSlot *slot)
"wal_level");
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(err_detail, _("This slot has been invalidated because it has remained idle longer than the configured \"%s\" duration."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -2858,3 +2978,22 @@ RaiseSlotInvalidationError(ReplicationSlot *slot)
NameStr(slot->data.name)),
errdetail_internal("%s", err_detail->data));
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 38cb9e970d..7cbba03bc1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,20 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ HOURS_PER_DAY * MINS_PER_HOUR, /* 1 day */
+ 0,
+ INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 079efa1baa..0ed9eb057e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -329,6 +329,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 1d # in minutes; 0 disables
# - Primary Server -
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 47ebdaecb6..f3994ab000 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..e1d05d6779 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -143,5 +143,8 @@ extern int date2isoyear(int year, int mon, int mday);
extern int date2isoyearday(int year, int mon, int mday);
extern bool TimestampTimestampTzRequiresRewrite(void);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
#endif /* TIMESTAMP_H */
diff --git a/src/test/recovery/README b/src/test/recovery/README
index 896df0ad05..2941776780 100644
--- a/src/test/recovery/README
+++ b/src/test/recovery/README
@@ -30,4 +30,9 @@ PG_TEST_EXTRA=wal_consistency_checking
to the "make" command. This is resource-intensive, so it's not done
by default.
+If you want to test idle_replication_slot_timeout, add
+PG_TEST_EXTRA=idle_replication_slot_timeout
+to the "make" command. This test takes over 2 minutes, so it's not done
+by default.
+
See src/test/perl/README for more info about running these tests.
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..efd2b050f0
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,190 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# The test takes over two minutes to complete. Run it only if
+# idle_replication_slot_timeout is specified in PG_TEST_EXTRA.
+if ( !$ENV{PG_TEST_EXTRA}
+ || $ENV{PG_TEST_EXTRA} !~ /\bidle_replication_slot_timeout\b/)
+{
+ plan skip_all =>
+ 'test idle_replication_slot_timeout not enabled in PG_TEST_EXTRA';
+}
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to idle timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Timeout (in minutes) for idle_replication_slot_timeout. The standby already
+# has the GUC set above; its next checkpoint must not invalidate synced slots.
+my $idle_timeout_1min = 1;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep($idle_timeout_1min * 60 + 10);
+
+# On standby, synced slots are not invalidated by the idle timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1min}min';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to idle timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'idle_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_mins) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $idle_timeout_mins);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_mins) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ sleep($idle_timeout_mins * 60 + 10);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
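To sketch how the new GUC is meant to be used (the 30-minute value below is
only an example; any positive duration works, and 0 disables the mechanism):

    -- idle_replication_slot_timeout is PGC_SIGHUP, so a reload is enough
    ALTER SYSTEM SET idle_replication_slot_timeout = '30min';
    SELECT pg_reload_conf();

    -- invalidation happens during non-shutdown checkpoints; forcing one avoids
    -- waiting for the next checkpoint_timeout interval
    CHECKPOINT;

    -- slots invalidated by this mechanism report invalidation_reason = 'idle_timeout'
    SELECT slot_name, inactive_since, invalidation_reason
    FROM pg_replication_slots
    WHERE invalidation_reason = 'idle_timeout';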
On Mon, Jan 27, 2025 at 11:00 AM Nisha Moond <nisha.moond412@gmail.com> wrote:
I discussed the above comments further with Peter off-list, and here
are the v63 patches with the following changes:
patch-001: The Assert and related comments have been updated for clarity.

The 0001 patch should be discussed in a separate thread, as it contains
general improvements that are useful even without the main patch we
are trying to achieve in this thread. I suggest we break it into three
patches: (a) ensure the same inactive_since time for all slots; (b)
raise an error for invalid slots during ReplicationSlotAcquire(), and
state in the commit message where such an ERROR would otherwise have
occurred without this patch; and (c) the changes in
InvalidatePossiblyObsoleteSlot(), which I suggest leaving for later as
they impact the core logic of invalidation.
*
@@ -812,7 +823,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
Why don't we want to give ERROR during Alter? I think it is okay to
not give ERROR for invalid slots during Drop as we are anyway removing
such slots.
--
With Regards,
Amit Kapila.
On Mon, Dec 30, 2024 at 11:05 AM Peter Smith <smithpb2250@gmail.com> wrote:
I think we are often too quick to throw out perfectly good tests.
Citing that some similar GUCs don't do testing as a reason to skip
them just seems to me like an example of "two wrongs don't make a
right". There is a third option.
Keep the tests. Because they take excessive time to run, that simply
means you should run them *conditionally* based on the PG_TEST_EXTRA
environment variable so they don't impact the normal BF execution. The
documentation [1] says this env var is for "resource intensive" tests
-- AFAIK this is exactly the scenario we find ourselves in, so is
exactly what this env var was meant for. Search other *.pl tests for PG_TEST_EXTRA to see some examples.
I don't think long-running tests should be added under PG_TEST_EXTRA,
as that will make it unusable after some point. Now, if multiple senior
members feel it is okay to add long-running tests under PG_TEST_EXTRA,
then I am open to considering it. We can keep this test as a separate
patch so that it is still exercised in CI or in manual testing before
commit.
--
With Regards,
Amit Kapila.
On Mon, Jan 27, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jan 27, 2025 at 11:00 AM Nisha Moond <nisha.moond412@gmail.com> wrote:
I discussed the above comments further with Peter off-list, and here
are the v63 patches with the following changes:
patch-001: The Assert and related comments have been updated for clarity.

The 0001 patch should be discussed in a separate thread, as it contains
general improvements that are useful even without the main patch we
are trying to achieve in this thread. I suggest we break it into three
patches: (a) ensure the same inactive_since time for all slots; (b)
raise an error for invalid slots during ReplicationSlotAcquire(), and
state in the commit message where such an ERROR would otherwise have
occurred without this patch; and (c) the changes in
InvalidatePossiblyObsoleteSlot(), which I suggest leaving for later as
they impact the core logic of invalidation.
I have started a new thread for these general improvements and have
separated the changes (a) and (b) into different patches.
You can find the new thread at [1].
*
@@ -812,7 +823,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
Why don't we want to give ERROR during Alter? I think it is okay to
not give ERROR for invalid slots during Drop as we are anyway removing
such slots.
Because ReplicationSlotAlter() already handles errors immediately
after acquiring the slot: it raises an error for invalidated slots and
a different, more specific error if the slot is a physical one.
So, in the case of ALTER, I feel it is okay to acquire the slot first
without raising an error and then handle errors in the pre-defined way.
Similar immediate error handling is not available at other places.
[1]: /messages/by-id/CABdArM6pBL5hPnSQ+5nEVMANcF4FCH7LQmgskXyiLY75TMnKpw@mail.gmail.com
--
Thanks,
Nisha
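To make the ordering Nisha describes concrete, here is a rough sketch of what
ReplicationSlotAlter() does right after the acquire call (function names follow
the diff hunks above; the exact checks and message texts in the real function
may differ):

    ReplicationSlotAcquire(name, false, false);    /* error_if_invalid = false */

    /* a physical slot gets a more specific, ALTER-related error */
    if (SlotIsPhysical(MyReplicationSlot))
        ereport(ERROR,
                errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                errmsg("cannot use %s with a physical replication slot",
                       "ALTER_REPLICATION_SLOT"));

    /* an invalidated slot is rejected with an ALTER-specific error as well */
    if (MyReplicationSlot->data.invalidated != RS_INVAL_NONE)
        ereport(ERROR,
                errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                errmsg("cannot alter invalid replication slot \"%s\"",
                       NameStr(MyReplicationSlot->data.name)));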
On Tue, Jan 28, 2025 at 3:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 30, 2024 at 11:05 AM Peter Smith <smithpb2250@gmail.com> wrote:
I think we are often too quick to throw out perfectly good tests.
Citing that some similar GUCs don't do testing as a reason to skip
them just seems to me like an example of "two wrongs don't make a
right". There is a third option.
Keep the tests. Because they take excessive time to run, that simply
means you should run them *conditionally* based on the PG_TEST_EXTRA
environment variable so they don't impact the normal BF execution. The
documentation [1] says this env var is for "resource intensive" tests
-- AFAIK this is exactly the scenario we find ourselves in, so is
exactly what this env var was meant for. Search other *.pl tests for PG_TEST_EXTRA to see some examples.
I don't think long-running tests should be added under PG_TEST_EXTRA,
as that will make it unusable after some point. Now, if multiple senior
members feel it is okay to add long-running tests under PG_TEST_EXTRA,
then I am open to considering it. We can keep this test as a separate
patch so that it is still exercised in CI or in manual testing before
commit.
Please find the attached v64 patches. The changes in this version
w.r.t. the older v63 patches are as follows:
- The changes from the v63-0001 patch have been moved to a separate thread [1].
- The v63-0002 patch has been split into two parts in v64:
1) 001 patch: Implements the main feature - inactive timeout-based
slot invalidation.
2) 002 patch: Separates the TAP test "044_invalidate_inactive_slots"
as suggested above.
[1]: /messages/by-id/CABdArM6pBL5hPnSQ+5nEVMANcF4FCH7LQmgskXyiLY75TMnKpw@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v64-0001-Introduce-inactive_timeout-based-replication-slo.patch (application/octet-stream)
From 1aa86d7a05fac28c87baa612e3dfac38e75bc91c Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Tue, 28 Jan 2025 16:23:53 +0530
Subject: [PATCH v64 1/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is tricky: the amount of WAL a database generates and
the storage allocated per instance vary greatly in production,
making it difficult to pin down a one-size-fits-all value.
It is often easier for users to set a timeout of, say, 1, 2, or
n days, after which all inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 39 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 10 +-
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 8 +-
src/backend/replication/slot.c | 187 ++++++++++++++++--
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +-
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 14 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 25 ++-
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
17 files changed, 302 insertions(+), 31 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a782f10998..342be29112 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4450,6 +4450,45 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero disables the idle timeout invalidation mechanism.
+ The default is one day. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
<term><varname>track_commit_timestamp</varname> (<type>boolean</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 07a07dfe0b..2fb8bcd736 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2198,6 +2198,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 8e2b0a7927..7d3a0aa709 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index 0148ec3678..ca53caac2f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f6945af1d4..987857b949 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
@@ -1541,9 +1541,7 @@ update_synced_slots_inactive_since(void)
if (now == 0)
now = GetCurrentTimestamp();
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index b30e0473e1..ee5aec817a 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = HOURS_PER_DAY * MINS_PER_HOUR;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -535,9 +542,12 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -615,6 +625,21 @@ retry:
/* We made this slot active, so it's ours now. */
MyReplicationSlot = s;
+ /*
+ * An error is raised if error_if_invalid is true and the slot has been
+ * previously invalidated due to inactive timeout.
+ */
+ if (error_if_invalid && s->data.invalidated == RS_INVAL_IDLE_TIMEOUT)
+ {
+ Assert(s->inactive_since > 0);
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it has remained idle longer than the configured \"%s\" duration.",
+ "idle_replication_slot_timeout")));
+ }
+
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -703,16 +728,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -785,7 +806,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +833,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, false);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -1508,7 +1529,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1538,6 +1560,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1554,6 +1586,32 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has WAL reserved
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins > 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1581,6 +1639,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1588,6 +1647,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1598,6 +1658,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1651,6 +1720,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1735,7 +1819,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1781,7 +1866,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1796,14 +1882,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Invalidate slots that require resources about to be removed.
*
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1856,7 +1944,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1914,6 +2003,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2398,7 +2526,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = GetCurrentTimestamp();
@@ -2793,3 +2923,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 977146789f..8be4b8c65b 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bac504b554..446d10c1a7 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 9a10907d05..d44f8c262b 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, true);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 38cb9e970d..7cbba03bc1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,20 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ HOURS_PER_DAY * MINS_PER_HOUR, /* 1 day */
+ 0,
+ INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 079efa1baa..0ed9eb057e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -329,6 +329,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 1d # in minutes; 0 disables
# - Primary Server -
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index bf62b36ad0..f3994ab000 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -253,7 +275,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..e1d05d6779 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -143,5 +143,8 @@ extern int date2isoyear(int year, int mon, int mday);
extern int date2isoyearday(int year, int mon, int mday);
extern bool TimestampTimestampTzRequiresRewrite(void);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
#endif /* TIMESTAMP_H */
--
2.34.1
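As a rough SQL-level sketch of how the new GUC is meant to be used once
the above patch is applied (the slot name 'test_slot' and the two-minute
timeout are illustrative values only, and wal_level=logical is assumed
so that a logical slot can be created):

ALTER SYSTEM SET idle_replication_slot_timeout = '2min';
SELECT pg_reload_conf();

SELECT pg_create_logical_replication_slot('test_slot', 'test_decoding');

-- After the slot has been idle for longer than the timeout, a
-- non-shutdown checkpoint performs the invalidation.
CHECKPOINT;

SELECT slot_name, inactive_since, invalidation_reason
FROM pg_replication_slots
WHERE slot_name = 'test_slot';
-- expected: invalidation_reason = 'idle_timeout'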
v64-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patchapplication/octet-stream; name=v64-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patchDownload
From 1d15d46a9ea461e9f4da21dd8a4f907b73df1f07 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Tue, 28 Jan 2025 16:26:20 +0530
Subject: [PATCH v64 2/2] Add TAP test for slot invalidation based on inactive
timeout.
Since the minimum value for GUC 'idle_replication_slot_timeout' is one minute,
the test takes 2-3 minutes to complete and is disabled by default.
Use PG_TEST_EXTRA=idle_replication_slot_timeout with "make" to run the test.
---
doc/src/sgml/regress.sgml | 10 +
src/test/recovery/README | 5 +
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 190 ++++++++++++++++++
4 files changed, 206 insertions(+)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 7c474559bd..193622dbf5 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -347,6 +347,16 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>idle_replication_slot_timeout</literal></term>
+ <listitem>
+ <para>
+ Runs the test <filename>src/test/recovery/t/044_invalidate_inactive_slots.pl</filename>.
+ Not enabled by default because it is time consuming.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
Tests for features that are not supported by the current build
diff --git a/src/test/recovery/README b/src/test/recovery/README
index 896df0ad05..2941776780 100644
--- a/src/test/recovery/README
+++ b/src/test/recovery/README
@@ -30,4 +30,9 @@ PG_TEST_EXTRA=wal_consistency_checking
to the "make" command. This is resource-intensive, so it's not done
by default.
+If you want to test idle_replication_slot_timeout, add
+PG_TEST_EXTRA=idle_replication_slot_timeout
+to the "make" command. This test takes over 2 minutes, so it's not done
+by default.
+
See src/test/perl/README for more info about running these tests.
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..efd2b050f0
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,190 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# The test takes over two minutes to complete. Run it only if
+# idle_replication_slot_timeout is specified in PG_TEST_EXTRA.
+if ( !$ENV{PG_TEST_EXTRA}
+ || $ENV{PG_TEST_EXTRA} !~ /\bidle_replication_slot_timeout\b/)
+{
+ plan skip_all =>
+ 'test idle_replication_slot_timeout not enabled in PG_TEST_EXTRA';
+}
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to idle timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+my $idle_timeout_1min = 1;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep($idle_timeout_1min * 60 + 10);
+
+# On standby, synced slots are not invalidated by the idle timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '${idle_timeout_1min}min';
+]);
+$primary->reload;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to idle timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'idle_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $idle_timeout_1min);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_mins) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $idle_timeout_mins);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer get changes from replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $idle_timeout_mins) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ sleep($idle_timeout_mins * 60 + 10);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
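At the SQL level, the synced-slot behaviour this test automates looks
roughly like the following on the standby. This is only a sketch: the
slot name 'sync_slot1' is taken from the test, and the standby is
assumed to be set up for slot synchronization as in the test
(hot_standby_feedback, primary_slot_name, dbname in primary_conninfo):

-- pull the primary's failover slots onto the standby
SELECT pg_sync_replication_slots();

-- a synced slot is not invalidated locally by the idle timeout,
-- even after a checkpoint on the standby
CHECKPOINT;
SELECT slot_name, synced, invalidation_reason
FROM pg_replication_slots
WHERE slot_name = 'sync_slot1';
-- invalidation_reason is still NULL here

-- once the primary has invalidated the slot due to idle_timeout,
-- a re-sync simply copies that state over
SELECT pg_sync_replication_slots();
SELECT slot_name, invalidation_reason
FROM pg_replication_slots
WHERE slot_name = 'sync_slot1';
-- now reports 'idle_timeout'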
On Tue, 28 Jan 2025 at 17:28, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Tue, Jan 28, 2025 at 3:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 30, 2024 at 11:05 AM Peter Smith <smithpb2250@gmail.com> wrote:
I think we are often too quick to throw out perfectly good tests.
Citing that some similar GUCs don't do testing as a reason to skip
them just seems to me like an example of "two wrongs don't make a
right".There is a third option.
Keep the tests. Because they take excessive time to run, that simply
means you should run them *conditionally* based on the PG_TEST_EXTRA
environment variable so they don't impact the normal BF execution. The
documentation [1] says this env var is for "resource intensive" tests
-- AFAIK this is exactly the scenario we find ourselves in, so is
exactly what this env var was meant for.
Search other *.pl tests for PG_TEST_EXTRA to see some examples.
I don't see the long-running tests to be added under PG_TEST_EXTRA as
that will make it unusable after some point. Now, if multiple senior
members feel it is okay to add long-running tests under PG_TEST_EXTRA
then I am open to considering it. We can keep this test as a separate
patch so that the patch is being tested in CI or in manual tests
before commit.
Please find the attached v64 patches. The changes in this version
w.r.t. older patch v63 are as -
- The changes from the v63-0001 patch have been moved to a separate thread [1].
- The v63-0002 patch has been split into two parts in v64:
1) 001 patch: Implements the main feature - inactive timeout-based
slot invalidation.
2) 002 patch: Separates the TAP test "044_invalidate_inactive_slots"
as suggested above.
[1] /messages/by-id/CABdArM6pBL5hPnSQ+5nEVMANcF4FCH7LQmgskXyiLY75TMnKpw@mail.gmail.com
Few comments:
1) We can mention that this is also not applicable for slots that do not reserve WAL:
+ <para>
+ Note that the idle timeout invalidation mechanism is not
+ applicable for slots on the standby server that are being synced
+ from the primary server (i.e., standby slots having
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
2) Similarly, the commit message can also mention that slots that do
not reserve WAL are not considered:
Note that the idle timeout invalidation mechanism is not
applicable for slots on the standby server that are being synced
from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because
they don't perform logical decoding to produce changes.
3) Since idle_replication_slot_timeout is somewhat similar to
max_slot_wal_keep_size, we can move idle_replication_slot_timeout
after max_slot_wal_keep_size instead of keeping it after
wal_sender_timeout.
+ <varlistentry id="guc-idle-replication-slot-timeout"
xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname>
(<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname>
configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero disables the idle timeout
invalidation mechanism.
+ The default is one day. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server
command line.
+ </para>
4) We can try to keep lines to less than 80 chars wherever possible:
a) Like in this case, "mechanism" can be moved to the next line:
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero disables the idle timeout
invalidation mechanism.
b) Similarly here too, "slot's" can be moved to the next line:
+ inactive slots. The duration of slot inactivity is calculated
using the slot's
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
5) You can use the new ereport style to omit the parentheses around errcode:
+ ereport(ERROR,
+
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer get changes
from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been
invalidated because it has remained idle longer than the configured
\"%s\" duration.",
+
"idle_replication_slot_timeout")));
Regards,
Vignesh
On Tue, Jan 28, 2025 at 10:58 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
Please find the attached v64 patches. The changes in this version
w.r.t. older patch v63 are as -
- The changes from the v63-0001 patch have been moved to a separate thread [1].
- The v63-0002 patch has been split into two parts in v64:
1) 001 patch: Implements the main feature - inactive timeout-based
slot invalidation.
2) 002 patch: Separates the TAP test "044_invalidate_inactive_slots"
as suggested above.
Hi Nisha.
Some review comments for patch v64-0001.
======
1. General
Too much of this patch v64-0001 is identical/duplicated code with the
recent "spin-off" patch v1-0002 [1]. E.g., most of v1-0001 is now also
embedded in v64-0001.
This is making for an unnecessarily tricky 2 x review of all the same
code, and it will also cause rebase hassles later.
Even if you wanted the 'error_in_invalid' stuff to be discussed and
pushed separately, I think it will be much easier to keep a "COPY" of
that v1-0002 patch here as a pre-requisite for v64-0001 so then all of
the current code duplications can be removed.
======
src/backend/replication/slot.c
ReplicationSlotAcquire:
2.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid.
*/
and
+ /*
+ * An error is raised if error_if_invalid is true and the slot has been
+ * previously invalidated due to inactive timeout.
+ */
+ if (error_if_invalid && s->data.invalidated == RS_INVAL_IDLE_TIMEOUT)
+ {
Although those comments are correct for v1-0001 [1], they are
misleading as hacked into v64-0001, because here you are only checking
the invalidation cause RS_INVAL_IDLE_TIMEOUT and none of the other
possible causes.
~~~
ReportSlotInvalidation:
3.
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the
configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
I have the same question I already asked in my review of patch v1-0002
[1]: why mark this message with _(), which is for translations, when it
is reported via errdetail_internal, which is for strings *not*
requiring translation?
~~~
InvalidatePossiblyObsoleteSlot:
4.
/*
* The logical replication slots shouldn't be invalidated as GUC
* max_slot_wal_keep_size is set to -1 during the binary upgrade. See
* check_old_cluster_for_valid_slots() where we ensure that no
* invalidated before the upgrade.
*/
Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
Unless I am mistaken, all of the v63 cleanups of the above binary
upgrade code assert stuff have vanished somewhere between v63 and v64.
I cannot find them in the spin-off thread. All accidentally lost? (in
2 places)
Not only that but the accompanying comment modification (to mention
"and idle_replication_slot_timeout is set to 0") is also MIA last seen
in v63 (??)
======
[1]: /messages/by-id/CABdArM6pBL5hPnSQ+5nEVMANcF4FCH7LQmgskXyiLY75TMnKpw@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, 28 Jan 2025 at 17:28, Nisha Moond <nisha.moond412@gmail.com> wrote:
Please find the attached v64 patches. The changes in this version
w.r.t. older patch v63 are as -
- The changes from the v63-0001 patch have been moved to a separate thread [1].
- The v63-0002 patch has been split into two parts in v64:
1) 001 patch: Implements the main feature - inactive timeout-based
slot invalidation.
2) 002 patch: Separates the TAP test "044_invalidate_inactive_slots"
as suggested above.
Currently the test takes around 220 seconds for me. We could do the
following changes to bring it down to around 70 to 80 seconds:
1) Set idle_replication_slot_timeout to 70 seconds
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+});
+$primary->start;
2) I felt just 1 second more is enough unless you anticipate a random
failure; the test passes for me:
+# Give enough time for inactive_since to exceed the timeout
+sleep($idle_timeout_1min * 60 + 10);
3) Since we will be setting it to 70 seconds above, this configuration
change and reload are not required:
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO
'${idle_timeout_1min}min';
+]);
+$primary->reload;
4) Here you can add a comment that 60s have already elapsed and the
slot will get invalidated in another 10 seconds, and pass the timeout
as 10s to wait_for_slot_invalidation:
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart,
+ $idle_timeout_1min);
5) We could set up another streaming replication cluster, say primary2
and standby2 nodes, and stop standby2 immediately along with the first
streaming replication cluster itself:
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart,
+ $idle_timeout_1min);
6) We can rename primary to primary1, or standby1 to standby, to keep
the naming consistent:
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name :=
'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
Regards,
Vignesh
On Wed, Jan 29, 2025 at 12:44 PM vignesh C <vignesh21@gmail.com> wrote:
On Tue, 28 Jan 2025 at 17:28, Nisha Moond <nisha.moond412@gmail.com> wrote:
Please find the attached v64 patches. The changes in this version
w.r.t. older patch v63 are as -
- The changes from the v63-0001 patch have been moved to a separate thread [1].
- The v63-0002 patch has been split into two parts in v64:
1) 001 patch: Implements the main feature - inactive timeout-based
slot invalidation.
2) 002 patch: Separates the TAP test "044_invalidate_inactive_slots"
as suggested above.
Currently the test takes around 220 seconds for me. We could do the
following changes to bring it down to around 70 to 80 seconds:
Even then it is too long for a single test to be part of committed
code. So, we can temporarily reduce its time but fixing comments on
this is not a good use of time. We need to write this test in some
other way if we want to see it committed.
--
With Regards,
Amit Kapila.
On Tue, Jan 28, 2025 at 5:28 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
On Tue, Jan 28, 2025 at 3:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Dec 30, 2024 at 11:05 AM Peter Smith <smithpb2250@gmail.com> wrote:
I think we are often too quick to throw out perfectly good tests.
Citing that some similar GUCs don't do testing as a reason to skip
them just seems to me like an example of "two wrongs don't make a
right".There is a third option.
Keep the tests. Because they take excessive time to run, that simply
means you should run them *conditionally* based on the PG_TEST_EXTRA
environment variable so they don't impact the normal BF execution. The
documentation [1] says this env var is for "resource intensive" tests
-- AFAIK this is exactly the scenario we find ourselves in, so is
exactly what this env var was meant for.
Search other *.pl tests for PG_TEST_EXTRA to see some examples.
I don't see the long-running tests to be added under PG_TEST_EXTRA as
that will make it unusable after some point. Now, if multiple senior
members feel it is okay to add long-running tests under PG_TEST_EXTRA
then I am open to considering it. We can keep this test as a separate
patch so that the patch is being tested in CI or in manual tests
before commit.
Please find the attached v64 patches. The changes in this version
w.r.t. older patch v63 are as -
- The changes from the v63-0001 patch have been moved to a separate thread [1].
- The v63-0002 patch has been split into two parts in v64:
1) 001 patch: Implements the main feature - inactive timeout-based
slot invalidation.
2) 002 patch: Separates the TAP test "044_invalidate_inactive_slots"
as suggested above.
[1] /messages/by-id/CABdArM6pBL5hPnSQ+5nEVMANcF4FCH7LQmgskXyiLY75TMnKpw@mail.gmail.com
Please find the v65 patch set attached with the following changes:
- patch-0001 is a copy of the v4-001 patch from [1], used as a base
patch.
- patch-0002 is the main patch implementing the feature; it also
addresses the comments from [2] and [3].
- patch-0003 adds an alternative approach for the TAP test using
injection points to force idle_timeout slot invalidation without
waiting for a minute. This test takes 2-3 seconds to complete.
- patch-0004 maintains the previous test (v64-0002), which is under
PG_TEST_EXTRA. It also addresses Vignesh's comments [4] on the test,
and the test now takes 70-80 seconds to complete.
Note: Patches 0003 and 0004 contain the same TAP test but use
different verification methods. We need to decide which one to keep.
[1]: /messages/by-id/CALDaNm1ZUHnKm+PSjjqRrMxcLagrUTS6SADnEsQBfW8rMZFrDA@mail.gmail.com
[2]: /messages/by-id/CAHut+PtyUQGee6pHkNN3-ghYhWnY5p-3yWumK7zKupu0S1oVQQ@mail.gmail.com
[3]: /messages/by-id/CALDaNm1J_mdqCYjQZgfQMVhJrxndPem5ruxpG_67t4C_2My9WQ@mail.gmail.com
[4]: /messages/by-id/CALDaNm2dAJB=fJ2X7EMb7meNTjMyL-+-xA93JL_jPkGF4=RUYw@mail.gmail.com
Thank you, Kuroda-san for providing the TAP test using injection
points (patch-0003).
--
Thanks,
Nisha
Attachments:
v65-0001-Raise-Error-for-Invalid-Slots-in-ReplicationSlot.patchapplication/octet-stream; name=v65-0001-Raise-Error-for-Invalid-Slots-in-ReplicationSlot.patchDownload
From aa47b7d71a12d18ced352e3771055753b993a122 Mon Sep 17 00:00:00 2001
From: Vignesh <vignesh21@gmail.com>
Date: Thu, 30 Jan 2025 18:15:15 +0530
Subject: [PATCH v65 1/4] Raise Error for Invalid Slots in
ReplicationSlotAcquire()
Once a replication slot is invalidated, it cannot be reused. However, a
process could still acquire an invalid slot and fail later.
For example, if a process acquires a logical slot that was invalidated due
to wal_removed, it will eventually fail in CreateDecodingContext() when
attempting to access the removed WAL. Similarly, for physical replication
slots, even if the slot is invalidated and invalidation_reason is set to
wal_removed, the walsender does not currently check for invalidation when
starting physical replication. Instead, replication starts, and an error
is only reported later by the standby when a missing WAL is detected.
This patch improves error handling by detecting invalid slots earlier.
If error_if_invalid=true is specified when calling ReplicationSlotAcquire(),
an error will be raised immediately instead of letting the process acquire the
slot and fail later due to the invalidated slot.
---
src/backend/replication/logical/logical.c | 20 ------------
.../replication/logical/logicalfuncs.c | 2 +-
src/backend/replication/logical/slotsync.c | 4 +--
src/backend/replication/slot.c | 32 +++++++++++--------
src/backend/replication/slotfuncs.c | 2 +-
src/backend/replication/walsender.c | 4 +--
src/backend/utils/adt/pg_upgrade_support.c | 2 +-
src/include/replication/slot.h | 3 +-
src/test/recovery/t/019_replslot_limit.pl | 2 +-
.../t/035_standby_logical_decoding.pl | 15 ++++-----
10 files changed, 35 insertions(+), 51 deletions(-)
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index 0b25efafe2..2c8cf516bd 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -542,26 +542,6 @@ CreateDecodingContext(XLogRecPtr start_lsn,
errdetail("This replication slot is being synchronized from the primary server."),
errhint("Specify another replication slot."));
- /*
- * Check if slot has been invalidated due to max_slot_wal_keep_size. Avoid
- * "cannot get changes" wording in this errmsg because that'd be
- * confusingly ambiguous about no changes being available when called from
- * pg_logical_slot_get_changes_guts().
- */
- if (MyReplicationSlot->data.invalidated == RS_INVAL_WAL_REMOVED)
- ereport(ERROR,
- (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("can no longer get changes from replication slot \"%s\"",
- NameStr(MyReplicationSlot->data.name)),
- errdetail("This slot has been invalidated because it exceeded the maximum reserved size.")));
-
- if (MyReplicationSlot->data.invalidated != RS_INVAL_NONE)
- ereport(ERROR,
- (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("can no longer get changes from replication slot \"%s\"",
- NameStr(MyReplicationSlot->data.name)),
- errdetail("This slot has been invalidated because it was conflicting with recovery.")));
-
Assert(MyReplicationSlot->data.invalidated == RS_INVAL_NONE);
Assert(MyReplicationSlot->data.restart_lsn != InvalidXLogRecPtr);
diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index 0148ec3678..ca53caac2f 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -197,7 +197,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
else
end_of_wal = GetXLogReplayRecPtr(NULL);
- ReplicationSlotAcquire(NameStr(*name), true);
+ ReplicationSlotAcquire(NameStr(*name), true, true);
PG_TRY();
{
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f6945af1d4..be6f87f00b 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -446,7 +446,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
if (synced_slot)
{
- ReplicationSlotAcquire(NameStr(local_slot->data.name), true);
+ ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
ReplicationSlotDropAcquired();
}
@@ -665,7 +665,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
* pre-check to ensure that at least one of the slot properties is
* changed before acquiring the slot.
*/
- ReplicationSlotAcquire(remote_slot->name, true);
+ ReplicationSlotAcquire(remote_slot->name, true, false);
Assert(slot == MyReplicationSlot);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index b30e0473e1..74f7d565f0 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -535,9 +535,13 @@ ReplicationSlotName(int index, Name name)
*
* An error is raised if nowait is true and the slot is currently in use. If
* nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * An error is raised if error_if_invalid is true and the slot is found to
+ * be invalid. It should always be set to true, except when we are temporarily
+ * acquiring the slot and do not intend to change it.
*/
void
-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait, bool error_if_invalid)
{
ReplicationSlot *s;
int active_pid;
@@ -585,6 +589,18 @@ retry:
active_pid = MyProcPid;
LWLockRelease(ReplicationSlotControlLock);
+ /* We made this slot active, so it's ours now. */
+ MyReplicationSlot = s;
+
+ /* Invalid slots can't be modified or used before accessing the WAL. */
+ if (error_if_invalid && s->data.invalidated != RS_INVAL_NONE)
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("can no longer access replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This replication slot has been invalidated due to \"%s\".",
+ SlotInvalidationCauses[s->data.invalidated]));
+
/*
* If we found the slot but it's already active in another process, we
* wait until the owning process signals us that it's been released, or
@@ -612,9 +628,6 @@ retry:
/* Let everybody know we've modified this slot */
ConditionVariableBroadcast(&s->active_cv);
- /* We made this slot active, so it's ours now. */
- MyReplicationSlot = s;
-
/*
* The call to pgstat_acquire_replslot() protects against stats for a
* different slot, from before a restart or such, being present during
@@ -785,7 +798,7 @@ ReplicationSlotDrop(const char *name, bool nowait)
{
Assert(MyReplicationSlot == NULL);
- ReplicationSlotAcquire(name, nowait);
+ ReplicationSlotAcquire(name, nowait, false);
/*
* Do not allow users to drop the slots which are currently being synced
@@ -812,7 +825,7 @@ ReplicationSlotAlter(const char *name, const bool *failover,
Assert(MyReplicationSlot == NULL);
Assert(failover || two_phase);
- ReplicationSlotAcquire(name, false);
+ ReplicationSlotAcquire(name, false, true);
if (SlotIsPhysical(MyReplicationSlot))
ereport(ERROR,
@@ -820,13 +833,6 @@ ReplicationSlotAlter(const char *name, const bool *failover,
errmsg("cannot use %s with a physical replication slot",
"ALTER_REPLICATION_SLOT"));
- if (MyReplicationSlot->data.invalidated != RS_INVAL_NONE)
- ereport(ERROR,
- errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("cannot alter invalid replication slot \"%s\"", name),
- errdetail("This replication slot has been invalidated due to \"%s\".",
- SlotInvalidationCauses[MyReplicationSlot->data.invalidated]));
-
if (RecoveryInProgress())
{
/*
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 977146789f..8be4b8c65b 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -536,7 +536,7 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS)
moveto = Min(moveto, GetXLogReplayRecPtr(NULL));
/* Acquire the slot so we "own" it */
- ReplicationSlotAcquire(NameStr(*slotname), true);
+ ReplicationSlotAcquire(NameStr(*slotname), true, true);
/* A slot whose restart_lsn has never been reserved cannot be advanced */
if (XLogRecPtrIsInvalid(MyReplicationSlot->data.restart_lsn))
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index bac504b554..446d10c1a7 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -816,7 +816,7 @@ StartReplication(StartReplicationCmd *cmd)
if (cmd->slotname)
{
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
if (SlotIsLogical(MyReplicationSlot))
ereport(ERROR,
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -1434,7 +1434,7 @@ StartLogicalReplication(StartReplicationCmd *cmd)
Assert(!MyReplicationSlot);
- ReplicationSlotAcquire(cmd->slotname, true);
+ ReplicationSlotAcquire(cmd->slotname, true, true);
/*
* Force a disconnect, so that the decoding code doesn't need to care
diff --git a/src/backend/utils/adt/pg_upgrade_support.c b/src/backend/utils/adt/pg_upgrade_support.c
index 9a10907d05..d44f8c262b 100644
--- a/src/backend/utils/adt/pg_upgrade_support.c
+++ b/src/backend/utils/adt/pg_upgrade_support.c
@@ -298,7 +298,7 @@ binary_upgrade_logical_slot_has_caught_up(PG_FUNCTION_ARGS)
slot_name = PG_GETARG_NAME(0);
/* Acquire the given slot */
- ReplicationSlotAcquire(NameStr(*slot_name), true);
+ ReplicationSlotAcquire(NameStr(*slot_name), true, true);
Assert(SlotIsLogical(MyReplicationSlot));
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index bf62b36ad0..47ebdaecb6 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -253,7 +253,8 @@ extern void ReplicationSlotDropAcquired(void);
extern void ReplicationSlotAlter(const char *name, const bool *failover,
const bool *two_phase);
-extern void ReplicationSlotAcquire(const char *name, bool nowait);
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+ bool error_if_invalid);
extern void ReplicationSlotRelease(void);
extern void ReplicationSlotCleanup(bool synced_only);
extern void ReplicationSlotSave(void);
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index ae2ad5c933..6468784b83 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -234,7 +234,7 @@ my $failed = 0;
for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
{
if ($node_standby->log_contains(
- "requested WAL segment [0-9A-F]+ has already been removed",
+ "This replication slot has been invalidated due to \"wal_removed\".",
$logstart))
{
$failed = 1;
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 7e794c5bea..505e85d1eb 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -533,7 +533,7 @@ check_slots_conflict_reason('vacuum_full_', 'rows_removed');
qq[ALTER_REPLICATION_SLOT vacuum_full_inactiveslot (failover);],
replication => 'database');
ok( $stderr =~
- /ERROR: cannot alter invalid replication slot "vacuum_full_inactiveslot"/
+ /ERROR: can no longer access replication slot "vacuum_full_inactiveslot"/
&& $stderr =~
/DETAIL: This replication slot has been invalidated due to "rows_removed"./,
"invalidated slot cannot be altered");
@@ -551,8 +551,7 @@ $handle =
# We are not able to read from the slot as it has been invalidated
check_pg_recvlogical_stderr($handle,
- "can no longer get changes from replication slot \"vacuum_full_activeslot\""
-);
+ "can no longer access replication slot \"vacuum_full_activeslot\"");
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
@@ -632,8 +631,7 @@ $handle =
# We are not able to read from the slot as it has been invalidated
check_pg_recvlogical_stderr($handle,
- "can no longer get changes from replication slot \"row_removal_activeslot\""
-);
+ "can no longer access replication slot \"row_removal_activeslot\"");
##################################################
# Recovery conflict: Same as Scenario 2 but on a shared catalog table
@@ -668,7 +666,7 @@ $handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
# We are not able to read from the slot as it has been invalidated
check_pg_recvlogical_stderr($handle,
- "can no longer get changes from replication slot \"shared_row_removal_activeslot\""
+ "can no longer access replication slot \"shared_row_removal_activeslot\""
);
##################################################
@@ -759,7 +757,7 @@ $handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
# We are not able to read from the slot as it has been invalidated
check_pg_recvlogical_stderr($handle,
- "can no longer get changes from replication slot \"pruning_activeslot\"");
+ "can no longer access replication slot \"pruning_activeslot\"");
# Turn hot_standby_feedback back on
change_hot_standby_feedback_and_wait_for_xmins(1, 1);
@@ -818,8 +816,7 @@ $handle =
make_slot_active($node_standby, 'wal_level_', 0, \$stdout, \$stderr);
# as the slot has been invalidated we should not be able to read
check_pg_recvlogical_stderr($handle,
- "can no longer get changes from replication slot \"wal_level_activeslot\""
-);
+ "can no longer access replication slot \"wal_level_activeslot\"");
##################################################
# DROP DATABASE should drop its slots, including active slots.
--
2.34.1
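As a small sketch of the behavioural change in this patch: with
error_if_invalid=true passed by the SQL-callable paths, acquiring an
already-invalidated slot now fails up front instead of later. The slot
name 'test_slot' is illustrative, and the slot is assumed to have been
invalidated earlier due to wal_removed:

SELECT pg_replication_slot_advance('test_slot', pg_current_wal_lsn());
-- ERROR:  can no longer access replication slot "test_slot"
-- DETAIL: This replication slot has been invalidated due to "wal_removed".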
v65-0002-Introduce-inactive_timeout-based-replication-slo.patchapplication/octet-stream; name=v65-0002-Introduce-inactive_timeout-based-replication-slo.patchDownload
From a07b916edbb25418acdadeeffd9cbd008fd7a315 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Thu, 30 Jan 2025 12:51:11 +0530
Subject: [PATCH v65 2/4] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via the
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky, because the amount of WAL a database
generates and the storage allocated per instance vary greatly in
production, making it difficult to pin down a one-size-fits-all
value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not applicable
for slots that do not reserve WAL or for slots on the standby server
that are being synced from the primary server (i.e., standby slots
having 'synced' field 'true'). Synced slots are always considered to be
inactive because they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 40 +++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 10 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 170 ++++++++++++++++--
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 14 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 +++
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
13 files changed, 277 insertions(+), 23 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a782f10998..a065fbbaab 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4423,6 +4423,46 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero disables the idle timeout invalidation
+ mechanism. The default is one day. This parameter can only be set in
+ the <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the
+ slot's <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not applicable
+ for slots that do not reserve WAL or for slots on the standby server
+ that are being synced from the primary server (i.e., standby slots
+ having <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
<term><varname>wal_sender_timeout</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 613abcd28b..3d18e507bb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2390,6 +2390,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 8e2b0a7927..7d3a0aa709 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index be6f87f00b..987857b949 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1541,9 +1541,7 @@ update_synced_slots_inactive_since(void)
if (now == 0)
now = GetCurrentTimestamp();
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 74f7d565f0..2ebf366785 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = HOURS_PER_DAY * MINS_PER_HOUR;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -716,16 +723,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1514,7 +1517,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
bool hint = false;
@@ -1544,6 +1548,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1560,6 +1574,32 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
pfree(err_detail.data);
}
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has WAL reserved
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins > 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
/*
* Helper for InvalidateObsoleteReplicationSlots
*
@@ -1587,6 +1627,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1594,6 +1635,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1604,6 +1646,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1657,6 +1708,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1707,9 +1773,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
+ * max_slot_wal_keep_size is set to -1 and
+ * idle_replication_slot_timeout is set to 0 during the binary
+ * upgrade. See check_old_cluster_for_valid_slots() where we ensure
+ * that no slots are invalidated before the upgrade.
*/
Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
@@ -1741,7 +1808,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1787,7 +1855,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1802,14 +1871,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Invalidate slots that require resources about to be removed.
*
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1862,7 +1933,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1920,6 +1992,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2404,7 +2515,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = GetCurrentTimestamp();
@@ -2799,3 +2912,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 38cb9e970d..7cbba03bc1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,20 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ HOURS_PER_DAY * MINS_PER_HOUR, /* 1 day */
+ 0,
+ INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 079efa1baa..0ed9eb057e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -329,6 +329,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 1d # in minutes; 0 disables
# - Primary Server -
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 47ebdaecb6..f3994ab000 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..e1d05d6779 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -143,5 +143,8 @@ extern int date2isoyear(int year, int mon, int mday);
extern int date2isoyearday(int year, int mon, int mday);
extern bool TimestampTimestampTzRequiresRewrite(void);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
#endif /* TIMESTAMP_H */
--
2.34.1
v65-0003-Add-TAP-test-for-slot-invalidation-based-on-inac.patch (application/octet-stream)
From 2c7a48b926787bad592e91d3270706eab1425077 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Thu, 30 Jan 2025 21:07:12 +0530
Subject: [PATCH v65 3/4] Add TAP test for slot invalidation based on inactive
timeout.
This test uses injection points to bypass the time overhead caused by the
idle_replication_slot_timeout GUC, which has a minimum value of one minute.
---
src/backend/replication/slot.c | 5 +
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 176 ++++++++++++++++++
3 files changed, 182 insertions(+)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 2ebf366785..2191033e5c 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/builtins.h"
+#include "utils/injection_point.h"
#include "utils/guc_hooks.h"
#include "utils/varlena.h"
@@ -1711,6 +1712,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
case RS_INVAL_IDLE_TIMEOUT:
Assert(now > 0);
+ /* For testing timeout slot invalidation */
+ if (IS_INJECTION_POINT_ATTACHED("slot-time-out-inval"))
+ s->inactive_since = 1;
+
/*
* Check if the slot needs to be invalidated due to
* idle_replication_slot_timeout GUC.
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..ab948aba85
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,176 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to idle timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'logical slot sync_slot1 is synced to standby');
+
+my $logstart = -s $primary->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$primary->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '1min';
+]);
+$primary->reload;
+
+# Register an injection point on the primary to forcibly cause a slot timeout
+$primary->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$primary->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+$primary->safe_psql('postgres',
+ "SELECT injection_points_attach('slot-time-out-inval', 'error');");
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to idle timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'idle_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# Make the standby slot on the primary inactive and check for invalidation
+$standby1->stop;
+wait_for_slot_invalidation($primary, 'sb_slot1', $logstart);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer access replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
v65-0004-Add-TAP-test-for-slot-invalidation-based-on-inac.patch (application/octet-stream)
From b4cac3e24c150c58d3c724e1dc67f5db5d305962 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Wed, 29 Jan 2025 12:12:00 +0530
Subject: [PATCH v65 4/4] Add TAP test for slot invalidation based on inactive
timeout.
This patch adds the same test, but places it under PG_TEST_EXTRA instead
of using injection points.
Since the minimum value for GUC 'idle_replication_slot_timeout' is one minute,
the test takes more than a minute to complete and is disabled by default.
Use PG_TEST_EXTRA=idle_replication_slot_timeout with "make" to run the test.
---
.cirrus.tasks.yml | 2 +-
doc/src/sgml/regress.sgml | 10 +
src/test/recovery/README | 5 +
src/test/recovery/meson.build | 1 +
.../045_invalidate_inactive_slots_pg_extra.pl | 208 ++++++++++++++++++
5 files changed, 225 insertions(+), 1 deletion(-)
create mode 100644 src/test/recovery/t/045_invalidate_inactive_slots_pg_extra.pl
diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index 18e944ca89..8d3c13fcee 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -20,7 +20,7 @@ env:
MTEST_ARGS: --print-errorlogs --no-rebuild -C build
PGCTLTIMEOUT: 120 # avoids spurious failures during parallel tests
TEMP_CONFIG: ${CIRRUS_WORKING_DIR}/src/tools/ci/pg_ci_base.conf
- PG_TEST_EXTRA: kerberos ldap ssl libpq_encryption load_balance
+ PG_TEST_EXTRA: kerberos ldap ssl libpq_encryption load_balance idle_replication_slot_timeout
# What files to preserve in case tests fail
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 7c474559bd..f5b1f2f353 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -347,6 +347,16 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>idle_replication_slot_timeout</literal></term>
+ <listitem>
+ <para>
+ Runs the test <filename>src/test/recovery/t/045_invalidate_inactive_slots_pg_extra.pl</filename>.
+ Not enabled by default because it is time consuming.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
Tests for features that are not supported by the current build
diff --git a/src/test/recovery/README b/src/test/recovery/README
index 896df0ad05..5c066fc41f 100644
--- a/src/test/recovery/README
+++ b/src/test/recovery/README
@@ -30,4 +30,9 @@ PG_TEST_EXTRA=wal_consistency_checking
to the "make" command. This is resource-intensive, so it's not done
by default.
+If you want to test idle_replication_slot_timeout, add
+PG_TEST_EXTRA=idle_replication_slot_timeout
+to the "make" command. This test takes over a minutes, so it's not done
+by default.
+
See src/test/perl/README for more info about running these tests.
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 057bcde143..0a037b4b65 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -53,6 +53,7 @@ tests += {
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
't/044_invalidate_inactive_slots.pl',
+ 't/045_invalidate_inactive_slots_pg_extra.pl',
],
},
}
diff --git a/src/test/recovery/t/045_invalidate_inactive_slots_pg_extra.pl b/src/test/recovery/t/045_invalidate_inactive_slots_pg_extra.pl
new file mode 100644
index 0000000000..577f69d05d
--- /dev/null
+++ b/src/test/recovery/t/045_invalidate_inactive_slots_pg_extra.pl
@@ -0,0 +1,208 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# The test takes over two minutes to complete. Run it only if
+# idle_replication_slot_timeout is specified in PG_TEST_EXTRA.
+if ( !$ENV{PG_TEST_EXTRA}
+ || $ENV{PG_TEST_EXTRA} !~ /\bidle_replication_slot_timeout\b/)
+{
+ plan skip_all =>
+ 'test idle_replication_slot_timeout not enabled in PG_TEST_EXTRA';
+}
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to idle timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+idle_replication_slot_timeout = 1
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby1's slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby2's slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot2', immediately_reserve := true);
+]);
+
+# Create standby1
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Create standby2
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+$connstr = $primary->connstr;
+$standby2->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot2'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby2->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+
+# Make the standby2's slot on the primary inactive
+$standby2->stop;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep(61);
+
+# On standby, synced slots are not invalidated by the idle timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout as 61 seconds has elapsed and wait for another 10 seconds
+# to make test reliable.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart, 10);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to idle timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'idle_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# By now standby2's slot must be invalidated due to idle timeout,
+# check for invalidation.
+wait_for_slot_invalidation($primary, 'sb_slot2', $logstart, 1);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $wait_time_secs) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $wait_time_secs);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer access replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $wait_time_secs) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ sleep($wait_time_secs);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
Hi Nisha.
Here are some review comments for patch v65-0002
======
src/backend/replication/slot.c
ReportSlotInvalidation:
1.
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the
configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+
errdetail:
I guess it is no fault of this patch because I see you've only copied
nearby code, but AFAICT this function still has an each-way bet: it
uses the _() macro, which is for strings intended to be translated,
but then only passes them through errdetail_internal(), which is for
strings that are NOT intended to be translated. Isn't that
contradictory? Why don't we use errdetail() here?
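Just for reference, the difference boils down to whether the errdetail
format string itself is marked for translation: errdetail() translates
its format string, while errdetail_internal() does not. A minimal
sketch of the two call patterns (illustration only, not patch code,
reusing names from ReportSlotInvalidation()):

    StringInfoData err_detail;

    initStringInfo(&err_detail);
    appendStringInfoString(&err_detail,
                           _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));

    /* pattern used today: pieces translated via _(), the "%s" wrapper is not */
    ereport(LOG,
            errmsg("invalidating obsolete replication slot \"%s\"",
                   NameStr(slotname)),
            errdetail_internal("%s", err_detail.data));

    /* alternative being asked about: errdetail() translates its format string */
    ereport(LOG,
            errmsg("invalidating obsolete replication slot \"%s\"",
                   NameStr(slotname)),
            errdetail("%s", err_detail.data));

    pfree(err_detail.data);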
errhint:
Also, the way the 'hint' is implemented can only be meaningful for
RS_INVAL_WAL_REMOVED. This is also existing code that IMO was always
strange, but now that this patch has added another kind of switch
(cause), the hint implementation looks increasingly hacky to me; it is
also inflexible if, for example, you ever wanted to add different
hints. A neater implementation would be to handle the hint the same
way err_detail is handled, so that the errhint string would only be
assigned within the "case RS_INVAL_WAL_REMOVED:" arm -- something like
the sketch below.
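A rough sketch of that shape, building the hint alongside err_detail
so that each cause can attach its own hint (or none). This is only an
illustration, reusing the names from ReportSlotInvalidation(); the
exact detail strings are elided:

    StringInfoData err_detail;
    StringInfoData err_hint;

    initStringInfo(&err_detail);
    initStringInfo(&err_hint);

    switch (cause)
    {
        case RS_INVAL_WAL_REMOVED:
            /* append the cause-specific detail as today, then its hint */
            appendStringInfo(&err_hint, "You might need to increase \"%s\".",
                             "max_slot_wal_keep_size");
            break;

        case RS_INVAL_IDLE_TIMEOUT:
            /* append the cause-specific detail, then its hint */
            appendStringInfo(&err_hint, "You might need to increase \"%s\".",
                             "idle_replication_slot_timeout");
            break;

        default:
            /* other causes carry no hint */
            break;
    }

    ereport(LOG,
            errmsg("invalidating obsolete replication slot \"%s\"",
                   NameStr(slotname)),
            errdetail_internal("%s", err_detail.data),
            err_hint.len > 0 ? errhint("%s", err_hint.data) : 0);

    pfree(err_detail.data);
    pfree(err_hint.data);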
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Fri, Jan 31, 2025 at 10:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
======
src/backend/replication/slot.c

ReportSlotInvalidation:
1.
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+

errdetail:
I guess it is no fault of this patch because I see you've only copied
nearby code, but AFAICT this function is still having an each-way bet
by using a mixture of _() macro which is for strings intended be
translated, but then only using them in errdetail_internal() which is
for strings that are NOT intended to be translated. Isn't it
contradictory? Why don't we use errdetail() here?
Your question is valid and I don't have an answer. I encourage you to
start a new thread to clarify this.
errhint:
Also, the way the 'hint' is implemented can only be meaningful for
RS_INVAL_WAL_REMOVED. This is also existing code that IMO it was
always strange, but now that this patch has added another kind of
switch (cause) this hint implementation now looks increasingly hacky
to me; it is also inflexible -- e.g. if you ever wanted to add
different hints. A neater implementation would be to make the code
more like how the err_detail is handled, so then the errhint string
would only be assigned within the "case RS_INVAL_WAL_REMOVED:"
This makes sense to me.
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the
configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
I think the above message should be constructed on a model similar to
the following nearby message: "The slot's restart_lsn %X/%X exceeds the
limit by %llu bytes.". So, how about the following: "The slot's idle
time %s exceeds the configured \"%s\" duration"?
Also, similar to max_slot_wal_keep_size, we should give a hint in this
case to increase idle_replication_slot_timeout.
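In code form, that suggestion amounts to something like this for the
idle-timeout arm (a sketch only, assuming the hint is built alongside
err_detail as discussed above):

    case RS_INVAL_IDLE_TIMEOUT:
        Assert(inactive_since > 0);
        hint = true;

        /* translator: second %s is a GUC variable name */
        appendStringInfo(&err_detail,
                         _("The slot's idle time %s exceeds the configured \"%s\" duration."),
                         timestamptz_to_str(inactive_since),
                         "idle_replication_slot_timeout");
        appendStringInfo(&err_hint, "You might need to increase \"%s\".",
                         "idle_replication_slot_timeout");
        break;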
It is not clear why the injection point test is doing
pg_sync_replication_slots() etc. in the patch. The test should be
simpler: after creating a new physical or logical slot, enable the
injection point, run a manual CHECKPOINT command, and check the
invalidation status of the slot.
--
With Regards,
Amit Kapila.
On Fri, Jan 31, 2025 at 2:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 31, 2025 at 10:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
======
src/backend/replication/slot.c

ReportSlotInvalidation:
1.
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+

errdetail:
I guess it is no fault of this patch because I see you've only copied
nearby code, but AFAICT this function is still having an each-way bet
by using a mixture of _() macro which is for strings intended be
translated, but then only using them in errdetail_internal() which is
for strings that are NOT intended to be translated. Isn't it
contradictory? Why don't we use errdetail() here?

Your question is valid and I don't have an answer. I encourage you to
start a new thread to clarify this.

errhint:
Also, the way the 'hint' is implemented can only be meaningful for
RS_INVAL_WAL_REMOVED. This is also existing code that IMO it was
always strange, but now that this patch has added another kind of
switch (cause) this hint implementation now looks increasingly hacky
to me; it is also inflexible -- e.g. if you ever wanted to add
different hints. A neater implementation would be to make the code
more like how the err_detail is handled, so then the errhint string
would only be assigned within the "case RS_INVAL_WAL_REMOVED:"

This makes sense to me.
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");

I think the above message should be constructed on a model similar to
the following nearby message: "The slot's restart_lsn %X/%X exceeds the
limit by %llu bytes.". So, how about the following: "The slot's idle
time %s exceeds the configured \"%s\" duration"?

Also, similar to max_slot_wal_keep_size, we should give a hint in this
case to increase idle_replication_slot_timeout.

It is not clear why the injection point test is doing
pg_sync_replication_slots() etc. in the patch. The test should be
simple such that after creating a new physical or logical slot, enable
the injection point, then run the manual checkpoint command, and check
the invalidation status of the slot.
Thanks for the review! I have incorporated the above comments. The
test in patch-002 has been optimized as suggested and now completes in
less than a second.
Please find the attached v66 patch set. The base patch (v65-001) is
committed now, so I have rebased the patches.
Thank you, Kuroda-san, for working on patch-002.
--
Thanks,
Nisha
Attachments:
v66-0001-Introduce-inactive_timeout-based-replication-slo.patch (application/octet-stream)
From 5fdd6e21ec14a8f5b36b9df28ca17f00e8b04846 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Thu, 30 Jan 2025 12:51:11 +0530
Subject: [PATCH v66 1/3] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has had the ability to invalidate inactive
replication slots based on the amount of WAL (set via the
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky, because the amount of WAL a database
generates and the storage allocated per instance vary greatly in
production, making it difficult to pin down a one-size-fits-all
value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not applicable
for slots that do not reserve WAL or for slots on the standby server
that are being synced from the primary server (i.e., standby slots
having 'synced' field 'true'). Synced slots are always considered to be
inactive because they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 40 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 10 +-
src/backend/replication/logical/slotsync.c | 4 +-
src/backend/replication/slot.c | 180 ++++++++++++++++--
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 14 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 +++
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
13 files changed, 286 insertions(+), 24 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a782f10998..a065fbbaab 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4423,6 +4423,46 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero disables the idle timeout invalidation
+ mechanism. The default is one day. This parameter can only be set in
+ the <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the
+ slot's <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not applicable
+ for slots that do not reserve WAL or for slots on the standby server
+ that are being synced from the primary server (i.e., standby slots
+ having <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
<term><varname>wal_sender_timeout</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 613abcd28b..3d18e507bb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2390,6 +2390,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 8e2b0a7927..7d3a0aa709 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2566,7 +2566,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
</para>
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
Note that for slots on the standby that are being synced from a
primary server (whose <structfield>synced</structfield> field is
<literal>true</literal>), the <structfield>inactive_since</structfield>
@@ -2620,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index be6f87f00b..987857b949 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1541,9 +1541,7 @@ update_synced_slots_inactive_since(void)
if (now == 0)
now = GetCurrentTimestamp();
- SpinLockAcquire(&s->mutex);
- s->inactive_since = now;
- SpinLockRelease(&s->mutex);
+ ReplicationSlotSetInactiveSince(s, now, true);
}
}
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index c57a13d820..ad1b3799fe 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = HOURS_PER_DAY * MINS_PER_HOUR;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -720,16 +727,12 @@ ReplicationSlotRelease(void)
*/
SpinLockAcquire(&slot->mutex);
slot->active_pid = 0;
- slot->inactive_since = now;
+ ReplicationSlotSetInactiveSince(slot, now, false);
SpinLockRelease(&slot->mutex);
ConditionVariableBroadcast(&slot->active_cv);
}
else
- {
- SpinLockAcquire(&slot->mutex);
- slot->inactive_since = now;
- SpinLockRelease(&slot->mutex);
- }
+ ReplicationSlotSetInactiveSince(slot, now, true);
MyReplicationSlot = NULL;
@@ -1518,12 +1521,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
+ StringInfoData err_hint;
bool hint = false;
initStringInfo(&err_detail);
+ initStringInfo(&err_hint);
switch (cause)
{
@@ -1538,6 +1544,8 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
ex),
LSN_FORMAT_ARGS(restart_lsn),
ex);
+ appendStringInfo(&err_hint, "You might need to increase \"%s\".",
+ "max_slot_wal_keep_size");
break;
}
case RS_INVAL_HORIZON:
@@ -1548,6 +1556,19 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ hint = true;
+
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time %s exceeds the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ appendStringInfo(&err_hint, "You might need to increase \"%s\".",
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1559,9 +1580,36 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
errmsg("invalidating obsolete replication slot \"%s\"",
NameStr(slotname)),
errdetail_internal("%s", err_detail.data),
- hint ? errhint("You might need to increase \"%s\".", "max_slot_wal_keep_size") : 0);
+ hint ? errhint("%s", err_hint.data) : 0);
pfree(err_detail.data);
+ pfree(err_hint.data);
+}
+
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has WAL reserved
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins > 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
}
/*
@@ -1591,6 +1639,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1598,6 +1647,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1608,6 +1658,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1661,6 +1720,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1711,9 +1785,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
+ * max_slot_wal_keep_size is set to -1 and
+ * idle_replication_slot_timeout is set to 0 during the binary
+ * upgrade. See check_old_cluster_for_valid_slots() where we ensure
+ * that no slots are invalidated before the upgrade.
*/
Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
@@ -1745,7 +1820,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1791,7 +1867,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1806,14 +1883,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Invalidate slots that require resources about to be removed.
*
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1866,7 +1945,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1924,6 +2004,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2408,7 +2527,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = GetCurrentTimestamp();
@@ -2803,3 +2924,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 38cb9e970d..7cbba03bc1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,20 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ HOURS_PER_DAY * MINS_PER_HOUR, /* 1 day */
+ 0,
+ INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 079efa1baa..0ed9eb057e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -329,6 +329,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 1d # in minutes; 0 disables
# - Primary Server -
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 47ebdaecb6..f3994ab000 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -228,6 +230,25 @@ typedef struct ReplicationSlotCtlData
ReplicationSlot replication_slots[1];
} ReplicationSlotCtlData;
+/*
+ * Set slot's inactive_since property unless it was previously invalidated.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz now,
+ bool acquire_lock)
+{
+ if (s->data.invalidated != RS_INVAL_NONE)
+ return;
+
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ s->inactive_since = now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}
+
/*
* Pointers to shared memory
*/
@@ -237,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..e1d05d6779 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -143,5 +143,8 @@ extern int date2isoyear(int year, int mon, int mday);
extern int date2isoyearday(int year, int mon, int mday);
extern bool TimestampTimestampTzRequiresRewrite(void);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
#endif /* TIMESTAMP_H */
--
2.34.1
v66-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patch (application/octet-stream)
From 406c7c708b599675d1a2d6c87ab3425758773b22 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Thu, 30 Jan 2025 21:07:12 +0530
Subject: [PATCH v66 2/3] Add TAP test for slot invalidation based on inactive
timeout.
This test uses injection points to bypass the time overhead caused by the
idle_replication_slot_timeout GUC, which has a minimum value of one minute.
---
src/backend/replication/slot.c | 5 +
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 119 ++++++++++++++++++
3 files changed, 125 insertions(+)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index ad1b3799fe..c09b90eeac 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/builtins.h"
+#include "utils/injection_point.h"
#include "utils/guc_hooks.h"
#include "utils/varlena.h"
@@ -1723,6 +1724,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
case RS_INVAL_IDLE_TIMEOUT:
Assert(now > 0);
+ /* For testing timeout slot invalidation */
+ if (IS_INJECTION_POINT_ATTACHED("slot-time-out-inval"))
+ s->inactive_since = 1;
+
/*
* Check if the slot needs to be invalidated due to
* idle_replication_slot_timeout GUC.
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..fd907e7f82
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,119 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# ========================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical slot due to idle
+# timeout.
+
+# Initialize primary
+my $node = PostgreSQL::Test::Cluster->new('primary');
+$node->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$node->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+});
+$node->start;
+
+# Create both streaming standby and logical slot
+$node->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'physical_slot', immediately_reserve := true);
+]);
+$node->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('logical_slot', 'test_decoding');}
+);
+
+my $logstart = -s $node->logfile;
+
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$node->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '1min';
+]);
+$node->reload;
+
+# Register an injection point on the primary to forcibly cause a slot timeout
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+$node->safe_psql('postgres',
+ "SELECT injection_points_attach('slot-time-out-inval', 'error');");
+
+# Wait for slots to become inactive. Note that nobody has acquired the slot
+# yet, so it must get invalidated due to idle timeout.
+wait_for_slot_invalidation($node, 'physical_slot', $logstart);
+wait_for_slot_invalidation($node, 'logical_slot', $logstart);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer access replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
v66-0003-Add-TAP-test-for-slot-invalidation-based-on-inac.patch (application/octet-stream)
From be7d9f66c30cfafd399863958a65ed0779411269 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Wed, 29 Jan 2025 12:12:00 +0530
Subject: [PATCH v66 3/3] Add TAP test for slot invalidation based on inactive
timeout.
This patch adds the same test, but places it under PG_TEST_EXTRA instead
of using injection points.
Since the minimum value for GUC 'idle_replication_slot_timeout' is one minute,
the test takes more than a minute to complete and is disabled by default.
Use PG_TEST_EXTRA=idle_replication_slot_timeout with "make" to run the test.
---
.cirrus.tasks.yml | 2 +-
doc/src/sgml/regress.sgml | 10 +
src/test/recovery/README | 5 +
src/test/recovery/meson.build | 1 +
.../045_invalidate_inactive_slots_pg_extra.pl | 208 ++++++++++++++++++
5 files changed, 225 insertions(+), 1 deletion(-)
create mode 100644 src/test/recovery/t/045_invalidate_inactive_slots_pg_extra.pl
diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index 18e944ca89..8d3c13fcee 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -20,7 +20,7 @@ env:
MTEST_ARGS: --print-errorlogs --no-rebuild -C build
PGCTLTIMEOUT: 120 # avoids spurious failures during parallel tests
TEMP_CONFIG: ${CIRRUS_WORKING_DIR}/src/tools/ci/pg_ci_base.conf
- PG_TEST_EXTRA: kerberos ldap ssl libpq_encryption load_balance
+ PG_TEST_EXTRA: kerberos ldap ssl libpq_encryption load_balance idle_replication_slot_timeout
# What files to preserve in case tests fail
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 7c474559bd..f5b1f2f353 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -347,6 +347,16 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>idle_replication_slot_timeout</literal></term>
+ <listitem>
+ <para>
+ Runs the test <filename>src/test/recovery/t/045_invalidate_inactive_slots_pg_extra.pl</filename>.
+ Not enabled by default because it is time consuming.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
Tests for features that are not supported by the current build
diff --git a/src/test/recovery/README b/src/test/recovery/README
index 896df0ad05..5c066fc41f 100644
--- a/src/test/recovery/README
+++ b/src/test/recovery/README
@@ -30,4 +30,9 @@ PG_TEST_EXTRA=wal_consistency_checking
to the "make" command. This is resource-intensive, so it's not done
by default.
+If you want to test idle_replication_slot_timeout, add
+PG_TEST_EXTRA=idle_replication_slot_timeout
+to the "make" command. This test takes over a minutes, so it's not done
+by default.
+
See src/test/perl/README for more info about running these tests.
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 057bcde143..0a037b4b65 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -53,6 +53,7 @@ tests += {
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
't/044_invalidate_inactive_slots.pl',
+ 't/045_invalidate_inactive_slots_pg_extra.pl',
],
},
}
diff --git a/src/test/recovery/t/045_invalidate_inactive_slots_pg_extra.pl b/src/test/recovery/t/045_invalidate_inactive_slots_pg_extra.pl
new file mode 100644
index 0000000000..577f69d05d
--- /dev/null
+++ b/src/test/recovery/t/045_invalidate_inactive_slots_pg_extra.pl
@@ -0,0 +1,208 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# The test takes over two minutes to complete. Run it only if
+# idle_replication_slot_timeout is specified in PG_TEST_EXTRA.
+if ( !$ENV{PG_TEST_EXTRA}
+ || $ENV{PG_TEST_EXTRA} !~ /\bidle_replication_slot_timeout\b/)
+{
+ plan skip_all =>
+ 'test idle_replication_slot_timeout not enabled in PG_TEST_EXTRA';
+}
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to idle timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+idle_replication_slot_timeout = 1
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby1's slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby2's slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot2', immediately_reserve := true);
+]);
+
+# Create standby1
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Create standby2
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+$connstr = $primary->connstr;
+$standby2->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot2'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby2->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+
+# Make the standby2's slot on the primary inactive
+$standby2->stop;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep(61);
+
+# On standby, synced slots are not invalidated by the idle timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout, as 61 seconds have already elapsed; wait for another 10 seconds
+# to make the test reliable.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart, 10);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to idle timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'idle_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# By now standby2's slot must be invalidated due to idle timeout,
+# check for invalidation.
+wait_for_slot_invalidation($primary, 'sb_slot2', $logstart, 1);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $wait_time_secs) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $wait_time_secs);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer access replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $wait_time_secs) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ sleep($wait_time_secs);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
On Fri, 31 Jan 2025 at 17:50, Nisha Moond <nisha.moond412@gmail.com> wrote:
Thanks for the review! I have incorporated the above comments. The
test in patch-002 has been optimized as suggested and now completes in
less than a second.
Please find the attached v66 patch set. The base patch(v65-001) is
committed now, so I have rebased the patches.
Few comments:
1) We should set inactive_since only if the slot can be invalidated:
+ /* For testing timeout slot invalidation */
+ if (IS_INJECTION_POINT_ATTACHED("slot-time-out-inval"))
+ s->inactive_since = 1;
+
2) Instead of "alter system set" and reload, let's do this in
$node->append_conf itself:
+# Set timeout GUC so that the next checkpoint will invalidate inactive slots
+$node->safe_psql(
+ 'postgres', qq[
+ ALTER SYSTEM SET idle_replication_slot_timeout TO '1min';
+]);
+$node->reload;
3) No need to trigger checkpoint twice, we can move it outside so that
just a single checkpoint will invalidate both the slots:
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
4) I feel this trigger_slot_invalidation is not required after removing
the checkpoint from the function, let's move the waiting for
"invalidating obsolete replication slot" also to
wait_for_slot_invalidation function:
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
5) Can we move the subroutine to the beginning? I noticed in other
places we have kept it before the tests like in 027_nosuperuser and
040_createsubscriber:
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
6) Since idle_replication_slot_timeout is more closely related to
max_slot_wal_keep_size, let's keep it alongside that setting.
diff --git a/src/backend/utils/misc/postgresql.conf.sample
b/src/backend/utils/misc/postgresql.conf.sample
index 079efa1baa..0ed9eb057e 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -329,6 +329,7 @@
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
+#idle_replication_slot_timeout = 1d # in minutes; 0 disables
If you accept the comments, you can merge the changes from the attached patch.
Regards,
Vignesh
Attachments:
Vignesh_review_comment_fix.patch (text/x-patch; charset=US-ASCII)
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index c09b90eeac..e81bce348c 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -1724,20 +1724,22 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
case RS_INVAL_IDLE_TIMEOUT:
Assert(now > 0);
- /* For testing timeout slot invalidation */
- if (IS_INJECTION_POINT_ATTACHED("slot-time-out-inval"))
- s->inactive_since = 1;
-
- /*
- * Check if the slot needs to be invalidated due to
- * idle_replication_slot_timeout GUC.
- */
- if (CanInvalidateIdleSlot(s) &&
- TimestampDifferenceExceedsSeconds(s->inactive_since, now,
- idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ if (CanInvalidateIdleSlot(s))
{
- invalidation_cause = cause;
- inactive_since = s->inactive_since;
+ /* For testing timeout slot invalidation */
+ if (IS_INJECTION_POINT_ATTACHED("slot-time-out-inval"))
+ s->inactive_since = 1;
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
}
break;
case RS_INVAL_NONE:
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 0ed9eb057e..70be3a2ce5 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -326,10 +326,10 @@
# (change requires restart)
#wal_keep_size = 0 # in megabytes; 0 disables
#max_slot_wal_keep_size = -1 # in megabytes; -1 disables
+#idle_replication_slot_timeout = 1d # in minutes; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
-#idle_replication_slot_timeout = 1d # in minutes; 0 disables
# - Primary Server -
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
index fd907e7f82..7f18881cba 100644
--- a/src/test/recovery/t/044_invalidate_inactive_slots.pl
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -13,6 +13,39 @@ if ($ENV{enable_injection_points} ne 'yes')
plan skip_all => 'Injection points not supported by this build';
}
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset) = @_;
+ my $node_name = $node->name;
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer access replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
# ========================================================================
# Testcase start
#
@@ -27,6 +60,7 @@ $node->init(allows_streaming => 'logical');
$node->append_conf(
'postgresql.conf', qq{
checkpoint_timeout = 1h
+idle_replication_slot_timeout = 1min
});
$node->start;
@@ -41,13 +75,6 @@ $node->psql('postgres',
my $logstart = -s $node->logfile;
-# Set timeout GUC so that the next checkpoint will invalidate inactive slots
-$node->safe_psql(
- 'postgres', qq[
- ALTER SYSTEM SET idle_replication_slot_timeout TO '1min';
-]);
-$node->reload;
-
# Register an injection point on the primary to forcibly cause a slot timeout
$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
@@ -62,6 +89,9 @@ if (!$node->check_extension('injection_points'))
$node->safe_psql('postgres',
"SELECT injection_points_attach('slot-time-out-inval', 'error');");
+# Run a checkpoint which will invalidate the slots
+$node->safe_psql('postgres', "CHECKPOINT");
+
# Wait for slots to become inactive. Note that nobody has acquired the slot
# yet, so it must get invalidated due to idle timeout.
wait_for_slot_invalidation($node, 'physical_slot', $logstart);
@@ -70,50 +100,4 @@ wait_for_slot_invalidation($node, 'logical_slot', $logstart);
# Testcase end
# =============================================================================
-# Wait for slot to first become idle and then get invalidated
-sub wait_for_slot_invalidation
-{
- my ($node, $slot, $offset) = @_;
- my $node_name = $node->name;
-
- trigger_slot_invalidation($node, $slot, $offset);
-
- # Check that an invalidated slot cannot be acquired
- my ($result, $stdout, $stderr);
- ($result, $stdout, $stderr) = $node->psql(
- 'postgres', qq[
- SELECT pg_replication_slot_advance('$slot', '0/1');
- ]);
- ok( $stderr =~ /can no longer access replication slot "$slot"/,
- "detected error upon trying to acquire invalidated slot $slot on node $node_name"
- )
- or die
- "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
-}
-
-# Trigger slot invalidation and confirm it in the server log
-sub trigger_slot_invalidation
-{
- my ($node, $slot, $offset) = @_;
- my $node_name = $node->name;
- my $invalidated = 0;
-
- # Run a checkpoint
- $node->safe_psql('postgres', "CHECKPOINT");
-
- # The slot's invalidation should be logged
- $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
- $offset);
-
- # Check that the invalidation reason is 'idle_timeout'
- $node->poll_query_until(
- 'postgres', qq[
- SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
- WHERE slot_name = '$slot' AND
- invalidation_reason = 'idle_timeout';
- ])
- or die
- "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
-}
-
done_testing();
Hi Nisha,
Some review comments for v66-0001.
======
src/backend/replication/slot.c
ReportSlotInvalidation:
1.
StringInfoData err_detail;
+ StringInfoData err_hint;
bool hint = false;
initStringInfo(&err_detail);
+ initStringInfo(&err_hint);
I don't think you still need the 'hint' boolean anymore.
Instead of:
hint ? errhint("%s", err_hint.data) : 0);
You could just do something like:
err_hint.len ? errhint("%s", err_hint.data) : 0);
~~~
2.
+ appendStringInfo(&err_hint, "You might need to increase \"%s\".",
+ "max_slot_wal_keep_size");
break;
2a.
In this case, shouldn't you really be using macro _("You might need to
increase \"%s\".") so that the common format string would be got using
gettext()?
~
2b.
Should you include a /* translator */ comment here? Other places where
GUC name is substituted do this.
~~~
3.
+ appendStringInfo(&err_hint, "You might need to increase \"%s\".",
+ "idle_replication_slot_timeout");
+ break;
3a.
Ditto above. IMO this common format string should be got using macro.
e.g.: _("You might need to increase \"%s\".")
~
3b.
Should you include a /* translator */ comment here? Other places where
GUC name is substituted do this.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Fri, Jan 31, 2025 at 8:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 31, 2025 at 10:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
======
src/backend/replication/slot.c

ReportSlotInvalidation:
1.
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+

errdetail:
I guess it is no fault of this patch because I see you've only copied
nearby code, but AFAICT this function is still having an each-way bet
by using a mixture of the _() macro, which is for strings intended to be
translated, but then only using them in errdetail_internal(), which is
for strings that are NOT intended to be translated. Isn't it
contradictory? Why don't we use errdetail() here?

Your question is valid and I don't have an answer. I encourage you to
start a new thread to clarify this.
I think this was a false alarm.
After studying this more deeply, I've changed my mind and now think
the code is OK as-is.
AFAICT errdetail_internal is used when not wanting to translate the
*fmt* string passed to it (see EVALUATE_MESSAGE in elog.c). Now, here
the format string is just "%s" so it's fine to not translate that.
Meanwhile, the string value being substituted to the "%s" was already
translated because of the _(x) macro aka gettext(x).
I found other examples similar to this -- see the
error_view_not_updatable() function in rewriteHandler.c which does:
ereport(ERROR,
...
detail ? errdetail_internal("%s", _(detail)) : 0,
...
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Feb 3, 2025 at 9:04 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Fri, Jan 31, 2025 at 8:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 31, 2025 at 10:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
======
src/backend/replication/slot.c

ReportSlotInvalidation:
1.
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+

errdetail:
I guess it is no fault of this patch because I see you've only copied
nearby code, but AFAICT this function is still having an each-way bet
by using a mixture of the _() macro, which is for strings intended to be
translated, but then only using them in errdetail_internal(), which is
for strings that are NOT intended to be translated. Isn't it
contradictory? Why don't we use errdetail() here?

Your question is valid and I don't have an answer. I encourage you to
start a new thread to clarify this.

I think this was a false alarm.
After studying this more deeply, I've changed my mind and now think
the code is OK as-is.

AFAICT errdetail_internal is used when not wanting to translate the
*fmt* string passed to it (see EVALUATE_MESSAGE in elog.c). Now, here
the format string is just "%s" so it's fine to not translate that.
Meanwhile, the string value being substituted to the "%s" was already
translated because of the _(x) macro aka gettext(x).
I didn't get your point about " the "%s" was already translated
because of ...". If we don't want to translate the message then why
add '_(' to it in the first place?
--
With Regards,
Amit Kapila.
On Mon, Feb 3, 2025 at 4:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Feb 3, 2025 at 9:04 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Fri, Jan 31, 2025 at 8:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 31, 2025 at 10:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
======
src/backend/replication/slot.c

ReportSlotInvalidation:
1.
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+

errdetail:
I guess it is no fault of this patch because I see you've only copied
nearby code, but AFAICT this function is still having an each-way bet
by using a mixture of the _() macro, which is for strings intended to be
translated, but then only using them in errdetail_internal(), which is
for strings that are NOT intended to be translated. Isn't it
contradictory? Why don't we use errdetail() here?

Your question is valid and I don't have an answer. I encourage you to
start a new thread to clarify this.

I think this was a false alarm.
After studying this more deeply, I've changed my mind and now think
the code is OK as-is.

AFAICT errdetail_internal is used when not wanting to translate the
*fmt* string passed to it (see EVALUATE_MESSAGE in elog.c). Now, here
the format string is just "%s" so it's fine to not translate that.
Meanwhile, the string value being substituted to the "%s" was already
translated because of the _(x) macro aka gettext(x).

I didn't get your point about " the "%s" was already translated
because of ...". If we don't want to translate the message then why
add '_(' to it in the first place?
I think this is the same point where I was fooling myself yesterday. In
fact we do want to translate the message seen by the user.
errdetail_internal really means don't translate the ***format
string***. In our case "%s" is not the message at all -- it is just
a *format string*, so translating "%s" is kind of meaningless.
e.g. Normally....
errdetail("translate me") <-- This would translate the fmt string but
here the fmt is also the message; i.e. it will do gettext("translate
me") internally.
errdetail_internal("translate me") <-- This won't translate anything;
you will have the raw fmt string "translate me"
~~
But since ReportSlotInvalidation is building the message on the fly
there is no single report so it is a bit different....
errdetail("%s", "translate me") <-- this would just use gettext("%s")
which is kind of useless. And the "translate me" is just a raw string
and won't be translated.
errdetail_internal("%s", "translate me") <-- this won't translate
anything; the fmt string and the "translate me" are just raw strings
errdetail_internal("%s", _("translate me")) <-- This won't translate
the fmt string, but to translate %s is useless anyway. OTOH, the _()
macro means it will do gettext("translate me") so the "translate me"
string will get translated before it is substituted. This is
effectively what the ReportSlotInvalidation code is doing.
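To make that concrete, here is a condensed sketch of how the pieces line
up, assembled from the v67 hunks elsewhere in this thread and shown only
as an illustration (not new code or new behaviour):

    StringInfoData err_detail;

    initStringInfo(&err_detail);

    /* _() (i.e. gettext) translates the detail text while it is built... */
    appendStringInfo(&err_detail,
                     _("The slot's idle time %s exceeds the configured \"%s\" duration."),
                     timestamptz_to_str(inactive_since),
                     "idle_replication_slot_timeout");

    /* ...so errdetail_internal() only carries the already-translated text;
     * translating the bare "%s" format string would add nothing */
    ereport(LOG,
            errmsg("invalidating obsolete replication slot \"%s\"",
                   NameStr(slotname)),
            errdetail_internal("%s", err_detail.data));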
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Feb 3, 2025 at 6:16 AM Peter Smith <smithpb2250@gmail.com> wrote:
2. + appendStringInfo(&err_hint, "You might need to increase \"%s\".", + "max_slot_wal_keep_size"); break; 2a. In this case, shouldn't you really be using macro _("You might need to increase \"%s\".") so that the common format string would be got using gettext()?~
~~~
3. + appendStringInfo(&err_hint, "You might need to increase \"%s\".", + "idle_replication_slot_timeout"); + break;3a.
Ditto above. IMO this common format string should be got using macro.
e.g.: _("You might need to increase \"%s\".")~
Instead, we can directly use '_(' in errhint as we are doing in one
other similar place "errhint("%s", _(view_updatable_error))));". I
think we didn't use it for errdetail because, in one of the cases, it
needs to use ngettext
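As a rough sketch of that suggestion (mirroring what the v67 patch later
in this thread ends up doing with its err_hint StringInfo; not a final
wording):

    /* the hint is assembled untranslated, with the GUC name substituted */
    appendStringInfo(&err_hint, "You might need to increase \"%s\".",
                     "idle_replication_slot_timeout");

    /* a single errhint() call then suffices for every cause; the _() lookup
     * is applied to the built string at report time */
    ereport(LOG,
            errmsg("invalidating obsolete replication slot \"%s\"",
                   NameStr(slotname)),
            err_hint.len ? errhint("%s", _(err_hint.data)) : 0);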
--
With Regards,
Amit Kapila.
On Fri, Jan 31, 2025 at 5:50 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
Please find the attached v66 patch set. The base patch(v65-001) is
committed now, so I have rebased the patches.
*
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
...
@@ -2408,7 +2527,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = GetCurrentTimestamp();
It looks inconsistent to set inactive_since on restart for invalid
slots but not at other times. We don't need to set inactive_since for
invalid slots. The invalid slots should not be updated. Ideally, this
should be taken care of in the patch that introduces inactive_since, but
we can do that now. Let's do this as a separate patch altogether in a
new thread.
--
With Regards,
Amit Kapila.
On Mon, Feb 3, 2025 at 2:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 31, 2025 at 5:50 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
Please find the attached v66 patch set. The base patch(v65-001) is
committed now, so I have rebased the patches.

*
<para>
The time when the slot became inactive. <literal>NULL</literal> if the
- slot is currently being streamed.
+ slot is currently being streamed. If the slot becomes invalidated,
+ this value will remain unchanged until server shutdown.
...
@@ -2408,7 +2527,9 @@ RestoreSlotFromDisk(const char *name)
/*
* Set the time since the slot has become inactive after loading the
* slot from the disk into memory. Whoever acquires the slot i.e.
- * makes the slot active will reset it.
+ * makes the slot active will reset it. Avoid calling
+ * ReplicationSlotSetInactiveSince() here, as it will not set the time
+ * for invalid slots.
*/
slot->inactive_since = GetCurrentTimestamp();

It looks inconsistent to set inactive_since on restart for invalid
slots but not at other times. We don't need to set inactive_since for
invalid slots. The invalid slots should not be updated. Ideally, this
should be taken care of in the patch that introduces inactive_since, but
we can do that now. Let's do this as a separate patch altogether in a
new thread.
Created a new thread [1] to address the inactive_since update for
invalid slots in a separate patch.
[1]: /messages/by-id/CABdArM7QdifQ_MHmMA=Cc4v8+MeckkwKncm2Nn6tX9wSCQ-+iw@mail.gmail.com
--
Thanks,
Nisha
On Fri, 31 Jan 2025 at 17:50, Nisha Moond <nisha.moond412@gmail.com> wrote:
On Fri, Jan 31, 2025 at 2:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 31, 2025 at 10:40 AM Peter Smith <smithpb2250@gmail.com> wrote:
======
src/backend/replication/slot.c

ReportSlotInvalidation:
1.
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ break;
+

errdetail:
I guess it is no fault of this patch because I see you've only copied
nearby code, but AFAICT this function is still having an each-way bet
by using a mixture of the _() macro, which is for strings intended to be
translated, but then only using them in errdetail_internal(), which is
for strings that are NOT intended to be translated. Isn't it
contradictory? Why don't we use errdetail() here?

Your question is valid and I don't have an answer. I encourage you to
start a new thread to clarify this.

errhint:
Also, the way the 'hint' is implemented can only be meaningful for
RS_INVAL_WAL_REMOVED. This is also existing code that IMO it was
always strange, but now that this patch has added another kind of
switch (cause) this hint implementation now looks increasingly hacky
to me; it is also inflexible -- e.g. if you ever wanted to add
different hints. A neater implementation would be to make the code
more like how the err_detail is handled, so then the errhint string
would only be assigned within the "case RS_INVAL_WAL_REMOVED:"

This makes sense to me.

+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");

I think the above message should be constructed on a model similar to
the following nearby message: "The slot's restart_lsn %X/%X exceeds the
limit by %llu bytes.". So, how about the following: "The slot's idle
time %s exceeds the configured \"%s\" duration"?

Also, similar to max_slot_wal_keep_size, we should give a hint in this
case to increase idle_replication_slot_timeout.

It is not clear why the injection point test is doing
pg_sync_replication_slots() etc. in the patch. The test should be
simple such that after creating a new physical or logical slot, enable
the injection point, then run the manual checkpoint command, and check
the invalidation status of the slot.

Thanks for the review! I have incorporated the above comments. The
test in patch-002 has been optimized as suggested and now completes in
less than a second.

Please find the attached v66 patch set. The base patch(v65-001) is
committed now, so I have rebased the patches.

Thank you, Kuroda-san, for working on patch-002.
Hi Nisha,
I reviewed the v66 patch. I have a few comments:
1. I also feel the default value should be set to '0' as suggested by
Vignesh in 1st point of [1].
2. Should we allow copying of invalidated slots?
Currently we are able to copy slots which are invalidated:
postgres=# select slot_name, active, restart_lsn, wal_status,
inactive_since , invalidation_reason from pg_replication_slots;
slot_name | active | restart_lsn | wal_status |
inactive_since | invalidation_reason
-----------+--------+-------------+------------+----------------------------------+---------------------
test1 | f | 0/16FDDE0 | lost | 2025-02-03
18:28:01.802463+05:30 | idle_timeout
(1 row)
postgres=# select pg_copy_logical_replication_slot('test1', 'test2');
pg_copy_logical_replication_slot
----------------------------------
(test2,0/16FDE18)
(1 row)
postgres=# select slot_name, active, restart_lsn, wal_status,
inactive_since , invalidation_reason from pg_replication_slots;
slot_name | active | restart_lsn | wal_status |
inactive_since | invalidation_reason
-----------+--------+-------------+------------+----------------------------------+---------------------
test1 | f | 0/16FDDE0 | lost | 2025-02-03
18:28:01.802463+05:30 | idle_timeout
test2 | f | 0/16FDDE0 | reserved | 2025-02-03
18:29:53.478023+05:30 |
(2 rows)
3. We have similar behaviour as above for physical slots.
[1]: /messages/by-id/CALDaNm14QrW5j6su+EAqjwnHbiwXJwO+yk73_=7yvc5TVY-43g@mail.gmail.com
Thanks and Regards,
Shlok Kyal
On Mon, Feb 3, 2025 at 12:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Feb 3, 2025 at 6:16 AM Peter Smith <smithpb2250@gmail.com> wrote:
2. + appendStringInfo(&err_hint, "You might need to increase \"%s\".", + "max_slot_wal_keep_size"); break; 2a. In this case, shouldn't you really be using macro _("You might need to increase \"%s\".") so that the common format string would be got using gettext()?~
~~~
3. + appendStringInfo(&err_hint, "You might need to increase \"%s\".", + "idle_replication_slot_timeout"); + break;3a.
Ditto above. IMO this common format string should be got using macro.
e.g.: _("You might need to increase \"%s\".")~
Instead, we can directly use '_(' in errhint as we are doing in one
other similar place "errhint("%s", _(view_updatable_error))));". I
think we didn't use it for errdetail because, in one of the cases, it
needs to use ngettext
Please find the v67 patches:
- The patches have been rebased after separating the inactive_since
related changes.
- Patches 001 and 002 incorporate the above comments and the comments
from [1] and [2].
- No change in patch-003 since the last version.
[1]: /messages/by-id/CALDaNm0FS+FqQk2dadiJFCMM_MhKROMsJUb=b8wtRH6isScQsQ@mail.gmail.com
[2]: /messages/by-id/CAHut+Ps_6+NBOt+KpQQaBG2R3T-FLS93TbUC27uzyDMu=37n-Q@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v67-0001-Introduce-inactive_timeout-based-replication-slo.patch (application/x-patch)
From 26c4c85a709221d59e9911b693c910f489672ade Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 3 Feb 2025 15:20:40 +0530
Subject: [PATCH v67 1/3] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not applicable
for slots that do not reserve WAL or for slots on the standby server
that are being synced from the primary server (i.e., standby slots
having 'synced' field 'true'). Synced slots are always considered to be
inactive because they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 40 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 7 +
src/backend/replication/slot.c | 171 ++++++++++++++++--
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 14 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 3 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
12 files changed, 260 insertions(+), 15 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a782f10998..a065fbbaab 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4423,6 +4423,46 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero disables the idle timeout invalidation
+ mechanism. The default is one day. This parameter can only be set in
+ the <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the
+ slot's <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not applicable
+ for slots that do not reserve WAL or for slots on the standby server
+ that are being synced from the primary server (i.e., standby slots
+ having <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
<term><varname>wal_sender_timeout</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 613abcd28b..3d18e507bb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2390,6 +2390,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 8e2b0a7927..88003abee9 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2620,6 +2620,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index c57a13d820..4d1e1bc800 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = HOURS_PER_DAY * MINS_PER_HOUR;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -1518,12 +1525,14 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
- bool hint = false;
+ StringInfoData err_hint;
initStringInfo(&err_detail);
+ initStringInfo(&err_hint);
switch (cause)
{
@@ -1531,13 +1540,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
{
unsigned long long ex = oldestLSN - restart_lsn;
- hint = true;
appendStringInfo(&err_detail,
ngettext("The slot's restart_lsn %X/%X exceeds the limit by %llu byte.",
"The slot's restart_lsn %X/%X exceeds the limit by %llu bytes.",
ex),
LSN_FORMAT_ARGS(restart_lsn),
ex);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, "You might need to increase \"%s\".",
+ "max_slot_wal_keep_size");
break;
}
case RS_INVAL_HORIZON:
@@ -1548,6 +1559,19 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time %s exceeds the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, "You might need to increase \"%s\".",
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1559,9 +1583,36 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
errmsg("invalidating obsolete replication slot \"%s\"",
NameStr(slotname)),
errdetail_internal("%s", err_detail.data),
- hint ? errhint("You might need to increase \"%s\".", "max_slot_wal_keep_size") : 0);
+ err_hint.len ? errhint("%s", _(err_hint.data)) : 0);
pfree(err_detail.data);
+ pfree(err_hint.data);
+}
+
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has WAL reserved
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins > 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
}
/*
@@ -1591,6 +1642,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1598,6 +1650,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1608,6 +1661,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1661,6 +1723,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1711,9 +1788,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
+ * max_slot_wal_keep_size is set to -1 and
+ * idle_replication_slot_timeout is set to 0 during the binary
+ * upgrade. See check_old_cluster_for_valid_slots() where we ensure
+ * that no slots are invalidated before the upgrade.
*/
Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
@@ -1745,7 +1823,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1791,7 +1870,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1806,14 +1886,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Invalidate slots that require resources about to be removed.
*
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1866,7 +1948,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1924,6 +2007,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2803,3 +2925,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 71448bb4fd..8b107c6529 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,20 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ HOURS_PER_DAY * MINS_PER_HOUR, /* 1 day */
+ 0,
+ INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 079efa1baa..70be3a2ce5 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -326,6 +326,7 @@
# (change requires restart)
#wal_keep_size = 0 # in megabytes; 0 disables
#max_slot_wal_keep_size = -1 # in megabytes; -1 disables
+#idle_replication_slot_timeout = 1d # in minutes; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 47ebdaecb6..5c6458485c 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -237,6 +239,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..e1d05d6779 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -143,5 +143,8 @@ extern int date2isoyear(int year, int mon, int mday);
extern int date2isoyearday(int year, int mon, int mday);
extern bool TimestampTimestampTzRequiresRewrite(void);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
#endif /* TIMESTAMP_H */
--
2.34.1
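For anyone who wants to exercise the 0001 patch above by hand, here is a
rough psql sketch. The slot name is made up for this example, and it assumes
idle_replication_slot_timeout has been lowered to its one-minute minimum in
postgresql.conf so the wait stays short:

-- 'demo_slot' is an arbitrary name used only for this sketch.
-- Create a slot that reserves WAL, then simply leave it idle.
SELECT pg_create_physical_replication_slot('demo_slot', true);
-- After a bit more than a minute of inactivity, trigger the invalidation
-- check, which runs as part of a non-shutdown checkpoint.
CHECKPOINT;
-- The slot should now report invalidation_reason = 'idle_timeout'.
SELECT slot_name, inactive_since, invalidation_reason
FROM pg_replication_slots
WHERE slot_name = 'demo_slot';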
v67-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patchapplication/x-patch; name=v67-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patchDownload
From bfeac59d262bbb024aa90292e079e203304930f5 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Thu, 30 Jan 2025 21:07:12 +0530
Subject: [PATCH v67 2/3] Add TAP test for slot invalidation based on inactive
timeout.
This test uses injection points to bypass the time overhead caused by the
idle_replication_slot_timeout GUC, which has a minimum value of one minute.
---
src/backend/replication/slot.c | 34 ++++--
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 103 ++++++++++++++++++
3 files changed, 129 insertions(+), 9 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 4d1e1bc800..09e618339c 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/builtins.h"
+#include "utils/injection_point.h"
#include "utils/guc_hooks.h"
#include "utils/varlena.h"
@@ -1726,16 +1727,31 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
case RS_INVAL_IDLE_TIMEOUT:
Assert(now > 0);
- /*
- * Check if the slot needs to be invalidated due to
- * idle_replication_slot_timeout GUC.
- */
- if (CanInvalidateIdleSlot(s) &&
- TimestampDifferenceExceedsSeconds(s->inactive_since, now,
- idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ if (CanInvalidateIdleSlot(s))
{
- invalidation_cause = cause;
- inactive_since = s->inactive_since;
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * To test idle timeout slot invalidation, if the
+ * slot-time-out-inval injection point is attached,
+ * set inactive_since to a very old timestamp (1
+ * microsecond since epoch) to immediately invalidate
+ * the slot.
+ */
+ if (IS_INJECTION_POINT_ATTACHED("slot-time-out-inval"))
+ s->inactive_since = 1;
+#endif
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
}
break;
case RS_INVAL_NONE:
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..262e603eb5
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,103 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset) = @_;
+ my $node_name = $node->name;
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer access replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# ========================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical slot due to idle
+# timeout.
+
+# Initialize primary
+my $node = PostgreSQL::Test::Cluster->new('primary');
+$node->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$node->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+idle_replication_slot_timeout = 1min
+});
+$node->start;
+
+# Create both streaming standby and logical slot
+$node->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'physical_slot', immediately_reserve := true);
+]);
+$node->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('logical_slot', 'test_decoding');}
+);
+
+my $logstart = -s $node->logfile;
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+	plan skip_all => 'Extension injection_points not installed';
+}
+
+# Register an injection point on the primary to forcibly cause a slot timeout
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+$node->safe_psql('postgres',
+ "SELECT injection_points_attach('slot-time-out-inval', 'error');");
+
+# Run a checkpoint which will invalidate the slots
+$node->safe_psql('postgres', "CHECKPOINT");
+
+# Wait for slots to become inactive. Note that nobody has acquired the slot
+# yet, so it must get invalidated due to idle timeout.
+wait_for_slot_invalidation($node, 'physical_slot', $logstart);
+wait_for_slot_invalidation($node, 'logical_slot', $logstart);
+
+# Testcase end
+# =============================================================================
+
+done_testing();
--
2.34.1
v67-0003-Add-TAP-test-for-slot-invalidation-based-on-inac.patchapplication/x-patch; name=v67-0003-Add-TAP-test-for-slot-invalidation-based-on-inac.patchDownload
From 48c9e2e788d83ccb9b830080ea75fc15d3bb034c Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Wed, 29 Jan 2025 12:12:00 +0530
Subject: [PATCH v67 3/3] Add TAP test for slot invalidation based on inactive
timeout.
This patch adds the same test, but places it under PG_TEST_EXTRA instead
of using injection points.
Since the minimum value for GUC 'idle_replication_slot_timeout' is one minute,
the test takes more than a minute to complete and is disabled by default.
Use PG_TEST_EXTRA=idle_replication_slot_timeout with "make" to run the test.
---
.cirrus.tasks.yml | 2 +-
doc/src/sgml/regress.sgml | 10 +
src/test/recovery/README | 5 +
src/test/recovery/meson.build | 1 +
.../045_invalidate_inactive_slots_pg_extra.pl | 208 ++++++++++++++++++
5 files changed, 225 insertions(+), 1 deletion(-)
create mode 100644 src/test/recovery/t/045_invalidate_inactive_slots_pg_extra.pl
diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index 18e944ca89..8d3c13fcee 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -20,7 +20,7 @@ env:
MTEST_ARGS: --print-errorlogs --no-rebuild -C build
PGCTLTIMEOUT: 120 # avoids spurious failures during parallel tests
TEMP_CONFIG: ${CIRRUS_WORKING_DIR}/src/tools/ci/pg_ci_base.conf
- PG_TEST_EXTRA: kerberos ldap ssl libpq_encryption load_balance
+ PG_TEST_EXTRA: kerberos ldap ssl libpq_encryption load_balance idle_replication_slot_timeout
# What files to preserve in case tests fail
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 7c474559bd..f5b1f2f353 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -347,6 +347,16 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>idle_replication_slot_timeout</literal></term>
+ <listitem>
+ <para>
+ Runs the test <filename>src/test/recovery/t/045_invalidate_inactive_slots_pg_extra.pl</filename>.
+ Not enabled by default because it is time consuming.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
Tests for features that are not supported by the current build
diff --git a/src/test/recovery/README b/src/test/recovery/README
index 896df0ad05..5c066fc41f 100644
--- a/src/test/recovery/README
+++ b/src/test/recovery/README
@@ -30,4 +30,9 @@ PG_TEST_EXTRA=wal_consistency_checking
to the "make" command. This is resource-intensive, so it's not done
by default.
+If you want to test idle_replication_slot_timeout, add
+PG_TEST_EXTRA=idle_replication_slot_timeout
+to the "make" command. This test takes over a minutes, so it's not done
+by default.
+
See src/test/perl/README for more info about running these tests.
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 057bcde143..0a037b4b65 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -53,6 +53,7 @@ tests += {
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
't/044_invalidate_inactive_slots.pl',
+ 't/045_invalidate_inactive_slots_pg_extra.pl',
],
},
}
diff --git a/src/test/recovery/t/045_invalidate_inactive_slots_pg_extra.pl b/src/test/recovery/t/045_invalidate_inactive_slots_pg_extra.pl
new file mode 100644
index 0000000000..577f69d05d
--- /dev/null
+++ b/src/test/recovery/t/045_invalidate_inactive_slots_pg_extra.pl
@@ -0,0 +1,208 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# The test takes over two minutes to complete. Run it only if
+# idle_replication_slot_timeout is specified in PG_TEST_EXTRA.
+if ( !$ENV{PG_TEST_EXTRA}
+ || $ENV{PG_TEST_EXTRA} !~ /\bidle_replication_slot_timeout\b/)
+{
+ plan skip_all =>
+ 'test idle_replication_slot_timeout not enabled in PG_TEST_EXTRA';
+}
+
+# =============================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical failover slot on the
+# primary due to idle timeout. Also, test logical failover slot synced to
+# the standby from the primary doesn't get invalidated on its own, but gets the
+# invalidated state from the primary.
+
+# Initialize primary
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$primary->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+idle_replication_slot_timeout = 1
+});
+$primary->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$primary->backup($backup_name);
+
+# Create sync slot on the primary
+$primary->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('sync_slot1', 'test_decoding', false, false, true);}
+);
+
+# Create standby1's slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot1', immediately_reserve := true);
+]);
+
+# Create standby2's slot on the primary
+$primary->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot2', immediately_reserve := true);
+]);
+
+# Create standby1
+my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
+$standby1->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+my $connstr = $primary->connstr;
+$standby1->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot1'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby1->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby1);
+
+# Create standby2
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+$connstr = $primary->connstr;
+$standby2->append_conf(
+ 'postgresql.conf', qq(
+hot_standby_feedback = on
+primary_slot_name = 'sb_slot2'
+idle_replication_slot_timeout = 1
+primary_conninfo = '$connstr dbname=postgres'
+));
+$standby2->start;
+
+# Wait until the standby has replayed enough data
+$primary->wait_for_catchup($standby2);
+
+# Set timeout GUC on the standby to verify that the next checkpoint will not
+# invalidate synced slots.
+
+# Make the standby2's slot on the primary inactive
+$standby2->stop;
+
+# Sync the primary slots to the standby
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+# Confirm that the logical failover slot is created on the standby
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1' AND synced
+ AND NOT temporary
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'logical slot sync_slot1 is synced to standby');
+
+# Give enough time for inactive_since to exceed the timeout
+sleep(61);
+
+# On standby, synced slots are not invalidated by the idle timeout
+# until the invalidation state is propagated from the primary.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason IS NULL;}
+ ),
+ 't',
+ 'check that synced slot sync_slot1 has not been invalidated on standby');
+
+my $logstart = -s $primary->logfile;
+
+# Wait for logical failover slot to become inactive on the primary. Note that
+# nobody has acquired the slot yet, so it must get invalidated due to
+# idle timeout, as more than 61 seconds have already elapsed; wait another 10
+# seconds to make the test reliable.
+wait_for_slot_invalidation($primary, 'sync_slot1', $logstart, 10);
+
+# Re-sync the primary slots to the standby. Note that the primary slot was
+# already invalidated (above) due to idle timeout. The standby must just
+# sync the invalidated state.
+$standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
+
+is( $standby1->safe_psql(
+ 'postgres',
+ q{SELECT count(*) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'sync_slot1'
+ AND invalidation_reason = 'idle_timeout';}
+ ),
+ "t",
+ 'check that invalidation of synced slot sync_slot1 is synced on standby');
+
+# By now standby2's slot must be invalidated due to idle timeout,
+# check for invalidation.
+wait_for_slot_invalidation($primary, 'sb_slot2', $logstart, 1);
+
+# Testcase end
+# =============================================================================
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset, $wait_time_secs) = @_;
+ my $node_name = $node->name;
+
+ trigger_slot_invalidation($node, $slot, $offset, $wait_time_secs);
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer access replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# Trigger slot invalidation and confirm it in the server log
+sub trigger_slot_invalidation
+{
+ my ($node, $slot, $offset, $wait_time_secs) = @_;
+ my $node_name = $node->name;
+ my $invalidated = 0;
+
+ # Give enough time for inactive_since to exceed the timeout
+ sleep($wait_time_secs);
+
+ # Run a checkpoint
+ $node->safe_psql('postgres', "CHECKPOINT");
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+done_testing();
--
2.34.1
On Mon, Feb 3, 2025 at 5:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Feb 3, 2025 at 6:16 AM Peter Smith <smithpb2250@gmail.com> wrote:
2. + appendStringInfo(&err_hint, "You might need to increase \"%s\".",
+ "max_slot_wal_keep_size");
break;
2a. In this case, shouldn't you really be using macro _("You might need
to increase \"%s\".") so that the common format string would be got
using gettext()?
~
~~~
3. + appendStringInfo(&err_hint, "You might need to increase \"%s\".",
+ "idle_replication_slot_timeout");
+ break;
3a.
Ditto above. IMO this common format string should be got using macro.
e.g.: _("You might need to increase \"%s\".")
~
Instead, we can directly use '_(' in errhint as we are doing in one
other similar place "errhint("%s", _(view_updatable_error))));". I
think we didn't use it for errdetail because, in one of the cases, it
needs to use ngettext
-1 for this suggestion because this will end up causing a gettext() on
the entire hint where the GUC has already been substituted.
e.g. it is effectively doing
_("You might need to increase \"max_slot_wal_keep_size\".")
_("You might need to increase \"idle_replication_slot_timeout\".")
But that is contrary to the goal of reducing the burden on translators
by using *common* messages wherever possible. IMO we should only
request translation of the *common* part of the hint message.
e.g.
_("You might need to increase \"%s\".")
~~~
We always do GUC name substitution into a *common* fmt message because
then translators only need to maintain a single translated string
instead of many. You can find examples of this everywhere. For
example, notice the GUC is always substituted into the translated fmt
msgs below; they never have the GUC name included explicitly. The
result is just a single fmt message is needed.
$ grep -r . -e 'errhint("You might need to increase' | grep '.c:'
./src/backend/replication/logical/launcher.c: errhint("You might need
to increase \"%s\".", "max_logical_replication_workers")));
./src/backend/replication/logical/launcher.c: errhint("You might need
to increase \"%s\".", "max_worker_processes")));
./src/backend/storage/lmgr/predicate.c: errhint("You might need to
increase \"%s\".", "max_pred_locks_per_transaction")));
./src/backend/storage/lmgr/predicate.c: errhint("You might need to
increase \"%s\".", "max_pred_locks_per_transaction")));
./src/backend/storage/lmgr/predicate.c: errhint("You might need to
increase \"%s\".", "max_pred_locks_per_transaction")));
./src/backend/storage/lmgr/lock.c: errhint("You might need to increase
\"%s\".", "max_locks_per_transaction")));
./src/backend/storage/lmgr/lock.c: errhint("You might need to increase
\"%s\".", "max_locks_per_transaction")));
./src/backend/storage/lmgr/lock.c: errhint("You might need to increase
\"%s\".", "max_locks_per_transaction")));
./src/backend/storage/lmgr/lock.c: errhint("You might need to increase
\"%s\".", "max_locks_per_transaction")));
./src/backend/storage/lmgr/lock.c: errhint("You might need to increase
\"%s\".", "max_locks_per_transaction")));
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Hi Nisha.
Some review comments for patch v67-0001.
======
src/backend/replication/slot.c
ReportSlotInvalidation:
Please see my previous post [1] where I gave some reasons why I think
the _() macro should be used only for the *common* part of the hint
messages. If you agree, then the following code should be changed.
1.
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, "You might need to increase \"%s\".",
+ "max_slot_wal_keep_size");
break;
Change to:
_("You might need to increase \"%s\".")
~
2.
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, "You might need to increase \"%s\".",
+ "idle_replication_slot_timeout");
+ break;
Change to:
_("You might need to increase \"%s\".")
~
3.
- hint ? errhint("You might need to increase \"%s\".",
"max_slot_wal_keep_size") : 0);
+ err_hint.len ? errhint("%s", _(err_hint.data)) : 0);
Change to:
err_hint.len ? errhint("%s", err_hint.data) : 0);
======
[1]: /messages/by-id/CAHut+PuKCv-S+PJ2iybZKiqu0GJ1fSuzy2CcvyRViLou98QpVA@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Feb 4, 2025 at 4:02 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Feb 3, 2025 at 5:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Feb 3, 2025 at 6:16 AM Peter Smith <smithpb2250@gmail.com> wrote:
2. + appendStringInfo(&err_hint, "You might need to increase \"%s\".",
+ "max_slot_wal_keep_size");
break;
2a. In this case, shouldn't you really be using macro _("You might need
to increase \"%s\".") so that the common format string would be got
using gettext()?
~
~~~
3. + appendStringInfo(&err_hint, "You might need to increase \"%s\".",
+ "idle_replication_slot_timeout");
+ break;
3a.
Ditto above. IMO this common format string should be got using macro.
e.g.: _("You might need to increase \"%s\".")
~
Instead, we can directly use '_(' in errhint as we are doing in one
other similar place "errhint("%s", _(view_updatable_error))));". I
think we didn't use it for errdetail because, in one of the cases, it
needs to use ngettext
-1 for this suggestion because this will end up causing a gettext() on
the entire hint where the GUC has already been substituted.
e.g. it is effectively doing
_("You might need to increase \"max_slot_wal_keep_size\".")
_("You might need to increase \"idle_replication_slot_timeout\".")
But that is contrary to the goal of reducing the burden on translators
by using *common* messages wherever possible. IMO we should only
request translation of the *common* part of the hint message.
Fair point. So, we can ignore my suggestion.
--
With Regards,
Amit Kapila.
On Mon, Feb 3, 2025 at 6:35 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
I reviewed the v66 patch. I have few comments:
1. I also feel the default value should be set to '0' as suggested by
Vignesh in 1st point of [1].
+1. This will ensure that the idle slots won't be invalidated by
default, the same as HEAD. We can change the default value based on
user inputs.
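With 0 as the default, users who want this behaviour have to opt in
explicitly. For example (the two-day value below is purely illustrative),
something like:

-- '2d' is only an example; the GUC accepts any duration, in minutes by default.
ALTER SYSTEM SET idle_replication_slot_timeout = '2d';
SELECT pg_reload_conf();
-- Later, check whether the checkpointer has invalidated any slots:
SELECT slot_name, inactive_since, invalidation_reason
FROM pg_replication_slots
WHERE invalidation_reason = 'idle_timeout';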
2. Should we allow copying of invalidated slots?
Currently we are able to copy slots which are invalidated:
postgres=# select slot_name, active, restart_lsn, wal_status,
inactive_since , invalidation_reason from pg_replication_slots;
slot_name | active | restart_lsn | wal_status |
inactive_since | invalidation_reason
-----------+--------+-------------+------------+----------------------------------+---------------------
test1 | f | 0/16FDDE0 | lost | 2025-02-03
18:28:01.802463+05:30 | idle_timeout
(1 row)
postgres=# select pg_copy_logical_replication_slot('test1', 'test2');
pg_copy_logical_replication_slot
----------------------------------
(test2,0/16FDE18)
(1 row)
postgres=# select slot_name, active, restart_lsn, wal_status,
inactive_since , invalidation_reason from pg_replication_slots;
slot_name | active | restart_lsn | wal_status |
inactive_since | invalidation_reason
-----------+--------+-------------+------------+----------------------------------+---------------------
test1 | f | 0/16FDDE0 | lost | 2025-02-03
18:28:01.802463+05:30 | idle_timeout
test2 | f | 0/16FDDE0 | reserved | 2025-02-03
18:29:53.478023+05:30 |
(2 rows)
Is this related to this patch or the behavior of HEAD? If this
behavior is not introduced by this patch then we should discuss this
in a separate thread. I couldn't think of why anyone wants to copy the
invalid slots, so we should probably prohibit copying invalid slots
but that is a matter of separate discussion unless introduced by this
patch.
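Until that is settled, a caller can at least guard the copy manually. A
rough sketch (using the test1/test2 names from the example above; note it
is not race-free):

DO $$
BEGIN
    -- Refuse to copy a slot that has already been invalidated.
    IF EXISTS (SELECT 1
               FROM pg_replication_slots
               WHERE slot_name = 'test1'
                 AND invalidation_reason IS NOT NULL) THEN
        RAISE EXCEPTION 'refusing to copy invalidated slot test1';
    END IF;

    PERFORM pg_copy_logical_replication_slot('test1', 'test2');
END;
$$;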
--
With Regards,
Amit Kapila.
On Tue, 4 Feb 2025 at 10:45, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Feb 3, 2025 at 6:35 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
I reviewed the v66 patch. I have few comments:
1. I also feel the default value should be set to '0' as suggested by
Vignesh in 1st point of [1].
+1. This will ensure that the idle slots won't be invalidated by
default, the same as HEAD. We can change the default value based on
user inputs.
2. Should we allow copying of invalidated slots?
Currently we are able to copy slots which are invalidated:
postgres=# select slot_name, active, restart_lsn, wal_status,
inactive_since , invalidation_reason from pg_replication_slots;
slot_name | active | restart_lsn | wal_status |
inactive_since | invalidation_reason
-----------+--------+-------------+------------+----------------------------------+---------------------
test1 | f | 0/16FDDE0 | lost | 2025-02-03
18:28:01.802463+05:30 | idle_timeout
(1 row)
postgres=# select pg_copy_logical_replication_slot('test1', 'test2');
pg_copy_logical_replication_slot
----------------------------------
(test2,0/16FDE18)
(1 row)
postgres=# select slot_name, active, restart_lsn, wal_status,
inactive_since , invalidation_reason from pg_replication_slots;
slot_name | active | restart_lsn | wal_status |
inactive_since | invalidation_reason
-----------+--------+-------------+------------+----------------------------------+---------------------
test1 | f | 0/16FDDE0 | lost | 2025-02-03
18:28:01.802463+05:30 | idle_timeout
test2 | f | 0/16FDDE0 | reserved | 2025-02-03
18:29:53.478023+05:30 |
(2 rows)
Is this related to this patch or the behavior of HEAD? If this
behavior is not introduced by this patch then we should discuss this
in a separate thread. I couldn't think of why anyone wants to copy the
invalid slots, so we should probably prohibit copying invalid slots
but that is a matter of separate discussion unless introduced by this
patch.
Hi Amit,
I tested and found that this issue is present in HEAD as well.
There are three types of invalidation in HEAD:
1. "wal_removed"
2. "rows_removed"
3. "wal_level_insufficient"
for copying slot with invalidation "wal_removed" we get an error:
postgres=# select slot_name, active, active_pid, restart_lsn,
wal_status, invalidation_reason from pg_replication_slots;
slot_name | active | active_pid | restart_lsn | wal_status | invalidation_reason
-----------+--------+------------+-------------+------------+---------------------
test1 | f | | | lost | wal_removed
(1 row)
postgres=# select pg_copy_logical_replication_slot('test1', 'test2');
ERROR: cannot copy a replication slot that doesn't reserve WAL
But for slot with invalidation "rows_removed" and
"wal_level_insufficient" we are able to copy the slot:
postgres=# select slot_name, active, active_pid, restart_lsn,
wal_status, invalidation_reason from pg_replication_slots;
slot_name | active | active_pid | restart_lsn | wal_status | invalidation_reason
-----------+--------+------------+-------------+------------+---------------------
slot1 | f | | 0/302E718 | lost | rows_removed
(1 row)
postgres=# select pg_copy_logical_replication_slot('slot1', 'slot2');
pg_copy_logical_replication_slot
----------------------------------
(slot2,0/302E770)
(1 row)
postgres=# select slot_name, active, active_pid, restart_lsn,
wal_status, invalidation_reason from pg_replication_slots;
slot_name | active | active_pid | restart_lsn | wal_status | invalidation_reason
-----------+--------+------------+-------------+------------+---------------------
slot1 | f | | 0/302E718 | lost | rows_removed
slot2 | f | | 0/302E718 | reserved |
(2 rows)
Similarly we can copy slot with invalidation "wal_level_insufficient".
I have started a new thread to address the issue [1].
[1]: /messages/by-id/CANhcyEU65aH0VYnLiu=OhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg@mail.gmail.com
Thanks and Regards,
Shlok Kyal
On Tue, Feb 4, 2025 at 10:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Feb 3, 2025 at 6:35 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
I reviewed the v66 patch. I have few comments:
1. I also feel the default value should be set to '0' as suggested by
Vignesh in 1st point of [1].
+1. This will ensure that the idle slots won't be invalidated by
default, the same as HEAD. We can change the default value based on
user inputs.
Here are the v68 patches, incorporating the above as well as comments from [1].
Note: The 0003 patch with tests under PG_TEST_EXTRA is not included
for now. If needed, I'll send it later once the first two patches are
committed.
[1]: /messages/by-id/CAHut+Pv3mjQxmv5tHfgX=o=4C2TfX5rNYGS7xWrHBGcSVwr3mQ@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v68-0001-Introduce-inactive_timeout-based-replication-slo.patchapplication/octet-stream; name=v68-0001-Introduce-inactive_timeout-based-replication-slo.patchDownload
From 278565e8a7f4dfb81bcc6cfebe1fb153dcd57712 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 3 Feb 2025 15:20:40 +0530
Subject: [PATCH v68 1/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has had the ability to invalidate inactive
replication slots based on the amount of WAL (set via the
max_slot_wal_keep_size GUC) that would be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky, because the amount of WAL a database
generates and the allocated storage per instance vary greatly in
production, making it difficult to pin down a one-size-fits-all
value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not applicable
for slots that do not reserve WAL or for slots on the standby server
that are being synced from the primary server (i.e., standby slots
having 'synced' field 'true'). Synced slots are always considered to be
inactive because they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 40 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 7 +
src/backend/replication/slot.c | 171 ++++++++++++++++--
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 12 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 3 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
12 files changed, 258 insertions(+), 15 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a782f10998..b038293618 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4423,6 +4423,46 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero (which is the default) disables the idle timeout
+ invalidation mechanism. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the
+ slot's <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not applicable
+ for slots that do not reserve WAL or for slots on the standby server
+ that are being synced from the primary server (i.e., standby slots
+ having <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
<term><varname>wal_sender_timeout</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 613abcd28b..3d18e507bb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2390,6 +2390,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 8e2b0a7927..88003abee9 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2620,6 +2620,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index c57a13d820..3eb1c09f7d 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = 0;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -1518,12 +1525,14 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
- bool hint = false;
+ StringInfoData err_hint;
initStringInfo(&err_detail);
+ initStringInfo(&err_hint);
switch (cause)
{
@@ -1531,13 +1540,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
{
unsigned long long ex = oldestLSN - restart_lsn;
- hint = true;
appendStringInfo(&err_detail,
ngettext("The slot's restart_lsn %X/%X exceeds the limit by %llu byte.",
"The slot's restart_lsn %X/%X exceeds the limit by %llu bytes.",
ex),
LSN_FORMAT_ARGS(restart_lsn),
ex);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "max_slot_wal_keep_size");
break;
}
case RS_INVAL_HORIZON:
@@ -1548,6 +1559,19 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time %s exceeds the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1559,9 +1583,36 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
errmsg("invalidating obsolete replication slot \"%s\"",
NameStr(slotname)),
errdetail_internal("%s", err_detail.data),
- hint ? errhint("You might need to increase \"%s\".", "max_slot_wal_keep_size") : 0);
+ err_hint.len ? errhint("%s", err_hint.data) : 0);
pfree(err_detail.data);
+ pfree(err_hint.data);
+}
+
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has WAL reserved
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins > 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
}
/*
@@ -1591,6 +1642,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1598,6 +1650,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1608,6 +1661,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1661,6 +1723,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1711,9 +1788,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
+ * max_slot_wal_keep_size is set to -1 and
+ * idle_replication_slot_timeout is set to 0 during the binary
+ * upgrade. See check_old_cluster_for_valid_slots() where we ensure
+ * that no slots are invalidated before the upgrade.
*/
Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
@@ -1745,7 +1823,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1791,7 +1870,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1806,14 +1886,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Invalidate slots that require resources about to be removed.
*
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1866,7 +1948,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1924,6 +2007,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2803,3 +2925,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 71448bb4fd..070d8e3c56 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ 0, 0, INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 079efa1baa..c064a07973 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -326,6 +326,7 @@
# (change requires restart)
#wal_keep_size = 0 # in megabytes; 0 disables
#max_slot_wal_keep_size = -1 # in megabytes; -1 disables
+#idle_replication_slot_timeout = 0 # in minutes; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 47ebdaecb6..5c6458485c 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -237,6 +239,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..e1d05d6779 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -143,5 +143,8 @@ extern int date2isoyear(int year, int mon, int mday);
extern int date2isoyearday(int year, int mon, int mday);
extern bool TimestampTimestampTzRequiresRewrite(void);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
#endif /* TIMESTAMP_H */
--
2.34.1
v68-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patch (application/octet-stream)
From 8a00d4d62454e9d3ae05d6aaf66a7de3860fa0e1 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Thu, 30 Jan 2025 21:07:12 +0530
Subject: [PATCH v68 2/2] Add TAP test for slot invalidation based on inactive
timeout.
This test uses injection points to bypass the time overhead caused by the
idle_replication_slot_timeout GUC, which has a minimum value of one minute.
---
src/backend/replication/slot.c | 34 ++++--
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 103 ++++++++++++++++++
3 files changed, 129 insertions(+), 9 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 3eb1c09f7d..05a22f8b33 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/builtins.h"
+#include "utils/injection_point.h"
#include "utils/guc_hooks.h"
#include "utils/varlena.h"
@@ -1726,16 +1727,31 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
case RS_INVAL_IDLE_TIMEOUT:
Assert(now > 0);
- /*
- * Check if the slot needs to be invalidated due to
- * idle_replication_slot_timeout GUC.
- */
- if (CanInvalidateIdleSlot(s) &&
- TimestampDifferenceExceedsSeconds(s->inactive_since, now,
- idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ if (CanInvalidateIdleSlot(s))
{
- invalidation_cause = cause;
- inactive_since = s->inactive_since;
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * To test idle timeout slot invalidation, if the
+ * slot-time-out-inval injection point is attached,
+ * set inactive_since to a very old timestamp (1
+ * microsecond since epoch) to immediately invalidate
+ * the slot.
+ */
+ if (IS_INJECTION_POINT_ATTACHED("slot-time-out-inval"))
+ s->inactive_since = 1;
+#endif
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
}
break;
case RS_INVAL_NONE:
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..262e603eb5
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,103 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset) = @_;
+ my $node_name = $node->name;
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer access replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
+}
+
+# ========================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical slot due to idle
+# timeout.
+
+# Initialize primary
+my $node = PostgreSQL::Test::Cluster->new('primary');
+$node->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$node->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+idle_replication_slot_timeout = 1min
+});
+$node->start;
+
+# Create both streaming standby and logical slot
+$node->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'physical_slot', immediately_reserve := true);
+]);
+$node->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('logical_slot', 'test_decoding');}
+);
+
+my $logstart = -s $node->logfile;
+
+# Register an injection point on the primary to forcibly cause a slot timeout
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+$node->safe_psql('postgres',
+ "SELECT injection_points_attach('slot-time-out-inval', 'error');");
+
+# Run a checkpoint which will invalidate the slots
+$node->safe_psql('postgres', "CHECKPOINT");
+
+# Wait for slots to become inactive. Note that nobody has acquired the slot
+# yet, so it must get invalidated due to idle timeout.
+wait_for_slot_invalidation($node, 'physical_slot', $logstart);
+wait_for_slot_invalidation($node, 'logical_slot', $logstart);
+
+# Testcase end
+# =============================================================================
+
+done_testing();
--
2.34.1
On Tue, 4 Feb 2025 at 15:58, Nisha Moond <nisha.moond412@gmail.com> wrote:
Here are the v68 patches, incorporating above as well as comments from [1].
Few comments:
1) Let's call TimestampDifferenceExceedsSeconds only if
idle_replication_slot_timeout_mins is set to avoid the
TimestampDifferenceExceedsSeconds function call and timestamp diff
calculation if not required:
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
2) Let's keep the prototype after TimestampDifferenceExceeds to keep
it consistent with the source file and will also make it easy to
search:
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..e1d05d6779 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -143,5 +143,8 @@ extern int date2isoyear(int year, int mon, int mday);
extern int date2isoyearday(int year, int mon, int mday);
extern bool TimestampTimestampTzRequiresRewrite(void);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+                                              TimestampTz stop_time,
+                                              int threshold_sec);
3) How about we change the below:
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * To test idle timeout slot invalidation, if the
+ * slot-time-out-inval injection point is attached,
+ * set inactive_since to a very old timestamp (1
+ * microsecond since epoch) to immediately invalidate
+ * the slot.
+ */
+ if (IS_INJECTION_POINT_ATTACHED("slot-time-out-inval"))
+ s->inactive_since = 1;
+#endif
to:
#ifdef USE_INJECTION_POINTS
/*
* To test idle timeout slot invalidation, if the
* slot-time-out-inval injection point is attached,
* set inactive_since to current time and invalidate the slot immediately.
*/
if (IS_INJECTION_POINT_ATTACHED("slot-time-out-inval") &&
idle_replication_slot_timeout_mins)
{
invalidation_cause = cause;
inactive_since = s->inactive_since = now;
}
#else
/*
* Check if the slot needs to be invalidated due to
* idle_replication_slot_timeout GUC.
*/
if (TimestampDifferenceExceedsSeconds(s->inactive_since, now,
idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
{
invalidation_cause = cause;
inactive_since = s->inactive_since;
}
#endif
We can just invalidate the slot directly without checking the time
difference if idle_replication_slot_timeout_mins is set and
inactive_since can hold the now value.
Regards,
Vignesh
Dear Nisha,
Thanks for updating the patch! Here are my comments.
01.
```
+# Test for replication slots invalidation
```
Since the file tests only timeout invalidations, the comment seems too general.
02.
```
+ # Check that an invalidated slot cannot be acquired
+ my ($result, $stdout, $stderr);
+ ($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('$slot', '0/1');
+ ]);
+ ok( $stderr =~ /can no longer access replication slot "$slot"/,
+ "detected error upon trying to acquire invalidated slot $slot on node $node_name"
+ )
+ or die
+ "could not detect error upon trying to acquire invalidated slot $slot on node $node_name";
```
This part can be removed because it is not directly related to timeout invalidation.
If needed, it can be moved outside the function so that we confirm it only once.
03.
```
+# Initialize primary
+my $node = PostgreSQL::Test::Cluster->new('primary');
+$node->init(allows_streaming => 'logical');
```
I think this node is not "primary" because there are no standby nodes. We can use new('node').
Also, some comments that refer to "primary" can be removed.
04.
```
+$node->psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('logical_slot', 'test_decoding');}
+);
```
Please use safe_psql() instead of psql().
05.
```
my $logstart = -s $node->logfile;
```
For consistency with other tests, the variable name can be $log_offset.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
On Tue, Feb 4, 2025 at 4:42 PM vignesh C <vignesh21@gmail.com> wrote:
On Tue, 4 Feb 2025 at 15:58, Nisha Moond <nisha.moond412@gmail.com> wrote:
Here are the v68 patches, incorporating above as well as comments from [1].
Few comments:
1) Let's call TimestampDifferenceExceedsSeconds only if
idle_replication_slot_timeout_mins is set to avoid the
TimestampDifferenceExceedsSeconds function call and timestamp diff
calculation if not required:
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
The CanInvalidateIdleSlot(s) call does the check if
idle_replication_slot_timeout_mins is set or not. So we are good here.
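To make the point concrete, here is a tiny standalone C sketch (the lower-case helper names and the toy GUC variable are stand-ins of mine, not PostgreSQL code) showing that, because && short-circuits, the right-hand operand is never evaluated when the CanInvalidateIdleSlot-style check on the left returns false:

#include <stdbool.h>
#include <stdio.h>

/* Stand-ins for the patch's helpers; only the && evaluation order matters. */
static int idle_replication_slot_timeout_mins = 0;	/* GUC disabled */

static bool
can_invalidate_idle_slot(void)
{
	/* mirrors the first condition checked by CanInvalidateIdleSlot() */
	return idle_replication_slot_timeout_mins > 0;
}

static bool
timestamp_difference_exceeds_seconds(void)
{
	/* stands in for the timestamp-difference computation */
	printf("timestamp difference computed\n");
	return true;
}

int
main(void)
{
	/* With the GUC at 0, the right-hand side is skipped entirely. */
	if (can_invalidate_idle_slot() && timestamp_difference_exceeds_seconds())
		printf("slot would be invalidated\n");
	else
		printf("idle-timeout check skipped\n");
	return 0;
}

So the existing "CanInvalidateIdleSlot(s) && TimestampDifferenceExceedsSeconds(...)" form already avoids the extra call and the timestamp arithmetic when the GUC is 0.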
2) Let's keep the prototype after TimestampDifferenceExceeds to keep
it consistent with the source file and will also make it easy to
search:
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..e1d05d6779 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -143,5 +143,8 @@ extern int date2isoyear(int year, int mon, int mday);
extern int date2isoyearday(int year, int mon, int mday);
extern bool TimestampTimestampTzRequiresRewrite(void);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+                                              TimestampTz stop_time,
+                                              int threshold_sec);
Done.
3) How about we change the below:
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * To test idle timeout slot invalidation, if the
+ * slot-time-out-inval injection point is attached,
+ * set inactive_since to a very old timestamp (1
+ * microsecond since epoch) to immediately invalidate
+ * the slot.
+ */
+ if (IS_INJECTION_POINT_ATTACHED("slot-time-out-inval"))
+ s->inactive_since = 1;
+#endif
to:
#ifdef USE_INJECTION_POINTS
/*
* To test idle timeout slot invalidation, if the
* slot-time-out-inval injection point is attached,
* set inactive_since to current time and invalidate the slot immediately.
*/
if (IS_INJECTION_POINT_ATTACHED("slot-time-out-inval") &&
idle_replication_slot_timeout_mins)
{
invalidation_cause = cause;
inactive_since = s->inactive_since = now;
}
#else
/*
* Check if the slot needs to be invalidated due to
* idle_replication_slot_timeout GUC.
*/
if (TimestampDifferenceExceedsSeconds(s->inactive_since, now,
idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
{
invalidation_cause = cause;
inactive_since = s->inactive_since;
}
#endif
We can just invalidate the slot directly without checking the time
difference if idle_replication_slot_timeout_mins is set and
inactive_since can hold the now value.
+1 to the idea. Implemented it in a slightly different way to avoid
enclosing the main code within "#else".
Here is the v69 patch set addressing the above and Kuroda-san's comments in [1].
[1]: /messages/by-id/OSCPR01MB14966A918EBB0674E5423EDE0F5F42@OSCPR01MB14966.jpnprd01.prod.outlook.com
--
Thanks,
Nisha
Attachments:
v69-0001-Introduce-inactive_timeout-based-replication-slo.patch (application/x-patch)
From 200cbf7d4b6add1fa7bb3f8fb88bacc01fbd01c8 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 3 Feb 2025 15:20:40 +0530
Subject: [PATCH v69 1/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not applicable
for slots that do not reserve WAL or for slots on the standby server
that are being synced from the primary server (i.e., standby slots
having 'synced' field 'true'). Synced slots are always considered to be
inactive because they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 40 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 7 +
src/backend/replication/slot.c | 171 ++++++++++++++++--
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 12 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 3 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
12 files changed, 258 insertions(+), 15 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a782f10998..b038293618 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4423,6 +4423,46 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero (which is default) disables the idle timeout
+ invalidation mechanism. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the
+ slot's <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not applicable
+ for slots that do not reserve WAL or for slots on the standby server
+ that are being synced from the primary server (i.e., standby slots
+ having <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>).
+ Synced slots are always considered to be inactive because they don't
+ perform logical decoding to produce changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
<term><varname>wal_sender_timeout</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 613abcd28b..3d18e507bb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2390,6 +2390,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 8e2b0a7927..88003abee9 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2620,6 +2620,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index c57a13d820..3eb1c09f7d 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = 0;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -1518,12 +1525,14 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
StringInfoData err_detail;
- bool hint = false;
+ StringInfoData err_hint;
initStringInfo(&err_detail);
+ initStringInfo(&err_hint);
switch (cause)
{
@@ -1531,13 +1540,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
{
unsigned long long ex = oldestLSN - restart_lsn;
- hint = true;
appendStringInfo(&err_detail,
ngettext("The slot's restart_lsn %X/%X exceeds the limit by %llu byte.",
"The slot's restart_lsn %X/%X exceeds the limit by %llu bytes.",
ex),
LSN_FORMAT_ARGS(restart_lsn),
ex);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "max_slot_wal_keep_size");
break;
}
case RS_INVAL_HORIZON:
@@ -1548,6 +1559,19 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time %s exceeds the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1559,9 +1583,36 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
errmsg("invalidating obsolete replication slot \"%s\"",
NameStr(slotname)),
errdetail_internal("%s", err_detail.data),
- hint ? errhint("You might need to increase \"%s\".", "max_slot_wal_keep_size") : 0);
+ err_hint.len ? errhint("%s", err_hint.data) : 0);
pfree(err_detail.data);
+ pfree(err_hint.data);
+}
+
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has WAL reserved
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins > 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
}
/*
@@ -1591,6 +1642,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1598,6 +1650,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1608,6 +1661,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1661,6 +1723,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1711,9 +1788,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
+ * max_slot_wal_keep_size is set to -1 and
+ * idle_replication_slot_timeout is set to 0 during the binary
+ * upgrade. See check_old_cluster_for_valid_slots() where we ensure
+ * that no invalidated before the upgrade.
*/
Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
@@ -1745,7 +1823,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1791,7 +1870,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1806,14 +1886,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Invalidate slots that require resources about to be removed.
*
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1866,7 +1948,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1924,6 +2007,45 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ *
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2803,3 +2925,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 71448bb4fd..070d8e3c56 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ 0, 0, INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 079efa1baa..c064a07973 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -326,6 +326,7 @@
# (change requires restart)
#wal_keep_size = 0 # in megabytes; 0 disables
#max_slot_wal_keep_size = -1 # in megabytes; -1 disables
+#idle_replication_slot_timeout = 0 # in minutes; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 47ebdaecb6..5c6458485c 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -237,6 +239,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..9963bddc0e 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -107,6 +107,9 @@ extern long TimestampDifferenceMilliseconds(TimestampTz start_time,
extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
--
2.34.1
v69-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patch (application/x-patch)
From 20c4e69d768c16868e9b1cc792566c4aae1530c3 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Thu, 30 Jan 2025 21:07:12 +0530
Subject: [PATCH v69 2/2] Add TAP test for slot invalidation based on inactive
timeout.
This test uses injection points to bypass the time overhead caused by the
idle_replication_slot_timeout GUC, which has a minimum value of one minute.
---
src/backend/replication/slot.c | 36 ++++--
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 104 ++++++++++++++++++
3 files changed, 132 insertions(+), 9 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 3eb1c09f7d..ef4eced7aa 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/builtins.h"
+#include "utils/injection_point.h"
#include "utils/guc_hooks.h"
#include "utils/varlena.h"
@@ -1726,16 +1727,33 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
case RS_INVAL_IDLE_TIMEOUT:
Assert(now > 0);
- /*
- * Check if the slot needs to be invalidated due to
- * idle_replication_slot_timeout GUC.
- */
- if (CanInvalidateIdleSlot(s) &&
- TimestampDifferenceExceedsSeconds(s->inactive_since, now,
- idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ if (CanInvalidateIdleSlot(s))
{
- invalidation_cause = cause;
- inactive_since = s->inactive_since;
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * To test idle timeout slot invalidation, if the
+ * slot-time-out-inval injection point is attached,
+ * immediately invalidate the slot.
+ */
+ if (IS_INJECTION_POINT_ATTACHED("slot-time-out-inval"))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since = now;
+ break;
+ }
+#endif
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
}
break;
case RS_INVAL_NONE:
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..13f4491319
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,104 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation due to idle_timeout
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset) = @_;
+ my $node_name = $node->name;
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(qr/invalidating obsolete replication slot \"$slot\"/,
+ $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot to be set on node $node_name";
+}
+
+# ========================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical slot due to idle
+# timeout.
+
+# Initialize the node
+my $node = PostgreSQL::Test::Cluster->new('node');
+$node->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$node->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+idle_replication_slot_timeout = 1min
+});
+$node->start;
+
+# Create both streaming standby and logical slot
+$node->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'physical_slot', immediately_reserve := true);
+]);
+$node->safe_psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('logical_slot', 'test_decoding');}
+);
+
+my $log_offset = -s $node->logfile;
+
+# Register an injection point on the node to forcibly cause a slot
+# invalidation due to idle_timeout
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+$node->safe_psql('postgres',
+ "SELECT injection_points_attach('slot-time-out-inval', 'error');");
+
+# Run a checkpoint which will invalidate the slots
+$node->safe_psql('postgres', "CHECKPOINT");
+
+# Wait for slots to become inactive. Note that nobody has acquired the slot
+# yet, so it must get invalidated due to idle timeout.
+wait_for_slot_invalidation($node, 'physical_slot', $log_offset);
+wait_for_slot_invalidation($node, 'logical_slot', $log_offset);
+
+# Check that the invalidated slot cannot be acquired
+my $node_name = $node->name;
+my ($result, $stdout, $stderr);
+($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('logical_slot', '0/1');
+]);
+ok( $stderr =~ /can no longer access replication slot "logical_slot"/,
+ "detected error upon trying to acquire invalidated slot on node")
+ or die
+ "could not detect error upon trying to acquire invalidated slot \"logical_slot\" on node";
+
+# Testcase end
+# =============================================================================
+
+done_testing();
--
2.34.1
Review comments for v69-0001.
======
doc/src/sgml/config.sgml
1.
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero (which is default) disables the idle timeout
+ invalidation mechanism. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
Suggest writing "(the default)" instead of "(which is default)" to be
consistent with the wording of other descriptions on this page.
======
src/backend/replication/slot.c
2.
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins > 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
}
I wasn't sure why those conditions were '> 0' instead of just '!= 0'.
IIUC, negative values aren't possible for
idle_replication_slot_timeout_mins and inactive_since anyhow.
But, the current patch code is also OK if you prefer.
~~~
3.
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * We get the current time beforehand to avoid system call while
+ * holding the spinlock.
+ */
+ now = GetCurrentTimestamp();
+ }
+
SUGGESTION
Assign the current time here to reduce system call overhead while
holding the spinlock in subsequent code.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, 4 Feb 2025 at 19:56, Nisha Moond <nisha.moond412@gmail.com> wrote:
Here is v69 patch set addressing above and Kuroda-san's comments in [1].
Few minor suggestions:
1) In the slot invalidation reporting below:
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time %s exceeds the configured \"%s\" duration."),
+                  timestamptz_to_str(inactive_since),
+                  "idle_replication_slot_timeout");
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+                  "idle_replication_slot_timeout");
It is logged like:
2025-02-05 10:04:11.616 IST [330567] DETAIL: The slot's idle time
2025-02-05 10:02:49.131631+05:30 exceeds the configured
"idle_replication_slot_timeout" duration.
Here, even though the message says "idle time", we are logging the
inactive_since value, which gives a misleading impression.
How about we change it to:
The slot has been inactive since 2025-02-05 10:02:49.131631+05:30,
which exceeds the configured "idle_replication_slot_timeout" duration.
2) Here we mention that invalidation applies only to a) released slots,
b) inactive replication slots, and c) slots where communication between
the publisher and subscriber is down:
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+                                    0,
+                                    InvalidOid,
+                                    InvalidTransactionId);
a) Can we also mention that slots which do not reserve WAL are not
considered?
b) Could we present this in a bullet-point format like the following:
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to:
+ * 1) released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. 2) Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. 3) Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
c) While initially reviewing the patch I had a similar thought: it
might be good to also mention something like "Slots where communication
between the publisher and subscriber is down are also excluded, as they
are managed by the 'wal_sender_timeout'" in the documentation.
Regards,
Vignesh
Hi Nisha,
Some review comments for the patch v69-0002.
======
src/backend/replication/slot.c
1.
+#ifdef USE_INJECTION_POINTS
+
+ /*
+ * To test idle timeout slot invalidation, if the
+ * slot-time-out-inval injection point is attached,
+ * immediately invalidate the slot.
+ */
+ if (IS_INJECTION_POINT_ATTACHED("slot-time-out-inval"))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since = now;
+ break;
+ }
+#endif
1a.
I didn't understand the reason for the assignment ' = now' here. This
is not happening in the normal code path so why do you need to do this
in this test code path? It works for me without doing this.
~
1b.
For testing, I think we should try to keep the injection code
differences minimal -- e.g. share the same (normal build) code as much
as possible. For example, I suggest refactoring like below. Well, it
works for me.
/*
* Check if the slot needs to be invalidated due to
* idle_replication_slot_timeout GUC.
*
* To test idle timeout slot invalidation, if the
* "slot-time-out-inval" injection point is attached,
* immediately invalidate the slot.
*/
if (
#ifdef USE_INJECTION_POINTS
IS_INJECTION_POINT_ATTACHED("slot-time-out-inval") ||
#endif
TimestampDifferenceExceedsSeconds(s->inactive_since, now,
idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
{
invalidation_cause = cause;
inactive_since = s->inactive_since;
}
~
1c.
Can we call the injection point "timeout" instead of "time-out"?
======
.../t/044_invalidate_inactive_slots.pl
2.
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
At first, I had no idea how to build for this test. It would be good
to include a link to the injection build instructions in a comment
somewhere near here.
~~~
3.
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot, $offset) = @_;
+ my $node_name = $node->name;
+
Might be better to call the variable $slot_name instead of $slot.
It will then also be consistent with $node_name.
~~~
4.
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
I misread this comment at first -- maybe it is clearer to reverse the wording?
/extension injection_points/’injection_points’ extension/
~~~
5.
+# Run a checkpoint which will invalidate the slots
+$node->safe_psql('postgres', "CHECKPOINT");
The explanation seems a bit terse -- I think the comment should
elaborate a bit more: CHECKPOINT is just where the idle slot timeout is
checked, and it is only because the test uses an injection point that
enforces an immediate idle timeout that the slots get invalidated here...
~~~
6.
+# Wait for slots to become inactive. Note that nobody has acquired the slot
+# yet, so it must get invalidated due to idle timeout.
IIUC this comment means:
SUGGESTION
Note that since nobody has acquired the slot yet, if it has been
invalidated that can only be due to the idle timeout mechanism.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Feb 5, 2025 at 10:30 AM vignesh C <vignesh21@gmail.com> wrote:
On Tue, 4 Feb 2025 at 19:56, Nisha Moond <nisha.moond412@gmail.com> wrote:
Here is v69 patch set addressing above and Kuroda-san's comments in [1].
Few minor suggestions:
1) In the slot invalidation reporting below:
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time %s exceeds the configured \"%s\" duration."),
+                  timestamptz_to_str(inactive_since),
+                  "idle_replication_slot_timeout");
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+                  "idle_replication_slot_timeout");
It is logged like:
2025-02-05 10:04:11.616 IST [330567] DETAIL: The slot's idle time
2025-02-05 10:02:49.131631+05:30 exceeds the configured
"idle_replication_slot_timeout" duration.
Here, even though the message says "idle time", we are logging the
inactive_since value, which gives a misleading impression.
How about we change it to:
The slot has been inactive since 2025-02-05 10:02:49.131631+05:30,
which exceeds the configured "idle_replication_slot_timeout" duration.
Would it address your concern if we write the actual idle duration
(now - inactive_since) instead of directly using inactive_since in the
above message?
A few other comments:
1.
+ * 4. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
The 4th point in the above comment and the rest of the comment are
mostly saying the same thing.
2.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1924,6 +2007,45 @@ CheckPointReplicationSlots(bool is_shutdown)
Can we try and see how the patch looks if we try to invalidate the
slot due to idle time at the same time when we are trying to
invalidate due to WAL?
--
With Regards,
Amit Kapila.
On Wed, Feb 5, 2025 at 10:30 AM vignesh C <vignesh21@gmail.com> wrote:
On Tue, 4 Feb 2025 at 19:56, Nisha Moond <nisha.moond412@gmail.com> wrote:
Here is v69 patch set addressing above and Kuroda-san's comments in [1].
2) Here we have mentioned that invalidation happens only for a) released
slots, b) inactive replication slots, c) slots where communication between
pub and sub is down:
+ * XXX: Slot invalidation due to 'idle_timeout' applies only to
+ * released slots, and is based on the 'idle_replication_slot_timeout'
+ * GUC. Active slots currently in use for replication are excluded to
+ * prevent accidental invalidation. Slots where communication between
+ * the publisher and subscriber is down are also excluded, as they are
+ * managed by the 'wal_sender_timeout'.
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
a) Can we also mention that slots which do not reserve WAL are not considered?
We have included all the info regarding which slots are excluded in
the documents, so I feel we can remove the XXX: comment from here.
(done in v70).
c) While I was initially reviewing the patch I also had similar
thoughts; it might be good if we could mention something like "Slots where
communication between the publisher and subscriber is down are also
excluded, as they are managed by the 'wal_sender_timeout'" in the
documentation.
v70 adds the suggested info in the docs.
--
Thanks,
Nisha
On Wed, Feb 5, 2025 at 12:58 PM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha,
Some review comments for the patch v69-0002.
======
.../t/044_invalidate_inactive_slots.pl
2.
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
At first, I had no idea how to build for this test. It would be good
to include a link to the injection build instructions in a comment
somewhere near here.
I’ve added comments with build instructions in v70, but I’m not sure
if a link to the documentation is necessary. I didn’t find similar
instructions in other injection point-dependent tests. Let’s see what
others think.
--
Thanks,
Nisha
On Wed, Feb 5, 2025 at 2:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Feb 5, 2025 at 10:30 AM vignesh C <vignesh21@gmail.com> wrote:
On Tue, 4 Feb 2025 at 19:56, Nisha Moond <nisha.moond412@gmail.com> wrote:
Here is v69 patch set addressing above and Kuroda-san's comments in [1].
Few minor suggestions:
1) In the slot invalidation reporting below:
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time %s exceeds the configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "idle_replication_slot_timeout");
It is logged like:
2025-02-05 10:04:11.616 IST [330567] DETAIL: The slot's idle time
2025-02-05 10:02:49.131631+05:30 exceeds the configured
"idle_replication_slot_timeout" duration.
Here even though we tell idle time, we are logging the inactive_since
value which kind of gives a wrong meaning.
How about we change it to:
The slot has been inactive since 2025-02-05 10:02:49.131631+05:30,
which exceeds the configured "idle_replication_slot_timeout" duration.
Would it address your concern if we write the actual idle duration
(now - inactive_since) instead of directly using inactive_since in the
above message?
Simply using the raw timestamp difference (now - inactive_since) would
look odd. We should convert it into a user-friendly format. Since
idle_replication_slot_timeout is in minutes, we can express the
difference in minutes and seconds in the log.
For example:
DETAIL: The slot's idle time of 1 minute and 7 seconds exceeds the
configured "idle_replication_slot_timeout" duration.
This has been implemented in v70.
Thoughts?
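For readers following along, here is a minimal standalone sketch of the
conversion described above; the attached v70 patch does the equivalent inside
ReportSlotInvalidation(). The constants mirror the backend macros, and the
sample values (a slot idle for 67 seconds) are only illustrative:

#include <stdio.h>
#include <stdint.h>

#define USECS_PER_SEC   1000000
#define SECS_PER_MINUTE 60

typedef int64_t TimestampTz;    /* microseconds, as in PostgreSQL */

int
main(void)
{
    /* hypothetical values: a slot that has been idle for 67 seconds */
    TimestampTz inactive_since = 0;
    TimestampTz now = 67 * (TimestampTz) USECS_PER_SEC;

    long    elapsed_secs = (long) ((now - inactive_since) / USECS_PER_SEC);
    int     minutes = (int) (elapsed_secs / SECS_PER_MINUTE);
    int     secs = (int) (elapsed_secs % SECS_PER_MINUTE);

    /* same shape as the DETAIL message added by the attached patch */
    printf("The slot's idle time of %d minutes and %d seconds exceeds the "
           "configured \"idle_replication_slot_timeout\" duration.\n",
           minutes, secs);
    return 0;
}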
A few other comments:
1.
+ * 4. The slot is not being synced from the primary while the server
+ * is in recovery
+ *
+ * Note that the idle timeout invalidation mechanism is not
+ * applicable for slots on the standby server that are being synced
+ * from the primary server (i.e., standby slots having 'synced' field 'true').
+ * Synced slots are always considered to be inactive because they don't
+ * perform logical decoding to produce changes.
The 4th point in the above comment and the rest of the comment are
mostly saying the same thing.
Done. I've merged the additional info and 4th point.
2.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1924,6 +2007,45 @@ CheckPointReplicationSlots(bool is_shutdown)
Can we try and see how the patch looks if we try to invalidate the
slot due to idle time at the same time when we are trying to
invalidate due to WAL?
I'll consider the suggested change in the next version.
~~~~
Here are the v70 patches - addressed above and other comments in [1] and [2].
[1]: /messages/by-id/CAHut+PvW3pr3P3hXwBskXrDmJYKedmqRaPZcL4iLRQ51=XxOBw@mail.gmail.com
[2]: /messages/by-id/CALDaNm0X_vgAxKPT+c14yqKcgE5-x4XBdXsCAVqD6_aa-QYUvg@mail.gmail.com
[3]: /messages/by-id/CAHut+PtCpOnifF9wnhJ=jo7KLmtT=MikuYnM9GGPTVA80rq7OA@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v70-0001-Introduce-inactive_timeout-based-replication-slo.patch (application/octet-stream)
From 6ccc387d4825ed265a8f82629d27ed7f340a27e7 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 3 Feb 2025 15:20:40 +0530
Subject: [PATCH v70 1/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not applicable
for slots that do not reserve WAL or for slots on the standby server
that are being synced from the primary server (i.e., standby slots
having 'synced' field 'true'). Synced slots are always considered to be
inactive because they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 42 +++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 7 +
src/backend/replication/slot.c | 168 ++++++++++++++++--
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 12 ++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 3 +
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
12 files changed, 257 insertions(+), 15 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a782f10998..11fb18b458 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4423,6 +4423,48 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero (the default) disables the idle timeout
+ invalidation mechanism. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the
+ slot's <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not applicable
+ for slots that do not reserve WAL or for slots on the standby server
+ that are being synced from the primary server (i.e., standby slots
+ having <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>). Synced slots are always considered to
+ be inactive because they don't perform logical decoding to produce
+ changes. Slots that appear idle due to a disrupted connection between
+ the publisher and subscriber are also excluded, as they are managed by
+ <link linkend="guc-wal-sender-timeout"><varname>wal_sender_timeout</varname></link>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
<term><varname>wal_sender_timeout</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 613abcd28b..3d18e507bb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2390,6 +2390,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be81c2b51d..f58b9406e4 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2621,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fe5acd8b1f..29749ce917 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = 0;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -1512,12 +1519,18 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since)
{
+ int minutes = 0;
+ int secs = 0;
+ long elapsed_secs = 0;
+ TimestampTz now = 0;
StringInfoData err_detail;
- bool hint = false;
+ StringInfoData err_hint;
initStringInfo(&err_detail);
+ initStringInfo(&err_hint);
switch (cause)
{
@@ -1525,13 +1538,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
{
unsigned long long ex = oldestLSN - restart_lsn;
- hint = true;
appendStringInfo(&err_detail,
ngettext("The slot's restart_lsn %X/%X exceeds the limit by %llu byte.",
"The slot's restart_lsn %X/%X exceeds the limit by %llu bytes.",
ex),
LSN_FORMAT_ARGS(restart_lsn),
ex);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "max_slot_wal_keep_size");
break;
}
case RS_INVAL_HORIZON:
@@ -1542,6 +1557,24 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+
+ /* Calculate the idle time duration of the slot */
+ now = GetCurrentTimestamp();
+ elapsed_secs = (now - inactive_since) / USECS_PER_SEC;
+ minutes = elapsed_secs / SECS_PER_MINUTE;
+ secs = elapsed_secs % SECS_PER_MINUTE;
+
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time of %d minutes and %d seconds exceeds the configured \"%s\" duration."),
+ minutes, secs, "idle_replication_slot_timeout");
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1553,9 +1586,31 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
errmsg("invalidating obsolete replication slot \"%s\"",
NameStr(slotname)),
errdetail_internal("%s", err_detail.data),
- hint ? errhint("You might need to increase \"%s\".", "max_slot_wal_keep_size") : 0);
+ err_hint.len ? errhint("%s", err_hint.data) : 0);
pfree(err_detail.data);
+ pfree(err_hint.data);
+}
+
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has WAL reserved
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server
+ * is in recovery. As synced slots are always considered to be inactive
+ * because they don't perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins != 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
}
/*
@@ -1585,6 +1640,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1592,6 +1648,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1602,6 +1659,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * Assign the current time here to reduce system call overhead
+ * while holding the spinlock in subsequent code.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1655,6 +1721,21 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (SlotIsLogical(s))
invalidation_cause = cause;
break;
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
+ break;
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1705,9 +1786,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
+ * max_slot_wal_keep_size is set to -1 and
+ * idle_replication_slot_timeout is set to 0 during the binary
+ * upgrade. See check_old_cluster_for_valid_slots() where we ensure
+ * that no invalidated before the upgrade.
*/
Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
@@ -1739,7 +1821,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1785,7 +1868,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since);
/* done with this slot for now */
break;
@@ -1800,14 +1884,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Invalidate slots that require resources about to be removed.
*
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1860,7 +1946,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1918,6 +2005,38 @@ CheckPointReplicationSlots(bool is_shutdown)
SaveSlotToPath(s, path, LOG);
}
LWLockRelease(ReplicationSlotAllocationLock);
+
+ if (!is_shutdown)
+ {
+ elog(DEBUG1, "performing replication slot invalidation checks");
+
+ /*
+ * NB: We will make another pass over replication slots for
+ * invalidation checks to keep the code simple. Testing shows that
+ * there is no noticeable overhead (when compared with wal_removed
+ * invalidation) even if we were to do idle_timeout invalidation of
+ * thousands of replication slots here. If it is ever proven that this
+ * assumption is wrong, we will have to perform the invalidation
+ * checks in the above for loop with the following changes:
+ *
+ * - Acquire ControlLock lock once before the loop.
+ *
+ * - Call InvalidatePossiblyObsoleteSlot for each slot.
+ *
+ * - Handle the cases in which ControlLock gets released just like
+ * InvalidateObsoleteReplicationSlots does.
+ *
+ * - Avoid saving slot info to disk two times for each invalidated
+ * slot.
+ *
+ * XXX: Should we move idle_timeout invalidation check closer to
+ * wal_removed in CreateCheckPoint and CreateRestartPoint?
+ */
+ InvalidateObsoleteReplicationSlots(RS_INVAL_IDLE_TIMEOUT,
+ 0,
+ InvalidOid,
+ InvalidTransactionId);
+ }
}
/*
@@ -2802,3 +2921,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 71448bb4fd..070d8e3c56 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ 0, 0, INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 079efa1baa..c064a07973 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -326,6 +326,7 @@
# (change requires restart)
#wal_keep_size = 0 # in megabytes; 0 disables
#max_slot_wal_keep_size = -1 # in megabytes; -1 disables
+#idle_replication_slot_timeout = 0 # in minutes; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 000c36d30d..b69ddc1fb1 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -56,6 +56,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
RS_INVAL_WAL_LEVEL,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -254,6 +256,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..9963bddc0e 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -107,6 +107,9 @@ extern long TimestampDifferenceMilliseconds(TimestampTz start_time,
extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
--
2.34.1
v70-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patch (application/octet-stream)
From 0643dad3d343b9b9557704abef564881df9b4dc0 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Thu, 30 Jan 2025 21:07:12 +0530
Subject: [PATCH v70 2/2] Add TAP test for slot invalidation based on inactive
timeout.
This test uses injection points to bypass the time overhead caused by the
idle_replication_slot_timeout GUC, which has a minimum value of one minute.
---
src/backend/replication/slot.c | 29 +++--
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 110 ++++++++++++++++++
3 files changed, 131 insertions(+), 9 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 29749ce917..601e8063f3 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/builtins.h"
+#include "utils/injection_point.h"
#include "utils/guc_hooks.h"
#include "utils/varlena.h"
@@ -1724,16 +1725,26 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
case RS_INVAL_IDLE_TIMEOUT:
Assert(now > 0);
- /*
- * Check if the slot needs to be invalidated due to
- * idle_replication_slot_timeout GUC.
- */
- if (CanInvalidateIdleSlot(s) &&
- TimestampDifferenceExceedsSeconds(s->inactive_since, now,
- idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ if (CanInvalidateIdleSlot(s))
{
- invalidation_cause = cause;
- inactive_since = s->inactive_since;
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ *
+ * To test idle timeout slot invalidation, if the
+ * "slot-timeout-inval" injection point is attached,
+ * immediately invalidate the slot.
+ */
+ if (
+ #ifdef USE_INJECTION_POINTS
+ IS_INJECTION_POINT_ATTACHED("slot-timeout-inval") ||
+ #endif
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = cause;
+ inactive_since = s->inactive_since;
+ }
}
break;
case RS_INVAL_NONE:
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..2392f24711
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,110 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation due to idle_timeout
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# This test depends on injection point that forces slot invalidation
+# due to idle_timeout. Enabling injections points requires
+# --enable-injection-points with configure or
+# -Dinjection_points=true with Meson.
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $node_name = $node->name;
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(
+ qr/invalidating obsolete replication slot \"$slot_name\"/, $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot_name to be set on node $node_name";
+}
+
+# ========================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical slot due to idle
+# timeout.
+
+# Initialize the node
+my $node = PostgreSQL::Test::Cluster->new('node');
+$node->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$node->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+idle_replication_slot_timeout = 1min
+});
+$node->start;
+
+# Create both streaming standby and logical slot
+$node->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'physical_slot', immediately_reserve := true);
+]);
+$node->safe_psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('logical_slot', 'test_decoding');}
+);
+
+my $log_offset = -s $node->logfile;
+
+# Register an injection point on the node to forcibly cause a slot
+# invalidation due to idle_timeout
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Check if the 'injection_points' extension is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+$node->safe_psql('postgres',
+ "SELECT injection_points_attach('slot-timeout-inval', 'error');");
+
+# Idle timeout slot invalidation occurs during a checkpoint, so run a
+# checkpoint to invalidate the slots.
+$node->safe_psql('postgres', "CHECKPOINT");
+
+# Wait for slots to become inactive. Note that since nobody has acquired the
+# slot yet, then if it has been invalidated that can only be due to the idle
+# timeout mechanism.
+wait_for_slot_invalidation($node, 'physical_slot', $log_offset);
+wait_for_slot_invalidation($node, 'logical_slot', $log_offset);
+
+# Check that the invalidated slot cannot be acquired
+my $node_name = $node->name;
+my ($result, $stdout, $stderr);
+($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('logical_slot', '0/1');
+]);
+ok( $stderr =~ /can no longer access replication slot "logical_slot"/,
+ "detected error upon trying to acquire invalidated slot on node")
+ or die
+ "could not detect error upon trying to acquire invalidated slot \"logical_slot\" on node";
+
+# Testcase end
+# =============================================================================
+
+done_testing();
--
2.34.1
On Thu, Feb 6, 2025 at 8:02 AM Nisha Moond <nisha.moond412@gmail.com> wrote:
On Wed, Feb 5, 2025 at 2:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Would it address your concern if we write the actual idle duration
(now - inactive_since) instead of directly using inactive_since in the
above message?
Simply using the raw timestamp difference (now - inactive_since) would
look odd. We should convert it into a user-friendly format. Since
idle_replication_slot_timeout is in minutes, we can express the
difference in minutes and seconds in the log.
For example:
DETAIL: The slot's idle time of 1 minute and 7 seconds exceeds the
configured "idle_replication_slot_timeout" duration.
This is better but the implementation should be done on the caller
side mainly because we don't want to call a new GetCurrentTimestamp()
in ReportSlotInvalidation.
2.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1924,6 +2007,45 @@ CheckPointReplicationSlots(bool is_shutdown)
Can we try and see how the patch looks if we try to invalidate the
slot due to idle time at the same time when we are trying to
invalidate due to WAL?
I'll consider the suggested change in the next version.
FYI, we discussed this previously (1), but the conclusion that it
won't help much (as it will not help to remove WAL immediately) is
incorrect, especially if we do what is suggested now.
Apart from this, I have made minor changes in the comments. Please
review and include them unless you disagree.
(1) - /messages/by-id/CALj2ACXe8+xSNdMXTMaSRWUwX7v61Ad4iddUwnn=djSwx3GLLg@mail.gmail.com
--
With Regards,
Amit Kapila.
Attachments:
v70-amit.1.patch.txt (text/plain)
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 29749ce917..4eb679e187 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -1598,11 +1598,11 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
* Idle timeout invalidation is allowed only when:
*
* 1. Idle timeout is set
- * 2. Slot has WAL reserved
+ * 2. Slot has reserved WAL
* 3. Slot is inactive
- * 4. The slot is not being synced from the primary while the server
- * is in recovery. As synced slots are always considered to be inactive
- * because they don't perform logical decoding to produce changes.
+ * 4. The slot is not being synced from the primary while the server is in
+ * recovery. This is because synced slots are always considered to be
+ * inactive because they don't perform logical decoding to produce changes.
*/
static inline bool
CanInvalidateIdleSlot(ReplicationSlot *s)
@@ -1662,7 +1662,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
if (cause == RS_INVAL_IDLE_TIMEOUT)
{
/*
- * Assign the current time here to reduce system call overhead
+ * Assign the current time here to avoid system call overhead
* while holding the spinlock in subsequent code.
*/
now = GetCurrentTimestamp();
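Regarding the point above about not calling GetCurrentTimestamp() again inside
ReportSlotInvalidation(): the idea is that the caller reads the clock once and
passes the value down, so the same timestamp drives both the invalidation
decision and the reported idle duration (the v71 version of
ReportSlotInvalidation() below takes both inactive_since and now as
parameters). A rough standalone sketch; the function names here are simplified
stand-ins, not the actual backend API:

#include <stdio.h>
#include <stdint.h>
#include <time.h>

#define USECS_PER_SEC 1000000

typedef int64_t TimestampTz;    /* microseconds, as in PostgreSQL */

/* simplified stand-in for GetCurrentTimestamp() */
static TimestampTz
get_now(void)
{
    return (TimestampTz) time(NULL) * USECS_PER_SEC;
}

/* takes "now" from the caller instead of reading the clock again */
static void
report_slot_invalidation(TimestampTz inactive_since, TimestampTz now)
{
    long    idle_secs = (long) ((now - inactive_since) / USECS_PER_SEC);

    printf("slot has been idle for %ld seconds\n", idle_secs);
}

int
main(void)
{
    TimestampTz now = get_now();    /* read the clock once, in the caller */
    TimestampTz inactive_since = now - 67 * (TimestampTz) USECS_PER_SEC;

    /* the same "now" is used for the timeout check and the log message */
    report_slot_invalidation(inactive_since, now);
    return 0;
}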
On Thu, Feb 6, 2025 at 10:17 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Feb 6, 2025 at 8:02 AM Nisha Moond <nisha.moond412@gmail.com> wrote:
2.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1924,6 +2007,45 @@ CheckPointReplicationSlots(bool is_shutdown)
Can we try and see how the patch looks if we try to invalidate the
slot due to idle time at the same time when we are trying to
invalidate due to WAL?
I'll consider the suggested change in the next version.
FYI, we discussed this previously (1), but the conclusion that it
won't help much (as it will not help to remove WAL immediately) is
incorrect, especially if we do what is suggested now.
The above sentence is incomplete. Let me re-write it. We discussed
this previously, but the conclusion that it won't help much (as it
will not help to remove WAL immediately) at the time of shutdown
checkpoint is incorrect, especially if we do what is suggested now.
So, we should try to invalidate the slots even during shutdown
checkpoints.
--
With Regards,
Amit Kapila.
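To sketch the direction being suggested here, which the v71 patch below
adopts, the invalidation cause can be treated as a set of flags so a single
pass over the slots can check WAL removal and idle timeout together. The enum
values and function below are illustrative stand-ins only; the real change is
in slot.h, InvalidateObsoleteReplicationSlots() and its callers in
CreateCheckPoint()/CreateRestartPoint():

#include <stdio.h>

/* illustrative bit-flag causes; the actual definitions live in slot.h */
typedef enum
{
    RS_INVAL_NONE = 0,
    RS_INVAL_WAL_REMOVED = (1 << 0),
    RS_INVAL_HORIZON = (1 << 1),
    RS_INVAL_WAL_LEVEL = (1 << 2),
    RS_INVAL_IDLE_TIMEOUT = (1 << 3),
} SlotInvalCause;

/* simplified stand-in for InvalidateObsoleteReplicationSlots() */
static void
invalidate_obsolete_slots(int causes)
{
    if (causes & RS_INVAL_WAL_REMOVED)
        printf("checking slots whose restart_lsn falls behind removed WAL\n");
    if (causes & RS_INVAL_IDLE_TIMEOUT)
        printf("checking slots idle longer than idle_replication_slot_timeout\n");
}

int
main(void)
{
    /* a checkpoint can request both checks in one pass over the slots */
    invalidate_obsolete_slots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT);
    return 0;
}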
On Thu, Feb 6, 2025 at 10:17 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Feb 6, 2025 at 8:02 AM Nisha Moond <nisha.moond412@gmail.com> wrote:
On Wed, Feb 5, 2025 at 2:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Would it address your concern if we write the actual idle duration
(now - inactive_since) instead of directly using inactive_since in the
above message?
Simply using the raw timestamp difference (now - inactive_since) would
look odd. We should convert it into a user-friendly format. Since
idle_replication_slot_timeout is in minutes, we can express the
difference in minutes and seconds in the log.
For example:
DETAIL: The slot's idle time of 1 minute and 7 seconds exceeds the
configured "idle_replication_slot_timeout" duration.This is better but the implementation should be done on the caller
side mainly because we don't want to call a new GetCurrentTimestamp()
in ReportSlotInvalidation.
Done.
2.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -1924,6 +2007,45 @@ CheckPointReplicationSlots(bool is_shutdown)
Can we try and see how the patch looks if we try to invalidate the
slot due to idle time at the same time when we are trying to
invalidate due to WAL?
I'll consider the suggested change in the next version.
Done the changes as suggested in v71.
FYI, we discussed this previously (1), but the conclusion that it
won't help much (as it will not help to remove WAL immediately) is
incorrect, especially if we do what is suggested now.
Apart from this, I have made minor changes in the comments. Please
review and include them unless you disagree.
Done.
~~~~
Here are the v71 patches with the above comments incorporated.
--
Thanks,
Nisha
Attachments:
v71-0001-Introduce-inactive_timeout-based-replication-slo.patch (application/octet-stream)
From 59e092d2e52271298cbf09fb2d3e8a07bd17899b Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 3 Feb 2025 15:20:40 +0530
Subject: [PATCH v71 1/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not applicable
for slots that do not reserve WAL or for slots on the standby server
that are being synced from the primary server (i.e., standby slots
having 'synced' field 'true'). Synced slots are always considered to be
inactive because they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 42 ++++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 7 +
src/backend/access/transam/xlog.c | 4 +-
src/backend/replication/slot.c | 205 ++++++++++++++----
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 14 +-
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
13 files changed, 274 insertions(+), 50 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 38244409e3..a915a43625 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4423,6 +4423,48 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero (the default) disables the idle timeout
+ invalidation mechanism. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the
+ slot's <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not applicable
+ for slots that do not reserve WAL or for slots on the standby server
+ that are being synced from the primary server (i.e., standby slots
+ having <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>). Synced slots are always considered to
+ be inactive because they don't perform logical decoding to produce
+ changes. Slots that appear idle due to a disrupted connection between
+ the publisher and subscriber are also excluded, as they are managed by
+ <link linkend="guc-wal-sender-timeout"><varname>wal_sender_timeout</varname></link>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
<term><varname>wal_sender_timeout</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 613abcd28b..3d18e507bb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2390,6 +2390,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be81c2b51d..f58b9406e4 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2621,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9c270e7d46..3eaf0bf311 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7337,7 +7337,7 @@ CreateCheckPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
KeepLogSeg(recptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
@@ -7792,7 +7792,7 @@ CreateRestartPoint(int flags)
replayPtr = GetXLogReplayRecPtr(&replayTLI);
endptr = (receivePtr < replayPtr) ? replayPtr : receivePtr;
KeepLogSeg(endptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fe5acd8b1f..c3ea38aaa5 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -107,10 +107,11 @@ const char *const SlotInvalidationCauses[] = {
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +142,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = 0;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -1512,12 +1519,18 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since,
+ TimestampTz now)
{
+ int minutes = 0;
+ int secs = 0;
+ long elapsed_secs = 0;
StringInfoData err_detail;
- bool hint = false;
+ StringInfoData err_hint;
initStringInfo(&err_detail);
+ initStringInfo(&err_hint);
switch (cause)
{
@@ -1525,13 +1538,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
{
unsigned long long ex = oldestLSN - restart_lsn;
- hint = true;
appendStringInfo(&err_detail,
ngettext("The slot's restart_lsn %X/%X exceeds the limit by %llu byte.",
"The slot's restart_lsn %X/%X exceeds the limit by %llu bytes.",
ex),
LSN_FORMAT_ARGS(restart_lsn),
ex);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "max_slot_wal_keep_size");
break;
}
case RS_INVAL_HORIZON:
@@ -1542,6 +1557,23 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+
+ /* Calculate the idle time duration of the slot */
+ elapsed_secs = (now - inactive_since) / USECS_PER_SEC;
+ minutes = elapsed_secs / SECS_PER_MINUTE;
+ secs = elapsed_secs % SECS_PER_MINUTE;
+
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time of %d minutes and %d seconds exceeds the configured \"%s\" duration."),
+ minutes, secs, "idle_replication_slot_timeout");
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1553,9 +1585,31 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
errmsg("invalidating obsolete replication slot \"%s\"",
NameStr(slotname)),
errdetail_internal("%s", err_detail.data),
- hint ? errhint("You might need to increase \"%s\".", "max_slot_wal_keep_size") : 0);
+ err_hint.len ? errhint("%s", err_hint.data) : 0);
pfree(err_detail.data);
+ pfree(err_hint.data);
+}
+
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has reserved WAL
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server is in
+ * recovery. This is because synced slots are always considered to be
+ * inactive because they don't perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins != 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
}
/*
@@ -1585,6 +1639,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1592,6 +1647,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1602,6 +1658,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause & RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * Assign the current time here to avoid system call overhead
+ * while holding the spinlock in subsequent code.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1629,37 +1694,66 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
initial_catalog_effective_xmin = s->effective_catalog_xmin;
}
- switch (cause)
+ if (cause & RS_INVAL_WAL_REMOVED)
{
- case RS_INVAL_WAL_REMOVED:
- if (initial_restart_lsn != InvalidXLogRecPtr &&
- initial_restart_lsn < oldestLSN)
- invalidation_cause = cause;
- break;
- case RS_INVAL_HORIZON:
- if (!SlotIsLogical(s))
- break;
- /* invalid DB oid signals a shared relation */
- if (dboid != InvalidOid && dboid != s->data.database)
- break;
- if (TransactionIdIsValid(initial_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = cause;
- else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = cause;
+ if (initial_restart_lsn != InvalidXLogRecPtr &&
+ initial_restart_lsn < oldestLSN)
+ {
+ invalidation_cause = RS_INVAL_WAL_REMOVED;
+ goto invalidation_marked;
+ }
+ }
+ if (cause & RS_INVAL_HORIZON)
+ {
+ if (!SlotIsLogical(s))
break;
- case RS_INVAL_WAL_LEVEL:
- if (SlotIsLogical(s))
- invalidation_cause = cause;
+ /* invalid DB oid signals a shared relation */
+ if (dboid != InvalidOid && dboid != s->data.database)
break;
- case RS_INVAL_NONE:
- pg_unreachable();
+ if (TransactionIdIsValid(initial_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_effective_xmin,
+ snapshotConflictHorizon))
+ {
+ invalidation_cause = RS_INVAL_HORIZON;
+ goto invalidation_marked;
+ }
+ else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
+ snapshotConflictHorizon))
+ {
+ invalidation_cause = RS_INVAL_HORIZON;
+ goto invalidation_marked;
+ }
+ }
+ if (cause & RS_INVAL_WAL_LEVEL)
+ {
+ if (SlotIsLogical(s))
+ {
+ invalidation_cause = RS_INVAL_WAL_LEVEL;
+ goto invalidation_marked;
+ }
+ }
+ if (cause & RS_INVAL_IDLE_TIMEOUT)
+ {
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = RS_INVAL_IDLE_TIMEOUT;
+ inactive_since = s->inactive_since;
+ goto invalidation_marked;
+ }
}
}
+invalidation_marked:
+
/*
* The invalidation cause recorded previously should not change while
* the process owning the slot (if any) has been terminated.
@@ -1705,9 +1799,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
+ * max_slot_wal_keep_size is set to -1 and
+ * idle_replication_slot_timeout is set to 0 during the binary
+ * upgrade. See check_old_cluster_for_valid_slots() where we ensure
+ * that no invalidated before the upgrade.
*/
Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
@@ -1739,7 +1834,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since, now);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1785,7 +1881,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since, now);
/* done with this slot for now */
break;
@@ -1800,14 +1897,16 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Invalidate slots that require resources about to be removed.
*
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1819,9 +1918,9 @@ InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
XLogRecPtr oldestLSN;
bool invalidated = false;
- Assert(cause != RS_INVAL_HORIZON || TransactionIdIsValid(snapshotConflictHorizon));
- Assert(cause != RS_INVAL_WAL_REMOVED || oldestSegno > 0);
- Assert(cause != RS_INVAL_NONE);
+ Assert(!(cause & RS_INVAL_HORIZON) || TransactionIdIsValid(snapshotConflictHorizon));
+ Assert(!(cause & RS_INVAL_WAL_REMOVED) || oldestSegno > 0);
+ Assert(!(cause & RS_INVAL_NONE));
if (max_replication_slots == 0)
return invalidated;
@@ -1860,7 +1959,8 @@ restart:
}
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
*
* It is convenient to flush dirty replication slots at the time of checkpoint.
* Additionally, in case of a shutdown checkpoint, we also identify the slots
@@ -2802,3 +2902,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index b887d3e598..7fec4e84a2 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ 0, 0, INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index c40b7a3121..f9a5561166 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -326,6 +326,7 @@
# (change requires restart)
#wal_keep_size = 0 # in megabytes; 0 disables
#max_slot_wal_keep_size = -1 # in megabytes; -1 disables
+#idle_replication_slot_timeout = 0 # in minutes; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 000c36d30d..5967385bbb 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -44,18 +44,21 @@ typedef enum ReplicationSlotPersistency
* Slots can be invalidated, e.g. due to max_slot_wal_keep_size. If so, the
* 'invalidated' field is set to a value other than _NONE.
*
- * When adding a new invalidation cause here, remember to update
+ * When adding a new invalidation cause here, the value must be powers of 2
+ * (e.g., 1, 2, 4...) for proper bitwise operations. Also, remember to update
* SlotInvalidationCauses and RS_INVAL_MAX_CAUSES.
*/
typedef enum ReplicationSlotInvalidationCause
{
- RS_INVAL_NONE,
+ RS_INVAL_NONE = 0x00,
/* required WAL has been removed */
- RS_INVAL_WAL_REMOVED,
+ RS_INVAL_WAL_REMOVED = 0x01,
/* required rows have been removed */
- RS_INVAL_HORIZON,
+ RS_INVAL_HORIZON = 0x02,
/* wal_level insufficient for slot */
- RS_INVAL_WAL_LEVEL,
+ RS_INVAL_WAL_LEVEL = 0x04,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT = 0x08,
} ReplicationSlotInvalidationCause;
extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
@@ -254,6 +257,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..9963bddc0e 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -107,6 +107,9 @@ extern long TimestampDifferenceMilliseconds(TimestampTz start_time,
extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
--
2.34.1
v71-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patch
From a36a92b561d5b19cfd6ad03ae153a2d5ce177e21 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Thu, 6 Feb 2025 15:35:06 +0530
Subject: [PATCH v71 2/2] Add TAP test for slot invalidation based on inactive
timeout.
This test uses injection points to bypass the time overhead caused by the
idle_replication_slot_timeout GUC, which has a minimum value of one minute.
---
src/backend/replication/slot.c | 31 +++--
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 110 ++++++++++++++++++
3 files changed, 132 insertions(+), 10 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index c3ea38aaa5..e38c5edab6 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/builtins.h"
+#include "utils/injection_point.h"
#include "utils/guc_hooks.h"
#include "utils/varlena.h"
@@ -1737,17 +1738,27 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
Assert(now > 0);
- /*
- * Check if the slot needs to be invalidated due to
- * idle_replication_slot_timeout GUC.
- */
- if (CanInvalidateIdleSlot(s) &&
- TimestampDifferenceExceedsSeconds(s->inactive_since, now,
- idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ if (CanInvalidateIdleSlot(s))
{
- invalidation_cause = RS_INVAL_IDLE_TIMEOUT;
- inactive_since = s->inactive_since;
- goto invalidation_marked;
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ *
+ * To test idle timeout slot invalidation, if the
+ * "slot-timeout-inval" injection point is attached,
+ * immediately invalidate the slot.
+ */
+ if (
+#ifdef USE_INJECTION_POINTS
+ IS_INJECTION_POINT_ATTACHED("slot-timeout-inval") ||
+#endif
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = RS_INVAL_IDLE_TIMEOUT;
+ inactive_since = s->inactive_since;
+ goto invalidation_marked;
+ }
}
}
}
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..2392f24711
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,110 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation due to idle_timeout
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# This test depends on injection point that forces slot invalidation
+# due to idle_timeout. Enabling injections points requires
+# --enable-injection-points with configure or
+# -Dinjection_points=true with Meson.
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $node_name = $node->name;
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(
+ qr/invalidating obsolete replication slot \"$slot_name\"/, $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot_name to be set on node $node_name";
+}
+
+# ========================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical slot due to idle
+# timeout.
+
+# Initialize the node
+my $node = PostgreSQL::Test::Cluster->new('node');
+$node->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$node->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+idle_replication_slot_timeout = 1min
+});
+$node->start;
+
+# Create both streaming standby and logical slot
+$node->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'physical_slot', immediately_reserve := true);
+]);
+$node->safe_psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('logical_slot', 'test_decoding');}
+);
+
+my $log_offset = -s $node->logfile;
+
+# Register an injection point on the node to forcibly cause a slot
+# invalidation due to idle_timeout
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Check if the 'injection_points' extension is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+$node->safe_psql('postgres',
+ "SELECT injection_points_attach('slot-timeout-inval', 'error');");
+
+# Idle timeout slot invalidation occurs during a checkpoint, so run a
+# checkpoint to invalidate the slots.
+$node->safe_psql('postgres', "CHECKPOINT");
+
+# Wait for slots to become inactive. Note that since nobody has acquired the
+# slot yet, then if it has been invalidated that can only be due to the idle
+# timeout mechanism.
+wait_for_slot_invalidation($node, 'physical_slot', $log_offset);
+wait_for_slot_invalidation($node, 'logical_slot', $log_offset);
+
+# Check that the invalidated slot cannot be acquired
+my $node_name = $node->name;
+my ($result, $stdout, $stderr);
+($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('logical_slot', '0/1');
+]);
+ok( $stderr =~ /can no longer access replication slot "logical_slot"/,
+ "detected error upon trying to acquire invalidated slot on node")
+ or die
+ "could not detect error upon trying to acquire invalidated slot \"logical_slot\" on node";
+
+# Testcase end
+# =============================================================================
+
+done_testing();
--
2.34.1
On Thu, 6 Feb 2025 at 16:08, Nisha Moond <nisha.moond412@gmail.com> wrote:
Here are the v71 patches with the above comments incorporated.
Few comments:
1) While changing the switch to an if condition, the behavior of the
break statement has changed. Previously, it would exit the switch, but
now it exits the main for loop without releasing the locks. These
should be replaced with a goto to ensure the locks are properly
released.
+ if (cause & RS_INVAL_HORIZON)
+ {
+ if (!SlotIsLogical(s))
break;
- case RS_INVAL_WAL_LEVEL:
- if (SlotIsLogical(s))
- invalidation_cause = cause;
+ /* invalid DB oid signals a shared relation */
+ if (dboid != InvalidOid && dboid !=
s->data.database)
break;
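To make the control-flow point concrete, here is a small standalone C
example (illustrative only, not the patch's code) showing how the same
"break" behaves inside a switch versus inside the converted if:

#include <stdio.h>

int
main(void)
{
	/* In a switch, "break" leaves only the switch; the loop keeps going. */
	for (int i = 0; i < 3; i++)
	{
		switch (i)
		{
			case 1:
				break;	/* exits the switch, not the loop */
		}
		printf("switch version: iteration %d\n", i);	/* prints 0, 1, 2 */
	}

	/*
	 * After converting the switch to "if", the same "break" exits the
	 * enclosing for loop, so anything after it in the loop body (such as
	 * releasing a lock) would be skipped.
	 */
	for (int i = 0; i < 3; i++)
	{
		if (i == 1)
			break;	/* exits the whole loop */
		printf("if version: iteration %d\n", i);	/* prints only 0 */
	}

	return 0;
}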
2) None of this initialization is required, as we will be setting
these values before using them:
+ int minutes = 0;
+ int secs = 0;
+ long elapsed_secs = 0;
Regards,
Vignesh
Hi Nisha,
Some review comments for v71-0001.
======
src/backend/access/transam/xlog.c
1.
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
KeepLogSeg(recptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED |
RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
@@ -7792,7 +7792,7 @@ CreateRestartPoint(int flags)
replayPtr = GetXLogReplayRecPtr(&replayTLI);
endptr = (receivePtr < replayPtr) ? replayPtr : receivePtr;
KeepLogSeg(endptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED |
RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
It seems fundamentally strange to me to assign multiple simultaneous
causes like this. IMO you can't invalidate something that is invalid
already. I guess v71 was an attempt to implement Amit's:
Can we try and see how the patch looks if we try to invalidate the
slot due to idle time at the same time when we are trying to
invalidate due to WAL?
But, AFAICT the current code now has a confused mixture of:
'cause' parameter meaning "this is the invalidation cause", versus
'cause' parameter meaning "here is a mask of possible causes"
======
src/backend/replication/slot.c
SlotInvalidationCauses[]
2.
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
By using bit flags in the enum (see slot.h) and designated
initializers here in SlotInvalidationCauses[], you'll end up with 9
entries (0-0x08) instead of 4, and the other undesignated entries will
be all NULL. Maybe it is intended, but if it is I think it is strange
to be indexing by bit flags so at least you should have a comment.
If you really need bitflags then perhaps it is better to maintain them
in addition to the v70 enum values (??)
~~~
3.
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
Hmm. The impact of using bit flags has (probably) unintended
consequences. e.g. Now you've made the GetSlotInvalidationCause()
function worse than before because now it will be iterating over all
the undesignated NULL entries of the array when searching for the
matching cause.
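To illustrate the sparsity with a standalone example (the names below are
illustrative only, not the patch's identifiers):

#include <stdio.h>

/*
 * Designated initializers indexed by power-of-two flag values leave the
 * in-between entries NULL: this array has 9 slots but only 5 real names.
 */
static const char *const causes[] = {
	[0x00] = "none",
	[0x01] = "wal_removed",
	[0x02] = "rows_removed",
	[0x04] = "wal_level_insufficient",
	[0x08] = "idle_timeout",
};

int
main(void)
{
	for (size_t i = 0; i < sizeof(causes) / sizeof(causes[0]); i++)
		printf("index %zu -> %s\n", i, causes[i] ? causes[i] : "(NULL)");
	return 0;
}

A linear search over such an array has to step over the NULL holes as well.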
~~~
4.
+ /* Calculate the idle time duration of the slot */
+ elapsed_secs = (now - inactive_since) / USECS_PER_SEC;
+ minutes = elapsed_secs / SECS_PER_MINUTE;
+ secs = elapsed_secs % SECS_PER_MINUTE;
+
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time of %d minutes
and %d seconds exceeds the configured \"%s\" duration."),
+ minutes, secs, "idle_replication_slot_timeout");
Idleness timeout durations defined like 1d aren't going to look pretty
using this log format. We already discussed off-list about how to make
this better, but not done yet?
~~~
5.
+ if (cause & RS_INVAL_HORIZON)
+ {
+ if (!SlotIsLogical(s))
break;
The meaning of the 'break' here is different to before. Now breaking
the entire for-loop instead of just breaking from the switch.
(same already posted by Vignesh)
~~~
6.
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since, now);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1785,7 +1881,8 @@
InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since, now);
If the cause was not already (masked with) RS_INVAL_IDLE_TIMEOUT then
AFAICT 'now' will still be 0 here.
This seems an unexpected quirk, which at best is quite misleading.
Even if the code stays like this I felt ReportSlotInvalidation should
Assert 'now' must be 0 unless the cause passed was
RS_INVAL_IDLE_TIMEOUT.
~~~
CheckPointReplicationSlots:
7.
/*
- * Flush all replication slots to disk.
+ * Flush all replication slots to disk. Also, invalidate obsolete slots during
+ * non-shutdown checkpoint.
Since the v70 code was removed in v71, the function is now the same as
master. So did we need the function comment change?
======
src/include/replication/slot.h
8.
- * When adding a new invalidation cause here, remember to update
+ * When adding a new invalidation cause here, the value must be powers of 2
+ * (e.g., 1, 2, 4...) for proper bitwise operations. Also, remember to update
* SlotInvalidationCauses and RS_INVAL_MAX_CAUSES.
*/
typedef enum ReplicationSlotInvalidationCause
{
- RS_INVAL_NONE,
+ RS_INVAL_NONE = 0x00,
/* required WAL has been removed */
- RS_INVAL_WAL_REMOVED,
+ RS_INVAL_WAL_REMOVED = 0x01,
/* required rows have been removed */
- RS_INVAL_HORIZON,
+ RS_INVAL_HORIZON = 0x02,
/* wal_level insufficient for slot */
- RS_INVAL_WAL_LEVEL,
+ RS_INVAL_WAL_LEVEL = 0x04,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT = 0x08,
} ReplicationSlotInvalidationCause;
8a.
IMO enums are intended for discrete values like "red" or "blue", but
not combinations of values like "reddy-bluey". AFAIK this kind of
usage is not normal and is discouraged in C programming.
So if you need bitflags then really the bit flags should be #define etc.
~
8b.
Does it make sense? You can't invalidate something that is already
invalid, so what does it even mean to have multiple simultaneous
ReplicationSlotInvalidationCause values? AFAICT it was only done like
this to CHECK for multiple **possible** causes, but this point is not
very clear.
~
8c.
This introduces a side-effect that now the char *const
SlotInvalidationCauses[] array in slot.c will have 8 entries, half of
them NULL. Already mentioned elsewhere. And, this will get
increasingly worse if more invalidation reasons get added. 8,16,32,64
mostly unused entries etc...
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Fri, Feb 7, 2025 at 8:00 AM Peter Smith <smithpb2250@gmail.com> wrote:
======
src/backend/access/transam/xlog.c

1.
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
KeepLogSeg(recptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
@@ -7792,7 +7792,7 @@ CreateRestartPoint(int flags)
replayPtr = GetXLogReplayRecPtr(&replayTLI);
endptr = (receivePtr < replayPtr) ? replayPtr : receivePtr;
KeepLogSeg(endptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))

It seems fundamentally strange to me to assign multiple simultaneous
causes like this. IMO you can't invalidate something that is invalid
already. I guess v71 was an attempt to implement Amit's:
The idea is to invalidate the slot either due to WAL_REMOVED or
IDLE_TIMEOUT in one go during the checkpoint instead of taking
multiple passes over the slots during the checkpoint. Feel free to
suggest if you can think of a better way to implement it.
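As a rough sketch of that idea (hypothetical, simplified names and types,
not the actual patch), a single pass could check a bitmask of causes per
slot and stop at the first match:

#include <stdbool.h>
#include <stdio.h>

#define CAUSE_WAL_REMOVED	0x01
#define CAUSE_IDLE_TIMEOUT	0x02

typedef struct
{
	bool		wal_gone;
	bool		idle_too_long;
} Slot;

/* One walk over a slot with a bitmask of checks; the first match wins. */
static int
first_matching_cause(const Slot *s, int causes)
{
	if ((causes & CAUSE_WAL_REMOVED) && s->wal_gone)
		return CAUSE_WAL_REMOVED;
	if ((causes & CAUSE_IDLE_TIMEOUT) && s->idle_too_long)
		return CAUSE_IDLE_TIMEOUT;
	return 0;
}

int
main(void)
{
	Slot		s = {false, true};

	printf("cause = 0x%x\n",
		   first_matching_cause(&s, CAUSE_WAL_REMOVED | CAUSE_IDLE_TIMEOUT));
	return 0;
}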
--
With Regards,
Amit Kapila.
On Fri, Feb 7, 2025 at 8:00 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha,
Some review comments for v71-0001.
======
src/backend/replication/slot.c

SlotInvalidationCauses[]

2.
[RS_INVAL_WAL_REMOVED] = "wal_removed",
[RS_INVAL_HORIZON] = "rows_removed",
[RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};

By using bit flags in the enum (see slot.h) and designated
initializers here in SlotInvalidationCauses[], you'll end up with 9
entries (0-0x08) instead of 4, and the other undesignated entries will
be all NULL. Maybe it is intended, but if it is I think it is strange
to be indexing by bit flags so at least you should have a comment.

If you really need bitflags then perhaps it is better to maintain them
in addition to the v70 enum values (??)

~~~

3.
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT

Hmm. The impact of using bit flags has (probably) unintended
consequences. e.g. Now you've made the GetSlotInvalidationCause()
function worse than before because now it will be iterating over all
the undesignated NULL entries of the array when searching for the
matching cause.
Introduced a new struct, "SlotInvalidationCauseMap", to store
invalidation cause enums and their corresponding cause_name strings.
Replaced "SlotInvalidationCauses[]" with a map (array of structs),
eliminating extra NULL spaces and reducing unnecessary iterations.
With this change, a new function, GetSlotInvalidationCauseName(), was
added to retrieve the cause_name string for a given cause_enum.
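For reference, the shape of that mapping is roughly as follows (a
trimmed-down sketch with illustrative names and values, not the exact
patch code):

#include <stdio.h>

/*
 * Sketch of an enum-value-to-name map; a linear scan over a dense array
 * of pairs avoids the NULL holes of a flag-indexed lookup table.
 */
typedef struct
{
	int			cause;
	const char *cause_name;
} CauseMap;

static const CauseMap cause_map[] = {
	{0x00, "none"},
	{0x01, "wal_removed"},
	{0x02, "rows_removed"},
	{0x04, "wal_level_insufficient"},
	{0x08, "idle_timeout"},
};

static const char *
cause_to_name(int cause)
{
	for (size_t i = 0; i < sizeof(cause_map) / sizeof(cause_map[0]); i++)
	{
		if (cause_map[i].cause == cause)
			return cause_map[i].cause_name;
	}
	return "none";
}

int
main(void)
{
	printf("%s\n", cause_to_name(0x08));	/* prints "idle_timeout" */
	return 0;
}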
~~~
4.
+ /* Calculate the idle time duration of the slot */
+ elapsed_secs = (now - inactive_since) / USECS_PER_SEC;
+ minutes = elapsed_secs / SECS_PER_MINUTE;
+ secs = elapsed_secs % SECS_PER_MINUTE;
+
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time of %d minutes
and %d seconds exceeds the configured \"%s\" duration."),
+ minutes, secs, "idle_replication_slot_timeout");

Idleness timeout durations defined like 1d aren't going to look pretty
using this log format. We already discussed off-list about how to make
this better, but not done yet?
There was an off-list suggestion to include the configured GUC value
in the err_detail message for better clarity. This change makes it
easier for users to compare, especially for large values like 1d.
For example, if the timeout duration is set to 1d, the message will
now appear as:
" The slot's idle time of 1440 minutes and 54 seconds exceeds the
configured "idle_replication_slot_timeout" duration of 1440 minutes."
Thoughts?
~~~
6.
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since, now);

if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1785,7 +1881,8 @@
InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,

ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since, now);

If the cause was not already (masked with) RS_INVAL_IDLE_TIMEOUT then
AFAICT 'now' will still be 0 here.

This seems an unexpected quirk, which at best is quite misleading.
Even if the code stays like this I felt ReportSlotInvalidation should
Assert 'now' must be 0 unless the cause passed was
RS_INVAL_IDLE_TIMEOUT.
'now' will be non-zero even when RS_INVAL_IDLE_TIMEOUT is masked with
other possible causes like RS_INVAL_WAL_REMOVED, and the slot gets
invalidated first due to RS_INVAL_WAL_REMOVED.
Therefore, 'now' being non-zero is not exclusive to
RS_INVAL_IDLE_TIMEOUT. However, since it must be non-zero when the
cause in ReportSlotInvalidation() is RS_INVAL_IDLE_TIMEOUT, I've added
an Assert for the same.
~~~
======
src/include/replication/slot.h

8.
- * When adding a new invalidation cause here, remember to update
+ * When adding a new invalidation cause here, the value must be powers of 2
+ * (e.g., 1, 2, 4...) for proper bitwise operations. Also, remember to update
* SlotInvalidationCauses and RS_INVAL_MAX_CAUSES.
*/
typedef enum ReplicationSlotInvalidationCause
{
- RS_INVAL_NONE,
+ RS_INVAL_NONE = 0x00,
/* required WAL has been removed */
- RS_INVAL_WAL_REMOVED,
+ RS_INVAL_WAL_REMOVED = 0x01,
/* required rows have been removed */
- RS_INVAL_HORIZON,
+ RS_INVAL_HORIZON = 0x02,
/* wal_level insufficient for slot */
- RS_INVAL_WAL_LEVEL,
+ RS_INVAL_WAL_LEVEL = 0x04,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT = 0x08,
} ReplicationSlotInvalidationCause;

8a.
IMO enums are intended for discrete values like "red" or "blue", but
not combinations of values like "reddy-bluey". AFAIK this kind of
usage is not normal and is discouraged in C programming.

So if you need bitflags then really the bit flags should be #define etc.
I feel using the ReplicationSlotInvalidationCause type instead of
"int" (in case we use macros) improves code readability and
maintainability.
OTOH, keeping the enums as they are in v70, and defining new macros
for a very similar purpose could add unnecessary complexity to code
management.
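For comparison, the #define-based alternative being discussed might look
roughly like this (CauseEnum and CAUSE_BIT are illustrative names, not a
concrete proposal):

#include <stdio.h>

typedef enum
{
	CAUSE_NONE,				/* 0 */
	CAUSE_WAL_REMOVED,		/* 1 */
	CAUSE_HORIZON,			/* 2 */
	CAUSE_WAL_LEVEL,		/* 3 */
	CAUSE_IDLE_TIMEOUT,		/* 4 */
} CauseEnum;

/* Bit flags derived from the enum, for building "check these causes" masks. */
#define CAUSE_BIT(cause)	(1 << (cause))

int
main(void)
{
	int			mask = CAUSE_BIT(CAUSE_WAL_REMOVED) | CAUSE_BIT(CAUSE_IDLE_TIMEOUT);

	/* Single causes stay as plain enum values; only masks are ints. */
	CauseEnum	found = CAUSE_IDLE_TIMEOUT;

	if (mask & CAUSE_BIT(found))
		printf("cause %d is in the mask\n", (int) found);
	return 0;
}

The trade-off is essentially the one described above: masks become plain
ints, while a single cause keeps a readable enum type.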
~
8b.
Does it make sense? You can't invalidate something that is already
invalid, so what does it even mean to have multiple simultaneous
ReplicationSlotInvalidationCause values? AFAICT it was only done like
this to CHECK for multiple **possible** causes, but this point is not
very clear.
Added comments at the top of InvalidateObsoleteReplicationSlots() to
clarify that it tries to invalidate slots for multiple possible causes
in a single pass, as explained in [1].
~
8c.
This introduces a side-effect that now the char *const
SlotInvalidationCauses[] array in slot.c will have 8 entries, half of
them NULL. Already mentioned elsewhere. And, this will get
increasingly worse if more invalidation reasons get added. 8,16,32,64
mostly unused entries etc...
This issue is now resolved by replacing SlotInvalidationCauses[] with
a new array of structures.
======
Attached are the v72 patches, addressing the above comments as well as
Vignesh's comments in [2].
- There are no new changes in patch-002.
[1]: /messages/by-id/CAA4eK1Jatoapf2NoX2nJOJ8k-RvEZM=MoFUvNWPz4rRR1simQw@mail.gmail.com
[2]: /messages/by-id/CALDaNm3wx8ihfkidveKuK=gGujS_yc9sEgq6ev-T+W3zeHM88g@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v72-0001-Introduce-inactive_timeout-based-replication-slo.patch
From 18464cdb551bdba04b48dd6498699931ad4f951e Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 3 Feb 2025 15:20:40 +0530
Subject: [PATCH v72 1/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not applicable
for slots that do not reserve WAL or for slots on the standby server
that are being synced from the primary server (i.e., standby slots
having 'synced' field 'true'). Synced slots are always considered to be
inactive because they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 42 +++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 7 +
src/backend/access/transam/xlog.c | 4 +-
src/backend/replication/slot.c | 259 ++++++++++++++----
src/backend/replication/slotfuncs.c | 2 +-
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 23 +-
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
14 files changed, 322 insertions(+), 67 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 38244409e3..a915a43625 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4423,6 +4423,48 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero (the default) disables the idle timeout
+ invalidation mechanism. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the
+ slot's <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not applicable
+ for slots that do not reserve WAL or for slots on the standby server
+ that are being synced from the primary server (i.e., standby slots
+ having <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>). Synced slots are always considered to
+ be inactive because they don't perform logical decoding to produce
+ changes. Slots that appear idle due to a disrupted connection between
+ the publisher and subscriber are also excluded, as they are managed by
+ <link linkend="guc-wal-sender-timeout"><varname>wal_sender_timeout</varname></link>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
<term><varname>wal_sender_timeout</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 613abcd28b..3d18e507bb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2390,6 +2390,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be81c2b51d..f58b9406e4 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2621,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9c270e7d46..3eaf0bf311 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7337,7 +7337,7 @@ CreateCheckPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
KeepLogSeg(recptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
@@ -7792,7 +7792,7 @@ CreateRestartPoint(int flags)
replayPtr = GetXLogReplayRecPtr(&replayTLI);
endptr = (receivePtr < replayPtr) ? replayPtr : receivePtr;
KeepLogSeg(endptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fe5acd8b1f..92e90743fc 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -102,18 +102,16 @@ typedef struct
/*
* Lookup table for slot invalidation causes.
*/
-const char *const SlotInvalidationCauses[] = {
- [RS_INVAL_NONE] = "none",
- [RS_INVAL_WAL_REMOVED] = "wal_removed",
- [RS_INVAL_HORIZON] = "rows_removed",
- [RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+const SlotInvalidationCauseMap InvalidationCauses[] = {
+ {RS_INVAL_NONE, "none"},
+ {RS_INVAL_WAL_REMOVED, "wal_removed"},
+ {RS_INVAL_HORIZON, "rows_removed"},
+ {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"},
+ {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"},
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
-
-StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
- "array length mismatch");
+#define RS_INVAL_MAX_CAUSES (sizeof(InvalidationCauses) / sizeof(InvalidationCauses[0]))
/* size of version independent data */
#define ReplicationSlotOnDiskConstantSize \
@@ -141,6 +139,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = 0;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -575,7 +579,7 @@ retry:
errmsg("can no longer access replication slot \"%s\"",
NameStr(s->data.name)),
errdetail("This replication slot has been invalidated due to \"%s\".",
- SlotInvalidationCauses[s->data.invalidated]));
+ GetSlotInvalidationCauseName(s->data.invalidated)));
}
/*
@@ -1512,12 +1516,18 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since,
+ TimestampTz now)
{
+ int minutes;
+ int secs;
+ long elapsed_secs;
StringInfoData err_detail;
- bool hint = false;
+ StringInfoData err_hint;
initStringInfo(&err_detail);
+ initStringInfo(&err_hint);
switch (cause)
{
@@ -1525,13 +1535,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
{
unsigned long long ex = oldestLSN - restart_lsn;
- hint = true;
appendStringInfo(&err_detail,
ngettext("The slot's restart_lsn %X/%X exceeds the limit by %llu byte.",
"The slot's restart_lsn %X/%X exceeds the limit by %llu bytes.",
ex),
LSN_FORMAT_ARGS(restart_lsn),
ex);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "max_slot_wal_keep_size");
break;
}
case RS_INVAL_HORIZON:
@@ -1542,6 +1554,24 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0 && now > 0);
+
+ /* Calculate the idle time duration of the slot */
+ elapsed_secs = (now - inactive_since) / USECS_PER_SEC;
+ minutes = elapsed_secs / SECS_PER_MINUTE;
+ secs = elapsed_secs % SECS_PER_MINUTE;
+
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time of %d minutes and %02d seconds exceeds the configured \"%s\" duration of %d minutes."),
+ minutes, secs, "idle_replication_slot_timeout",
+ idle_replication_slot_timeout_mins);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1553,9 +1583,31 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
errmsg("invalidating obsolete replication slot \"%s\"",
NameStr(slotname)),
errdetail_internal("%s", err_detail.data),
- hint ? errhint("You might need to increase \"%s\".", "max_slot_wal_keep_size") : 0);
+ err_hint.len ? errhint("%s", err_hint.data) : 0);
pfree(err_detail.data);
+ pfree(err_hint.data);
+}
+
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has reserved WAL
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server is in
+ * recovery. This is because synced slots are always considered to be
+ * inactive because they don't perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins != 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
}
/*
@@ -1585,6 +1637,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1592,6 +1645,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1602,6 +1656,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (cause & RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * Assign the current time here to avoid system call overhead
+ * while holding the spinlock in subsequent code.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1629,37 +1692,66 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
initial_catalog_effective_xmin = s->effective_catalog_xmin;
}
- switch (cause)
+ if (cause & RS_INVAL_WAL_REMOVED)
+ {
+ if (initial_restart_lsn != InvalidXLogRecPtr &&
+ initial_restart_lsn < oldestLSN)
+ {
+ invalidation_cause = RS_INVAL_WAL_REMOVED;
+ goto invalidation_marked;
+ }
+ }
+ if (cause & RS_INVAL_HORIZON)
+ {
+ if (!SlotIsLogical(s))
+ goto invalidation_marked;
+ /* invalid DB oid signals a shared relation */
+ if (dboid != InvalidOid && dboid != s->data.database)
+ goto invalidation_marked;
+ if (TransactionIdIsValid(initial_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_effective_xmin,
+ snapshotConflictHorizon))
+ {
+ invalidation_cause = RS_INVAL_HORIZON;
+ goto invalidation_marked;
+ }
+ else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
+ snapshotConflictHorizon))
+ {
+ invalidation_cause = RS_INVAL_HORIZON;
+ goto invalidation_marked;
+ }
+ }
+ if (cause & RS_INVAL_WAL_LEVEL)
{
- case RS_INVAL_WAL_REMOVED:
- if (initial_restart_lsn != InvalidXLogRecPtr &&
- initial_restart_lsn < oldestLSN)
- invalidation_cause = cause;
- break;
- case RS_INVAL_HORIZON:
- if (!SlotIsLogical(s))
- break;
- /* invalid DB oid signals a shared relation */
- if (dboid != InvalidOid && dboid != s->data.database)
- break;
- if (TransactionIdIsValid(initial_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = cause;
- else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = cause;
- break;
- case RS_INVAL_WAL_LEVEL:
- if (SlotIsLogical(s))
- invalidation_cause = cause;
- break;
- case RS_INVAL_NONE:
- pg_unreachable();
+ if (SlotIsLogical(s))
+ {
+ invalidation_cause = RS_INVAL_WAL_LEVEL;
+ goto invalidation_marked;
+ }
+ }
+ if (cause & RS_INVAL_IDLE_TIMEOUT)
+ {
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = RS_INVAL_IDLE_TIMEOUT;
+ inactive_since = s->inactive_since;
+ goto invalidation_marked;
+ }
}
}
+invalidation_marked:
+
/*
* The invalidation cause recorded previously should not change while
* the process owning the slot (if any) has been terminated.
@@ -1705,9 +1797,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
+ * max_slot_wal_keep_size is set to -1 and
+ * idle_replication_slot_timeout is set to 0 during the binary
+ * upgrade. See check_old_cluster_for_valid_slots() where we ensure
+ * that no invalidated before the upgrade.
*/
Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
@@ -1739,7 +1832,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since, now);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1785,7 +1879,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since, now);
/* done with this slot for now */
break;
@@ -1800,14 +1895,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Invalidate slots that require resources about to be removed.
*
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
+ *
+ * Note: This function attempts to invalidate the slot for multiple possible
+ * causes in a single pass, minimizing redundant iterations. The "cause"
+ * parameter can be a MASK representing one or more of the defined causes.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
@@ -1819,9 +1920,9 @@ InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
XLogRecPtr oldestLSN;
bool invalidated = false;
- Assert(cause != RS_INVAL_HORIZON || TransactionIdIsValid(snapshotConflictHorizon));
- Assert(cause != RS_INVAL_WAL_REMOVED || oldestSegno > 0);
- Assert(cause != RS_INVAL_NONE);
+ Assert(!(cause & RS_INVAL_HORIZON) || TransactionIdIsValid(snapshotConflictHorizon));
+ Assert(!(cause & RS_INVAL_WAL_REMOVED) || oldestSegno > 0);
+ Assert(!(cause & RS_INVAL_NONE));
if (max_replication_slots == 0)
return invalidated;
@@ -2428,18 +2529,43 @@ RestoreSlotFromDisk(const char *name)
ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *invalidation_reason)
{
- ReplicationSlotInvalidationCause cause;
ReplicationSlotInvalidationCause result = RS_INVAL_NONE;
bool found PG_USED_FOR_ASSERTS_ONLY = false;
+ int cause_idx;
Assert(invalidation_reason);
- for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
+ for (cause_idx = 0; cause_idx <= RS_INVAL_MAX_CAUSES; cause_idx++)
{
- if (strcmp(SlotInvalidationCauses[cause], invalidation_reason) == 0)
+ if (strcmp(InvalidationCauses[cause_idx].cause_name, invalidation_reason) == 0)
{
found = true;
- result = cause;
+ result = InvalidationCauses[cause_idx].cause;
+ break;
+ }
+ }
+
+ Assert(found);
+ return result;
+}
+
+/*
+ * Maps an ReplicationSlotInvalidationCause to the invalidation
+ * reason for a replication slot.
+ */
+const char *
+GetSlotInvalidationCauseName(ReplicationSlotInvalidationCause cause)
+{
+ const char *result = "none";
+ bool found PG_USED_FOR_ASSERTS_ONLY = false;
+ int cause_idx;
+
+ for (cause_idx = 0; cause_idx <= RS_INVAL_MAX_CAUSES; cause_idx++)
+ {
+ if (InvalidationCauses[cause_idx].cause == cause)
+ {
+ found = true;
+ result = InvalidationCauses[cause_idx].cause_name;
break;
}
}
@@ -2802,3 +2928,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 8be4b8c65b..f652ec8a73 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -431,7 +431,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
if (cause == RS_INVAL_NONE)
nulls[i++] = true;
else
- values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ values[i++] = CStringGetTextDatum(GetSlotInvalidationCauseName(cause));
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index ce7534d4d2..758329b1c1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ 0, 0, INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index c40b7a3121..f9a5561166 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -326,6 +326,7 @@
# (change requires restart)
#wal_keep_size = 0 # in megabytes; 0 disables
#max_slot_wal_keep_size = -1 # in megabytes; -1 disables
+#idle_replication_slot_timeout = 0 # in minutes; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 000c36d30d..653d60fd48 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -44,21 +44,28 @@ typedef enum ReplicationSlotPersistency
* Slots can be invalidated, e.g. due to max_slot_wal_keep_size. If so, the
* 'invalidated' field is set to a value other than _NONE.
*
- * When adding a new invalidation cause here, remember to update
- * SlotInvalidationCauses and RS_INVAL_MAX_CAUSES.
+ * When adding a new invalidation cause here, the value must be powers of 2
+ * (e.g., 1, 2, 4...) for proper bitwise operations. Also, remember to update
+ * SlotInvalidationCauseMap in slot.c.
*/
typedef enum ReplicationSlotInvalidationCause
{
- RS_INVAL_NONE,
+ RS_INVAL_NONE = 0x00,
/* required WAL has been removed */
- RS_INVAL_WAL_REMOVED,
+ RS_INVAL_WAL_REMOVED = 0x01,
/* required rows have been removed */
- RS_INVAL_HORIZON,
+ RS_INVAL_HORIZON = 0x02,
/* wal_level insufficient for slot */
- RS_INVAL_WAL_LEVEL,
+ RS_INVAL_WAL_LEVEL = 0x04,
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT = 0x08,
} ReplicationSlotInvalidationCause;
-extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
+typedef struct SlotInvalidationCauseMap
+{
+ int cause;
+ const char *cause_name;
+} SlotInvalidationCauseMap;
/*
* On-Disk data of a replication slot, preserved across restarts.
@@ -254,6 +261,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -303,6 +311,7 @@ extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *invalidation_reason);
+extern const char *GetSlotInvalidationCauseName(ReplicationSlotInvalidationCause cause);
extern bool SlotExistsInSyncStandbySlots(const char *slot_name);
extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..9963bddc0e 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -107,6 +107,9 @@ extern long TimestampDifferenceMilliseconds(TimestampTz start_time,
extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
--
2.34.1
v72-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patch
From 2646d9edd04ffd148a292f58c3359fb2802e83ff Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Thu, 6 Feb 2025 15:35:06 +0530
Subject: [PATCH v72 2/2] Add TAP test for slot invalidation based on inactive
timeout.
This test uses injection points to bypass the time overhead caused by the
idle_replication_slot_timeout GUC, which has a minimum value of one minute.
---
src/backend/replication/slot.c | 31 +++--
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 110 ++++++++++++++++++
3 files changed, 132 insertions(+), 10 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 92e90743fc..6006417866 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/builtins.h"
+#include "utils/injection_point.h"
#include "utils/guc_hooks.h"
#include "utils/varlena.h"
@@ -1735,17 +1736,27 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
Assert(now > 0);
- /*
- * Check if the slot needs to be invalidated due to
- * idle_replication_slot_timeout GUC.
- */
- if (CanInvalidateIdleSlot(s) &&
- TimestampDifferenceExceedsSeconds(s->inactive_since, now,
- idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ if (CanInvalidateIdleSlot(s))
{
- invalidation_cause = RS_INVAL_IDLE_TIMEOUT;
- inactive_since = s->inactive_since;
- goto invalidation_marked;
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ *
+ * To test idle timeout slot invalidation, if the
+ * "slot-timeout-inval" injection point is attached,
+ * immediately invalidate the slot.
+ */
+ if (
+#ifdef USE_INJECTION_POINTS
+ IS_INJECTION_POINT_ATTACHED("slot-timeout-inval") ||
+#endif
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = RS_INVAL_IDLE_TIMEOUT;
+ inactive_since = s->inactive_since;
+ goto invalidation_marked;
+ }
}
}
}
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..2392f24711
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,110 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation due to idle_timeout
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# This test depends on injection point that forces slot invalidation
+# due to idle_timeout. Enabling injections points requires
+# --enable-injection-points with configure or
+# -Dinjection_points=true with Meson.
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $node_name = $node->name;
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(
+ qr/invalidating obsolete replication slot \"$slot_name\"/, $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot_name to be set on node $node_name";
+}
+
+# ========================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical slot due to idle
+# timeout.
+
+# Initialize the node
+my $node = PostgreSQL::Test::Cluster->new('node');
+$node->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$node->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+idle_replication_slot_timeout = 1min
+});
+$node->start;
+
+# Create both streaming standby and logical slot
+$node->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'physical_slot', immediately_reserve := true);
+]);
+$node->safe_psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('logical_slot', 'test_decoding');}
+);
+
+my $log_offset = -s $node->logfile;
+
+# Register an injection point on the node to forcibly cause a slot
+# invalidation due to idle_timeout
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Check if the 'injection_points' extension is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+$node->safe_psql('postgres',
+ "SELECT injection_points_attach('slot-timeout-inval', 'error');");
+
+# Idle timeout slot invalidation occurs during a checkpoint, so run a
+# checkpoint to invalidate the slots.
+$node->safe_psql('postgres', "CHECKPOINT");
+
+# Wait for slots to become inactive. Note that since nobody has acquired the
+# slot yet, then if it has been invalidated that can only be due to the idle
+# timeout mechanism.
+wait_for_slot_invalidation($node, 'physical_slot', $log_offset);
+wait_for_slot_invalidation($node, 'logical_slot', $log_offset);
+
+# Check that the invalidated slot cannot be acquired
+my $node_name = $node->name;
+my ($result, $stdout, $stderr);
+($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('logical_slot', '0/1');
+]);
+ok( $stderr =~ /can no longer access replication slot "logical_slot"/,
+ "detected error upon trying to acquire invalidated slot on node")
+ or die
+ "could not detect error upon trying to acquire invalidated slot \"logical_slot\" on node";
+
+# Testcase end
+# =============================================================================
+
+done_testing();
--
2.34.1
On Friday, February 7, 2025 9:06 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
Attached v72 patches, addressed the above comments as well as Vignesh's
comments in [2].
- There are no new changes in patch-002.
Thanks for updating the patch. I have a few review comments:
1.
InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
I think the type of first parameter 'cause' is not appropriate anymore since
it's now a bitmap flag instead of an enum.
2.
-StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
- "array length mismatch");
+#define RS_INVAL_MAX_CAUSES (sizeof(InvalidationCauses) / sizeof(InvalidationCauses[0]))
I'd like to confirm whether the current value of RS_INVAL_MAX_CAUSES is correct.
Previously, the value was array_length - 1, while now it seems equal to the
array_length.
And ISTM we could directly call lengthof() here.
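To make the off-by-one concrete, here is a minimal, self-contained C sketch of
the point above; the array and macro names are illustrative placeholders, not
the patch's definitions. Because the lookup loops use an inclusive bound (<=),
the macro has to be the last valid index, i.e. the array length minus one:

#include <stdio.h>

/* lengthof(), as defined in c.h */
#define lengthof(array) (sizeof(array) / sizeof((array)[0]))

/* stand-in for the cause-name table in slot.c */
static const char *const causes[] = {
	"none", "wal_removed", "rows_removed",
	"wal_level_insufficient", "idle_timeout",
};

/* last valid index: lengthof(causes) - 1 == 4, not 5 */
#define MAX_CAUSE_INDEX (lengthof(causes) - 1)

int
main(void)
{
	/* inclusive loop bound, mirroring the lookup loops */
	for (size_t i = 0; i <= MAX_CAUSE_INDEX; i++)
		printf("%zu: %s\n", i, causes[i]);
	/* using lengthof(causes) as the inclusive bound would read past the end */
	return 0;
}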
3.
+ if (cause & RS_INVAL_HORIZON)
+ {
+ if (!SlotIsLogical(s))
+ goto invalidation_marked;
I am not sure if this logic is correct. Even if the slot would not be
invalidated due to RS_INVAL_HORIZON, we should continue to check other causes.
Besides, instead of using a goto, I personally prefer to move all these codes
into a separate function which would return a single invalidation cause.
4.
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
I think this change could trigger an unnecessary WAL position re-calculation when
slots are invalidated only due to RS_INVAL_IDLE_TIMEOUT. But since it would not
be a frequent operation, I am OK to leave it unless we have better ideas.
Best Regards,
Hou zj
Hi Nisha,
Some review comments for v72-0001.
======
GENERAL
My preference was to just keep the enum as per v70 for the *actual*
cause, and introduce a separate set of bit flags for *possible* causes
to be checked. This creates a clear code separation between the actual
and possible. It also eliminates the need to jump through hoops just
to map a cause to its name.
You wrote:
OTOH, keeping the enums as they are in v70, and defining new macros
for the very similar purpose could add unnecessary complexity to code
management.
Since both the enum and the bit flags would be defined in slot.h
adjacent to each other I don't foresee much complexity. I concede, a
dev might write code and accidentally muddle the enum instead of the
flag or vice versa but that's an example of sloppiness, not
complexity. Certainly there would be fewer necessary changes than what
are in the v72 patch due to all the cause/causename mappings. For
example,
slot.h - Now, introduces a NEW typedef SlotInvalidationCauseMap
slot.h - Now, need extern for NEW function GetSlotInvalidationCauseName
slot.c - Now, needed minor rewrite of GetSlotInvalidationCause instead
of leaving it as-is
slot.c - Now, needs a whole NEW looping function
GetSlotInvalidationCauseName instead of direct array index.
Several places now must call GetSlotInvalidationCauseName where
previously a simple direct array lookup was done
slot.c - NEW call in ReplicationSlotAcquire
slotfuncs.c - NEW call in pg_get_replication_slots
~
FWIW, I've attached a topup patch using my idea just to see what it
might look like. The result was 20 lines less code.
Anyway, YMMV.
======
Other review comments follow ...
src/backend/replication/slot.c
InvalidatePossiblyObsoleteSlot:
1.
Having all those 'gotos' seems like something best avoided. Can you
try removing them to see if it improves this function? IIUC you maybe
can try to get rid of all of them using logic like:
- assign invalidation_cause = NONE outside the loop
- loop until invalidation_cause != NONE
- include 'invalidation_cause == NONE' condition with all the bit flag checks
- reassign invalidation_cause = NONE in the racy place where you want
to continue the loop.
and instead just keep looping and checking while the
'invalidation_cause' remains NONE.
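To illustrate, a rough, self-contained C sketch of that loop shape follows; the
names and conditions are placeholders rather than the real checks in
InvalidatePossiblyObsoleteSlot():

#include <stdbool.h>
#include <stdio.h>

typedef enum
{
	CAUSE_NONE = 0,
	CAUSE_WAL_REMOVED = (1 << 0),
	CAUSE_IDLE_TIMEOUT = (1 << 1),
} Cause;

/* Pick the first applicable cause; retry once to mimic the racy case. */
static Cause
pick_cause(int possible_causes, bool wal_gone, bool idle_too_long)
{
	Cause		cause = CAUSE_NONE;	/* assigned NONE outside the loop */
	int			raced = 0;

	for (;;)
	{
		/* every bit-flag check is guarded by "cause == CAUSE_NONE" */
		if (cause == CAUSE_NONE &&
			(possible_causes & CAUSE_WAL_REMOVED) && wal_gone)
			cause = CAUSE_WAL_REMOVED;
		if (cause == CAUSE_NONE &&
			(possible_causes & CAUSE_IDLE_TIMEOUT) && idle_too_long)
			cause = CAUSE_IDLE_TIMEOUT;

		/* pretend the first marking attempt raced with another backend */
		if (cause != CAUSE_NONE && raced == 0)
		{
			raced = 1;
			cause = CAUSE_NONE;	/* reassign NONE and take another pass */
			continue;
		}
		break;					/* marked a cause, or nothing applies */
	}
	return cause;
}

int
main(void)
{
	printf("cause = %d\n",
		   pick_cause(CAUSE_WAL_REMOVED | CAUSE_IDLE_TIMEOUT, false, true));
	return 0;
}

The point is just that each cause gets tested only while 'invalidation_cause'
is still NONE, and the racy path resets it and continues instead of jumping to
a label.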
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v210-0001-ps-tmp-topup-nisha-v720001.txt
From c51d7b8eddfc991c5978052b8f99aad0f7330c5c Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 10 Feb 2025 16:39:00 +1100
Subject: [PATCH v210] ps-tmp-topup-nisha-v720001
---
src/backend/access/transam/xlog.c | 6 +--
src/backend/replication/slot.c | 73 ++++++++++++-------------------------
src/backend/replication/slotfuncs.c | 2 +-
src/backend/storage/ipc/standby.c | 2 +-
src/include/replication/slot.h | 32 ++++++++--------
5 files changed, 46 insertions(+), 69 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d583313..e59df20 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7316,7 +7316,7 @@ CreateCheckPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
KeepLogSeg(recptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
+ if (InvalidateObsoleteReplicationSlots(RS_CHECK_IF_INVAL_WAL_REMOVED | RS_CHECK_IF_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
@@ -7771,7 +7771,7 @@ CreateRestartPoint(int flags)
replayPtr = GetXLogReplayRecPtr(&replayTLI);
endptr = (receivePtr < replayPtr) ? replayPtr : receivePtr;
KeepLogSeg(endptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
+ if (InvalidateObsoleteReplicationSlots(RS_CHECK_IF_INVAL_WAL_REMOVED | RS_CHECK_IF_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
@@ -8512,7 +8512,7 @@ xlog_redo(XLogReaderState *record)
if (InRecovery && InHotStandby &&
xlrec.wal_level < WAL_LEVEL_LOGICAL &&
wal_level >= WAL_LEVEL_LOGICAL)
- InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL,
+ InvalidateObsoleteReplicationSlots(RS_CHECK_IF_INVAL_WAL_LEVEL,
0, InvalidOid,
InvalidTransactionId);
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index f433bf1..3c7c70d 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -102,16 +102,16 @@ typedef struct
/*
* Lookup table for slot invalidation causes.
*/
-const SlotInvalidationCauseMap InvalidationCauses[] = {
- {RS_INVAL_NONE, "none"},
- {RS_INVAL_WAL_REMOVED, "wal_removed"},
- {RS_INVAL_HORIZON, "rows_removed"},
- {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"},
- {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"},
+const char *const SlotInvalidationCauses[] = {
+ [RS_INVAL_NONE] = "none",
+ [RS_INVAL_WAL_REMOVED] = "wal_removed",
+ [RS_INVAL_HORIZON] = "rows_removed",
+ [RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+ [RS_INVAL_IDLE_TIMEOUT] = "idle_timeout",
};
-/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES (sizeof(InvalidationCauses) / sizeof(InvalidationCauses[0]))
+StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
+ "array length mismatch");
/* size of version independent data */
#define ReplicationSlotOnDiskConstantSize \
@@ -579,7 +579,7 @@ retry:
errmsg("can no longer access replication slot \"%s\"",
NameStr(s->data.name)),
errdetail("This replication slot has been invalidated due to \"%s\".",
- GetSlotInvalidationCauseName(s->data.invalidated)));
+ SlotInvalidationCauses[s->data.invalidated]));
}
/*
@@ -1630,7 +1630,7 @@ CanInvalidateIdleSlot(ReplicationSlot *s)
* for syscalls, so caller must restart if we return true.
*/
static bool
-InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
+InvalidatePossiblyObsoleteSlot(int possible_causes,
ReplicationSlot *s,
XLogRecPtr oldestLSN,
Oid dboid, TransactionId snapshotConflictHorizon,
@@ -1662,7 +1662,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
- if (cause & RS_INVAL_IDLE_TIMEOUT)
+ if (possible_causes & RS_CHECK_IF_INVAL_IDLE_TIMEOUT)
{
/*
* Assign the current time here to avoid system call overhead
@@ -1698,7 +1698,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
initial_catalog_effective_xmin = s->effective_catalog_xmin;
}
- if (cause & RS_INVAL_WAL_REMOVED)
+ if (possible_causes & RS_CHECK_IF_INVAL_WAL_REMOVED)
{
if (initial_restart_lsn != InvalidXLogRecPtr &&
initial_restart_lsn < oldestLSN)
@@ -1707,7 +1707,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
goto invalidation_marked;
}
}
- if (cause & RS_INVAL_HORIZON)
+ if (possible_causes & RS_CHECK_IF_INVAL_HORIZON)
{
if (!SlotIsLogical(s))
goto invalidation_marked;
@@ -1729,7 +1729,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
goto invalidation_marked;
}
}
- if (cause & RS_INVAL_WAL_LEVEL)
+ if (possible_causes & RS_CHECK_IF_INVAL_WAL_LEVEL)
{
if (SlotIsLogical(s))
{
@@ -1737,7 +1737,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
goto invalidation_marked;
}
}
- if (cause & RS_INVAL_IDLE_TIMEOUT)
+ if (possible_causes & RS_CHECK_IF_INVAL_IDLE_TIMEOUT)
{
Assert(now > 0);
@@ -1919,16 +1919,16 @@ invalidation_marked:
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
bool
-InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
+InvalidateObsoleteReplicationSlots(int possible_causes,
XLogSegNo oldestSegno, Oid dboid,
TransactionId snapshotConflictHorizon)
{
XLogRecPtr oldestLSN;
bool invalidated = false;
- Assert(!(cause & RS_INVAL_HORIZON) || TransactionIdIsValid(snapshotConflictHorizon));
- Assert(!(cause & RS_INVAL_WAL_REMOVED) || oldestSegno > 0);
- Assert(!(cause & RS_INVAL_NONE));
+ Assert(!(possible_causes & RS_CHECK_IF_INVAL_HORIZON) || TransactionIdIsValid(snapshotConflictHorizon));
+ Assert(!(possible_causes & RS_CHECK_IF_INVAL_WAL_REMOVED) || oldestSegno > 0);
+ Assert(possible_causes);
if (max_replication_slots == 0)
return invalidated;
@@ -1944,7 +1944,7 @@ restart:
if (!s->in_use)
continue;
- if (InvalidatePossiblyObsoleteSlot(cause, s, oldestLSN, dboid,
+ if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid,
snapshotConflictHorizon,
&invalidated))
{
@@ -2530,43 +2530,18 @@ RestoreSlotFromDisk(const char *name)
ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *invalidation_reason)
{
+ ReplicationSlotInvalidationCause cause;
ReplicationSlotInvalidationCause result = RS_INVAL_NONE;
bool found PG_USED_FOR_ASSERTS_ONLY = false;
- int cause_idx;
Assert(invalidation_reason);
- for (cause_idx = 0; cause_idx <= RS_INVAL_MAX_CAUSES; cause_idx++)
+ for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
{
- if (strcmp(InvalidationCauses[cause_idx].cause_name, invalidation_reason) == 0)
+ if (strcmp(SlotInvalidationCauses[cause], invalidation_reason) == 0)
{
found = true;
- result = InvalidationCauses[cause_idx].cause;
- break;
- }
- }
-
- Assert(found);
- return result;
-}
-
-/*
- * Maps an ReplicationSlotInvalidationCause to the invalidation
- * reason for a replication slot.
- */
-const char *
-GetSlotInvalidationCauseName(ReplicationSlotInvalidationCause cause)
-{
- const char *result = "none";
- bool found PG_USED_FOR_ASSERTS_ONLY = false;
- int cause_idx;
-
- for (cause_idx = 0; cause_idx <= RS_INVAL_MAX_CAUSES; cause_idx++)
- {
- if (InvalidationCauses[cause_idx].cause == cause)
- {
- found = true;
- result = InvalidationCauses[cause_idx].cause_name;
+ result = cause;
break;
}
}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index f652ec8..8be4b8c 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -431,7 +431,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
if (cause == RS_INVAL_NONE)
nulls[i++] = true;
else
- values[i++] = CStringGetTextDatum(GetSlotInvalidationCauseName(cause));
+ values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 2039062..d68a5a4a0 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -499,7 +499,7 @@ ResolveRecoveryConflictWithSnapshot(TransactionId snapshotConflictHorizon,
* reached, e.g. due to using a physical replication slot.
*/
if (wal_level >= WAL_LEVEL_LOGICAL && isCatalogRel)
- InvalidateObsoleteReplicationSlots(RS_INVAL_HORIZON, 0, locator.dbOid,
+ InvalidateObsoleteReplicationSlots(RS_CHECK_IF_INVAL_HORIZON, 0, locator.dbOid,
snapshotConflictHorizon);
}
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index c4a62bc..d8da3dd 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -44,28 +44,31 @@ typedef enum ReplicationSlotPersistency
* Slots can be invalidated, e.g. due to max_slot_wal_keep_size. If so, the
* 'invalidated' field is set to a value other than _NONE.
*
- * When adding a new invalidation cause here, the value must be powers of 2
- * (e.g., 1, 2, 4...) for proper bitwise operations. Also, remember to update
- * SlotInvalidationCauseMap in slot.c.
+ * When adding a new invalidation cause here, remember to update
+ * SlotInvalidationCauses and RS_INVAL_MAX_CAUSES.
*/
typedef enum ReplicationSlotInvalidationCause
{
- RS_INVAL_NONE = 0x00,
+ RS_INVAL_NONE,
/* required WAL has been removed */
- RS_INVAL_WAL_REMOVED = 0x01,
+ RS_INVAL_WAL_REMOVED,
/* required rows have been removed */
- RS_INVAL_HORIZON = 0x02,
+ RS_INVAL_HORIZON,
/* wal_level insufficient for slot */
- RS_INVAL_WAL_LEVEL = 0x04,
+ RS_INVAL_WAL_LEVEL,
/* idle slot timeout has occurred */
- RS_INVAL_IDLE_TIMEOUT = 0x08,
+ RS_INVAL_IDLE_TIMEOUT,
} ReplicationSlotInvalidationCause;
-typedef struct SlotInvalidationCauseMap
-{
- int cause;
- const char *cause_name;
-} SlotInvalidationCauseMap;
+#define RS_INVAL_MAX_CAUSES RS_INVAL_IDLE_TIMEOUT
+
+extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
+
+/* Bit flags for checking possible invalidation causes. */
+#define RS_CHECK_IF_INVAL_WAL_REMOVED (1U << RS_INVAL_WAL_REMOVED)
+#define RS_CHECK_IF_INVAL_HORIZON (1U << RS_INVAL_HORIZON)
+#define RS_CHECK_IF_INVAL_WAL_LEVEL (1U << RS_INVAL_WAL_LEVEL)
+#define RS_CHECK_IF_INVAL_IDLE_TIMEOUT (1U << RS_INVAL_IDLE_TIMEOUT)
/*
* On-Disk data of a replication slot, preserved across restarts.
@@ -277,7 +280,7 @@ extern void ReplicationSlotsComputeRequiredLSN(void);
extern XLogRecPtr ReplicationSlotsComputeLogicalRestartLSN(void);
extern bool ReplicationSlotsCountDBSlots(Oid dboid, int *nslots, int *nactive);
extern void ReplicationSlotsDropDBSlots(Oid dboid);
-extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
+extern bool InvalidateObsoleteReplicationSlots(int possible_cause,
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
@@ -294,7 +297,6 @@ extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *invalidation_reason);
-extern const char *GetSlotInvalidationCauseName(ReplicationSlotInvalidationCause cause);
extern bool SlotExistsInSyncStandbySlots(const char *slot_name);
extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
--
1.8.3.1
On Sat, Feb 8, 2025 at 12:28 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
3.
+ if (cause & RS_INVAL_HORIZON)
+ {
+ if (!SlotIsLogical(s))
+ goto invalidation_marked;
I am not sure if this logic is correct. Even if the slot would not be
invalidated due to RS_INVAL_HORIZON, we should continue to check other causes.
Doesn't this comment apply even to the next condition (if (dboid !=
InvalidOid && dboid != s->data.database))? We probably need to
continue to check other invalidation causes unless one is set.
Besides, instead of using a goto, I personally prefer to move all these codes
into a separate function which would return a single invalidation cause.
Instead of using goto label (invalidation_marked:), won't it be better
if we use a boolean invalidation_marked and convert all if's to if ..
else if .. else cases?
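A tiny, self-contained sketch of that alternative shape, with the three
conditions standing in for the real WAL/horizon/idle checks rather than the
patch's code:

#include <stdbool.h>
#include <stdio.h>

enum {INVAL_NONE, INVAL_WAL_REMOVED, INVAL_HORIZON, INVAL_IDLE_TIMEOUT};

int
main(void)
{
	bool		wal_removed = false;	/* placeholder checks */
	bool		horizon_conflict = false;
	bool		idle_expired = true;
	bool		invalidation_marked = false;
	int			cause = INVAL_NONE;

	if (wal_removed)
	{
		cause = INVAL_WAL_REMOVED;
		invalidation_marked = true;
	}
	else if (horizon_conflict)
	{
		cause = INVAL_HORIZON;
		invalidation_marked = true;
	}
	else if (idle_expired)
	{
		cause = INVAL_IDLE_TIMEOUT;
		invalidation_marked = true;
	}

	printf("marked = %d, cause = %d\n", invalidation_marked, cause);
	return 0;
}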
4.
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
I think this change could trigger an unnecessary WAL position re-calculation when
slots are invalidated only due to RS_INVAL_IDLE_TIMEOUT.
Why is that unnecessary? If some slots got invalidated due to timeout,
we don't want to retain the WAL corresponding to them.
--
With Regards,
Amit Kapila.
On Fri, Feb 7, 2025 at 4:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Feb 7, 2025 at 8:00 AM Peter Smith <smithpb2250@gmail.com> wrote:
======
src/backend/access/transam/xlog.c
1.
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
KeepLogSeg(recptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
@@ -7792,7 +7792,7 @@ CreateRestartPoint(int flags)
replayPtr = GetXLogReplayRecPtr(&replayTLI);
endptr = (receivePtr < replayPtr) ? replayPtr : receivePtr;
KeepLogSeg(endptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
It seems fundamentally strange to me to assign multiple simultaneous
causes like this. IMO you can't invalidate something that is invalid
already. I guess v71 was an attempt to implement Amit's:
The idea is to invalidate the slot either due to WAL_REMOVED or
IDLE_TIMEOUT in one go during the checkpoint instead of taking
multiple passes over the slots during the checkpoint. Feel free to
suggest if you can think of a better way to implement it.
Hi Amit,
My preference already suggested was to have a separation between the
concepts of *actual* causes (e.g. discrete enum values like in v70)
and *possible* causes to be checked (using #defines for bit flags).
My v72-0001 review [1] includes a top-up patch to show what doing it
this way might look like.
======
[1]: /messages/by-id/CAHut+Pupn_S0mrM2zB+FwAbPqVak7jwSjRhU3WyA18QC1HU__g@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Feb 10, 2025 at 11:33 AM Peter Smith <smithpb2250@gmail.com> wrote:
Some review comments for v72-0001.
======
GENERAL
My preference was to just keep the enum as per v70 for the *actual*
cause, and introduce a separate set of bit flags for *possible* causes
to be checked. This creates a clear code separation between the actual
and possible. It also eliminates the need to jump through hoops just
to map a cause to its name.
You wrote:
OTOH, keeping the enums as they are in v70, and defining new macros
for the very similar purpose could add unnecessary complexity to code
management.
Since both the enum and the bit flags would be defined in slot.h
adjacent to each other I don't foresee much complexity. I concede, a
dev might write code and accidentally muddle the enum instead of the
flag or vice versa but that's an example of sloppiness, not
complexity. Certainly there would be fewer necessary changes than what
are in the v72 patch due to all the cause/causename mappings. For
example,
slot.h - Now, introduces a NEW typedef SlotInvalidationCauseMap
slot.h - Now, need extern for NEW function GetSlotInvalidationCauseName
slot.c - Now, needed minor rewrite of GetSlotInvalidationCause instead
of leaving it as-is
slot.c - Now, needs a whole NEW looping function
GetSlotInvalidationCauseName instead of direct array index.
Several places now must call GetSlotInvalidationCauseName where
previously a simple direct array lookup was done
slot.c - NEW call in ReplicationSlotAcquire
slotfuncs.c - NEW call in pg_get_replication_slots
~
FWIW, I've attached a topup patch using my idea just to see what it
might look like. The result was 20 lines less code.
I don't like the idea of maintaining the same information in two
different ways (as enum and bit flags). We already have a few cases of
defining bit flags as part of enums like ScanOptions and relopt_kind,
so I feel following that model would be a better approach.
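For reference, a self-contained C sketch of that model; the member values here
simply mirror what the v73 patch later in the thread adopts:

#include <stdio.h>

/* bit-flag values defined directly as enum members, ScanOptions-style */
typedef enum ReplicationSlotInvalidationCause
{
	RS_INVAL_NONE = 0,
	RS_INVAL_WAL_REMOVED = (1 << 0),	/* required WAL has been removed */
	RS_INVAL_HORIZON = (1 << 1),		/* required rows have been removed */
	RS_INVAL_WAL_LEVEL = (1 << 2),		/* wal_level insufficient for slot */
	RS_INVAL_IDLE_TIMEOUT = (1 << 3),	/* idle slot timeout has occurred */
} ReplicationSlotInvalidationCause;

int
main(void)
{
	/* one OR'ed mask lets a single pass over the slots test several causes */
	int			possible_causes = RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT;

	printf("check wal_removed?  %d\n", (possible_causes & RS_INVAL_WAL_REMOVED) != 0);
	printf("check idle_timeout? %d\n", (possible_causes & RS_INVAL_IDLE_TIMEOUT) != 0);
	printf("check horizon?      %d\n", (possible_causes & RS_INVAL_HORIZON) != 0);
	return 0;
}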
--
With Regards,
Amit Kapila.
On Monday, February 10, 2025 2:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Feb 8, 2025 at 12:28 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
wrote:
3.
+ if (cause & RS_INVAL_HORIZON)
+ {
+ if (!SlotIsLogical(s))
+ goto invalidation_marked;
I am not sure if this logic is correct. Even if the slot would not be
invalidated due to RS_INVAL_HORIZON, we should continue to check other causes.
Doesn't this comment apply even to the next condition (if (dboid !=
InvalidOid && dboid != s->data.database))? We probably need to
continue to check other invalidation causes unless one is set.
Yes, both places need to be fixed.
Besides, instead of using a goto, I personally prefer to move all
these codes into a separate function which would return a single invalidation cause.
Instead of using goto label (invalidation_marked:), won't it be better if we use a
boolean invalidation_marked and convert all if's to if ..
else if .. else cases?
Yes, I think that would be better.
4.
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
I think this change could trigger an unnecessary WAL position
re-calculation when slots are invalidated only due to RS_INVAL_IDLE_TIMEOUT.
Why is that unnecessary? If some slots got invalidated due to timeout, we don't
want to retain the WAL corresponding to them.
Sorry, I mistakenly thought that the slot only protected dead tuples.
Please disregard this comment.
Best Regards,
Hou zj
On Sat, Feb 8, 2025 at 12:28 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
On Friday, February 7, 2025 9:06 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
Attached v72 patches, addressed the above comments as well as Vignesh's
comments in [2].
- There are no new changes in patch-002.
Thanks for updating the patch. I have a few review comments:
1.
InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
I think the type of first parameter 'cause' is not appropriate anymore since
it's now a bitmap flag instead of an enum.
Changed the type to 'int' and updated the name of 'cause' in both
InvalidateObsoleteReplicationSlots() and
InvalidatePossiblyObsoleteSlot(), as both now use the bitmap flag.
2.
-StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
- "array length mismatch");
+#define RS_INVAL_MAX_CAUSES (sizeof(InvalidationCauses) / sizeof(InvalidationCauses[0]))
I'd like to confirm whether the current value of RS_INVAL_MAX_CAUSES is correct.
Previously, the value was array_length - 1, while now it seems equal to the
array_length.
And ISTM we could directly call lengthof() here.
Done.
3.
+ if (cause & RS_INVAL_HORIZON)
+ {
+ if (!SlotIsLogical(s))
+ goto invalidation_marked;
I am not sure if this logic is correct. Even if the slot would not be
invalidated due to RS_INVAL_HORIZON, we should continue to check other causes.
Used goto here since we do not expect RS_INVAL_HORIZON to be combined
with any other "cause" and to keep the pgHead behavior.
However, with the bitflag approach, the code should be future-safe, so
replacing goto in v73 should handle this now.
Besides, instead of using a goto, I personally prefer to move all these codes
into a separate function which would return a single invalidation cause.
Done.
~~~~
Here are the v73 patches incorporating the comments above and the
subsequent comments from [1].
- patch 002 is rebased on 001 with no new changes.
[1]: /messages/by-id/CAA4eK1K+AMtGfD3WRK_ivdAeS-CBOUBKJbr-6ku175P1x=wk4g@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v73-0001-Introduce-inactive_timeout-based-replication-slo.patch
From f683dd0a5e3b9abe135c6996df4234a9c93debd8 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 3 Feb 2025 15:20:40 +0530
Subject: [PATCH v73 1/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky. Because the amount of WAL a database
generates, and the allocated storage per instance will vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not applicable
for slots that do not reserve WAL or for slots on the standby server
that are being synced from the primary server (i.e., standby slots
having 'synced' field 'true'). Synced slots are always considered to be
inactive because they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 42 +++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 7 +
src/backend/access/transam/xlog.c | 4 +-
src/backend/replication/slot.c | 251 +++++++++++++-----
src/backend/replication/slotfuncs.c | 2 +-
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 25 +-
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
14 files changed, 312 insertions(+), 71 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 38244409e3..a915a43625 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4423,6 +4423,48 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero (the default) disables the idle timeout
+ invalidation mechanism. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the
+ slot's <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not applicable
+ for slots that do not reserve WAL or for slots on the standby server
+ that are being synced from the primary server (i.e., standby slots
+ having <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>). Synced slots are always considered to
+ be inactive because they don't perform logical decoding to produce
+ changes. Slots that appear idle due to a disrupted connection between
+ the publisher and subscriber are also excluded, as they are managed by
+ <link linkend="guc-wal-sender-timeout"><varname>wal_sender_timeout</varname></link>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
<term><varname>wal_sender_timeout</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 613abcd28b..3d18e507bb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2390,6 +2390,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be81c2b51d..f58b9406e4 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2621,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9c270e7d46..3eaf0bf311 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7337,7 +7337,7 @@ CreateCheckPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
KeepLogSeg(recptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
@@ -7792,7 +7792,7 @@ CreateRestartPoint(int flags)
replayPtr = GetXLogReplayRecPtr(&replayTLI);
endptr = (receivePtr < replayPtr) ? replayPtr : receivePtr;
KeepLogSeg(endptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fe5acd8b1f..1b34d256a5 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -102,18 +102,16 @@ typedef struct
/*
* Lookup table for slot invalidation causes.
*/
-const char *const SlotInvalidationCauses[] = {
- [RS_INVAL_NONE] = "none",
- [RS_INVAL_WAL_REMOVED] = "wal_removed",
- [RS_INVAL_HORIZON] = "rows_removed",
- [RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+static const SlotInvalidationCauseMap InvalidationCauses[] = {
+ {RS_INVAL_NONE, "none"},
+ {RS_INVAL_WAL_REMOVED, "wal_removed"},
+ {RS_INVAL_HORIZON, "rows_removed"},
+ {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"},
+ {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"},
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
-
-StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
- "array length mismatch");
+#define RS_INVAL_MAX_CAUSES (lengthof(InvalidationCauses)-1)
/* size of version independent data */
#define ReplicationSlotOnDiskConstantSize \
@@ -141,6 +139,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = 0;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -575,7 +579,7 @@ retry:
errmsg("can no longer access replication slot \"%s\"",
NameStr(s->data.name)),
errdetail("This replication slot has been invalidated due to \"%s\".",
- SlotInvalidationCauses[s->data.invalidated]));
+ GetSlotInvalidationCauseName(s->data.invalidated)));
}
/*
@@ -1512,12 +1516,18 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since,
+ TimestampTz now)
{
+ int minutes;
+ int secs;
+ long elapsed_secs;
StringInfoData err_detail;
- bool hint = false;
+ StringInfoData err_hint;
initStringInfo(&err_detail);
+ initStringInfo(&err_hint);
switch (cause)
{
@@ -1525,13 +1535,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
{
unsigned long long ex = oldestLSN - restart_lsn;
- hint = true;
appendStringInfo(&err_detail,
ngettext("The slot's restart_lsn %X/%X exceeds the limit by %llu byte.",
"The slot's restart_lsn %X/%X exceeds the limit by %llu bytes.",
ex),
LSN_FORMAT_ARGS(restart_lsn),
ex);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "max_slot_wal_keep_size");
break;
}
case RS_INVAL_HORIZON:
@@ -1542,6 +1554,24 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0 && now > 0);
+
+ /* Calculate the idle time duration of the slot */
+ elapsed_secs = (now - inactive_since) / USECS_PER_SEC;
+ minutes = elapsed_secs / SECS_PER_MINUTE;
+ secs = elapsed_secs % SECS_PER_MINUTE;
+
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time of %d minutes and %02d seconds exceeds the configured \"%s\" duration of %d minutes."),
+ minutes, secs, "idle_replication_slot_timeout",
+ idle_replication_slot_timeout_mins);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1553,9 +1583,31 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
errmsg("invalidating obsolete replication slot \"%s\"",
NameStr(slotname)),
errdetail_internal("%s", err_detail.data),
- hint ? errhint("You might need to increase \"%s\".", "max_slot_wal_keep_size") : 0);
+ err_hint.len ? errhint("%s", err_hint.data) : 0);
pfree(err_detail.data);
+ pfree(err_hint.data);
+}
+
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has reserved WAL
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server is in
+ * recovery. This is because synced slots are always considered to be
+ * inactive because they don't perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins != 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
}
/*
@@ -1572,7 +1624,7 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
* for syscalls, so caller must restart if we return true.
*/
static bool
-InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
+InvalidatePossiblyObsoleteSlot(int possible_causes,
ReplicationSlot *s,
XLogRecPtr oldestLSN,
Oid dboid, TransactionId snapshotConflictHorizon,
@@ -1585,6 +1637,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1592,6 +1645,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1602,6 +1656,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (possible_causes & RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * Assign the current time here to avoid system call overhead
+ * while holding the spinlock in subsequent code.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1629,34 +1692,49 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
initial_catalog_effective_xmin = s->effective_catalog_xmin;
}
- switch (cause)
+ if (possible_causes & RS_INVAL_WAL_REMOVED)
+ {
+ if (initial_restart_lsn != InvalidXLogRecPtr &&
+ initial_restart_lsn < oldestLSN)
+ invalidation_cause = RS_INVAL_WAL_REMOVED;
+ }
+ if (invalidation_cause == RS_INVAL_NONE &&
+ (possible_causes & RS_INVAL_HORIZON))
+ {
+ if (SlotIsLogical(s) &&
+ /* invalid DB oid signals a shared relation */
+ (dboid == InvalidOid || dboid == s->data.database) &&
+ TransactionIdIsValid(initial_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_effective_xmin,
+ snapshotConflictHorizon))
+ invalidation_cause = RS_INVAL_HORIZON;
+ else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
+ snapshotConflictHorizon))
+ invalidation_cause = RS_INVAL_HORIZON;
+ }
+ if (invalidation_cause == RS_INVAL_NONE &&
+ (possible_causes & RS_INVAL_WAL_LEVEL))
{
- case RS_INVAL_WAL_REMOVED:
- if (initial_restart_lsn != InvalidXLogRecPtr &&
- initial_restart_lsn < oldestLSN)
- invalidation_cause = cause;
- break;
- case RS_INVAL_HORIZON:
- if (!SlotIsLogical(s))
- break;
- /* invalid DB oid signals a shared relation */
- if (dboid != InvalidOid && dboid != s->data.database)
- break;
- if (TransactionIdIsValid(initial_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = cause;
- else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = cause;
- break;
- case RS_INVAL_WAL_LEVEL:
- if (SlotIsLogical(s))
- invalidation_cause = cause;
- break;
- case RS_INVAL_NONE:
- pg_unreachable();
+ if (SlotIsLogical(s))
+ invalidation_cause = RS_INVAL_WAL_LEVEL;
+ }
+ if (invalidation_cause == RS_INVAL_NONE &&
+ (possible_causes & RS_INVAL_IDLE_TIMEOUT))
+ {
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = RS_INVAL_IDLE_TIMEOUT;
+ inactive_since = s->inactive_since;
+ }
}
}
@@ -1705,9 +1783,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
+ * max_slot_wal_keep_size is set to -1 and
+ * idle_replication_slot_timeout is set to 0 during the binary
+ * upgrade. See check_old_cluster_for_valid_slots() where we ensure
+ * that no invalidated before the upgrade.
*/
Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
@@ -1739,7 +1818,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since, now);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1785,7 +1865,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since, now);
/* done with this slot for now */
break;
@@ -1800,28 +1881,34 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Invalidate slots that require resources about to be removed.
*
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
+ *
+ * Note: This function attempts to invalidate the slot for multiple possible
+ * causes in a single pass, minimizing redundant iterations. The "cause"
+ * parameter can be a MASK representing one or more of the defined causes.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
bool
-InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
+InvalidateObsoleteReplicationSlots(int possible_causes,
XLogSegNo oldestSegno, Oid dboid,
TransactionId snapshotConflictHorizon)
{
XLogRecPtr oldestLSN;
bool invalidated = false;
- Assert(cause != RS_INVAL_HORIZON || TransactionIdIsValid(snapshotConflictHorizon));
- Assert(cause != RS_INVAL_WAL_REMOVED || oldestSegno > 0);
- Assert(cause != RS_INVAL_NONE);
+ Assert(!(possible_causes & RS_INVAL_HORIZON) || TransactionIdIsValid(snapshotConflictHorizon));
+ Assert(!(possible_causes & RS_INVAL_WAL_REMOVED) || oldestSegno > 0);
+ Assert(!(possible_causes & RS_INVAL_NONE));
if (max_replication_slots == 0)
return invalidated;
@@ -1837,7 +1924,7 @@ restart:
if (!s->in_use)
continue;
- if (InvalidatePossiblyObsoleteSlot(cause, s, oldestLSN, dboid,
+ if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid,
snapshotConflictHorizon,
&invalidated))
{
@@ -2428,18 +2515,43 @@ RestoreSlotFromDisk(const char *name)
ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *invalidation_reason)
{
- ReplicationSlotInvalidationCause cause;
ReplicationSlotInvalidationCause result = RS_INVAL_NONE;
bool found PG_USED_FOR_ASSERTS_ONLY = false;
+ int cause_idx;
Assert(invalidation_reason);
- for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
+ for (cause_idx = 0; cause_idx <= RS_INVAL_MAX_CAUSES; cause_idx++)
{
- if (strcmp(SlotInvalidationCauses[cause], invalidation_reason) == 0)
+ if (strcmp(InvalidationCauses[cause_idx].cause_name, invalidation_reason) == 0)
{
found = true;
- result = cause;
+ result = InvalidationCauses[cause_idx].cause;
+ break;
+ }
+ }
+
+ Assert(found);
+ return result;
+}
+
+/*
+ * Maps an ReplicationSlotInvalidationCause to the invalidation
+ * reason for a replication slot.
+ */
+const char *
+GetSlotInvalidationCauseName(ReplicationSlotInvalidationCause cause)
+{
+ const char *result = "none";
+ bool found PG_USED_FOR_ASSERTS_ONLY = false;
+ int cause_idx;
+
+ for (cause_idx = 0; cause_idx <= RS_INVAL_MAX_CAUSES; cause_idx++)
+ {
+ if (InvalidationCauses[cause_idx].cause == cause)
+ {
+ found = true;
+ result = InvalidationCauses[cause_idx].cause_name;
break;
}
}
@@ -2802,3 +2914,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 8be4b8c65b..f652ec8a73 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -431,7 +431,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
if (cause == RS_INVAL_NONE)
nulls[i++] = true;
else
- values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ values[i++] = CStringGetTextDatum(GetSlotInvalidationCauseName(cause));
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index ce7534d4d2..758329b1c1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ 0, 0, INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index c40b7a3121..f9a5561166 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -326,6 +326,7 @@
# (change requires restart)
#wal_keep_size = 0 # in megabytes; 0 disables
#max_slot_wal_keep_size = -1 # in megabytes; -1 disables
+#idle_replication_slot_timeout = 0 # in minutes; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by the checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 000c36d30d..ff608b85e0 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -44,21 +44,28 @@ typedef enum ReplicationSlotPersistency
* Slots can be invalidated, e.g. due to max_slot_wal_keep_size. If so, the
* 'invalidated' field is set to a value other than _NONE.
*
- * When adding a new invalidation cause here, remember to update
- * SlotInvalidationCauses and RS_INVAL_MAX_CAUSES.
+ * When adding a new invalidation cause here, the value must be a power of 2
+ * (e.g., 1, 2, 4...) for proper bitwise operations. Also, remember to update
+ * SlotInvalidationCauseMap in slot.c.
*/
typedef enum ReplicationSlotInvalidationCause
{
- RS_INVAL_NONE,
+ RS_INVAL_NONE = 0,
/* required WAL has been removed */
- RS_INVAL_WAL_REMOVED,
+ RS_INVAL_WAL_REMOVED = (1 << 0),
/* required rows have been removed */
- RS_INVAL_HORIZON,
+ RS_INVAL_HORIZON = (1 << 1),
/* wal_level insufficient for slot */
- RS_INVAL_WAL_LEVEL,
+ RS_INVAL_WAL_LEVEL = (1 << 2),
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT = (1 << 3),
} ReplicationSlotInvalidationCause;
-extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
+typedef struct SlotInvalidationCauseMap
+{
+ int cause;
+ const char *cause_name;
+} SlotInvalidationCauseMap;
/*
* On-Disk data of a replication slot, preserved across restarts.
@@ -254,6 +261,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -286,7 +294,7 @@ extern void ReplicationSlotsComputeRequiredLSN(void);
extern XLogRecPtr ReplicationSlotsComputeLogicalRestartLSN(void);
extern bool ReplicationSlotsCountDBSlots(Oid dboid, int *nslots, int *nactive);
extern void ReplicationSlotsDropDBSlots(Oid dboid);
-extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
+extern bool InvalidateObsoleteReplicationSlots(int possible_causes,
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
@@ -303,6 +311,7 @@ extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *invalidation_reason);
+extern const char *GetSlotInvalidationCauseName(ReplicationSlotInvalidationCause cause);
extern bool SlotExistsInSyncStandbySlots(const char *slot_name);
extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..9963bddc0e 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -107,6 +107,9 @@ extern long TimestampDifferenceMilliseconds(TimestampTz start_time,
extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
--
2.34.1
v73-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patch
From 87cf5eeb9d6fc1e21d192d1a94e58d43055ed510 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 10 Feb 2025 16:26:17 +0530
Subject: [PATCH v73 2/2] Add TAP test for slot invalidation based on inactive
timeout.
This test uses injection points to bypass the time overhead caused by the
idle_replication_slot_timeout GUC, which has a minimum value of one minute.
---
src/backend/replication/slot.c | 29 +++--
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 110 ++++++++++++++++++
3 files changed, 131 insertions(+), 9 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 1b34d256a5..d7040db955 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/builtins.h"
+#include "utils/injection_point.h"
#include "utils/guc_hooks.h"
#include "utils/varlena.h"
@@ -1724,16 +1725,26 @@ InvalidatePossiblyObsoleteSlot(int possible_causes,
{
Assert(now > 0);
- /*
- * Check if the slot needs to be invalidated due to
- * idle_replication_slot_timeout GUC.
- */
- if (CanInvalidateIdleSlot(s) &&
- TimestampDifferenceExceedsSeconds(s->inactive_since, now,
- idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ if (CanInvalidateIdleSlot(s))
{
- invalidation_cause = RS_INVAL_IDLE_TIMEOUT;
- inactive_since = s->inactive_since;
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ *
+ * To test idle timeout slot invalidation, if the
+ * "slot-timeout-inval" injection point is attached,
+ * immediately invalidate the slot.
+ */
+ if (
+#ifdef USE_INJECTION_POINTS
+ IS_INJECTION_POINT_ATTACHED("slot-timeout-inval") ||
+#endif
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = RS_INVAL_IDLE_TIMEOUT;
+ inactive_since = s->inactive_since;
+ }
}
}
}
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..2392f24711
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,110 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation due to idle_timeout
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# This test depends on an injection point that forces slot invalidation
+# due to idle_timeout. Enabling injection points requires
+# --enable-injection-points with configure or
+# -Dinjection_points=true with Meson.
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $node_name = $node->name;
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(
+ qr/invalidating obsolete replication slot \"$slot_name\"/, $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot_name to be set on node $node_name";
+}
+
+# ========================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical slot due to idle
+# timeout.
+
+# Initialize the node
+my $node = PostgreSQL::Test::Cluster->new('node');
+$node->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$node->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+idle_replication_slot_timeout = 1min
+});
+$node->start;
+
+# Create both streaming standby and logical slot
+$node->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'physical_slot', immediately_reserve := true);
+]);
+$node->safe_psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('logical_slot', 'test_decoding');}
+);
+
+my $log_offset = -s $node->logfile;
+
+# Register an injection point on the node to forcibly cause a slot
+# invalidation due to idle_timeout
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Check if the 'injection_points' extension is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+$node->safe_psql('postgres',
+ "SELECT injection_points_attach('slot-timeout-inval', 'error');");
+
+# Idle timeout slot invalidation occurs during a checkpoint, so run a
+# checkpoint to invalidate the slots.
+$node->safe_psql('postgres', "CHECKPOINT");
+
+# Wait for slots to become inactive. Note that since nobody has acquired the
+# slot yet, then if it has been invalidated that can only be due to the idle
+# timeout mechanism.
+wait_for_slot_invalidation($node, 'physical_slot', $log_offset);
+wait_for_slot_invalidation($node, 'logical_slot', $log_offset);
+
+# Check that the invalidated slot cannot be acquired
+my $node_name = $node->name;
+my ($result, $stdout, $stderr);
+($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('logical_slot', '0/1');
+]);
+ok( $stderr =~ /can no longer access replication slot "logical_slot"/,
+ "detected error upon trying to acquire invalidated slot on node")
+ or die
+ "could not detect error upon trying to acquire invalidated slot \"logical_slot\" on node";
+
+# Testcase end
+# =============================================================================
+
+done_testing();
--
2.34.1
On Mon, 10 Feb 2025 at 17:33, Nisha Moond <nisha.moond412@gmail.com> wrote:
Here are the v73 patches incorporating the comments above and the
subsequent comments from [1].
- patch 002 is rebased on 001 with no new changes.
Few comments:
1) For some reason SlotInvalidationCauses used to be exported with
PGDLLIMPORT, and that is removed now. PGDLLIMPORT is required if the symbol
needs to be accessible to loaded modules. Is there any impact, or is it OK?
-extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
+typedef struct SlotInvalidationCauseMap
+{
+ int cause;
+ const char *cause_name;
+} SlotInvalidationCauseMap;
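For illustration only, keeping the lookup table accessible to loaded modules
would presumably mean exporting the replacement array in the same way, e.g.:

	extern PGDLLIMPORT const SlotInvalidationCauseMap InvalidationCauses[];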
2) The new structure should be added to typedefs.list:
+typedef struct SlotInvalidationCauseMap
+{
+ int cause;
+ const char *cause_name;
+} SlotInvalidationCauseMap;
3) After adding it, you can run pgindent on slot.h to indent the following code:
+typedef struct SlotInvalidationCauseMap
+{
+ int cause;
+ const char *cause_name;
+} SlotInvalidationCauseMap;
Regards,
Vignesh
On Mon, Feb 10, 2025 at 6:12 PM vignesh C <vignesh21@gmail.com> wrote:
On Mon, 10 Feb 2025 at 17:33, Nisha Moond <nisha.moond412@gmail.com> wrote:
Here are the v73 patches incorporating the comments above and the
subsequent comments from [1].
- patch 002 is rebased on 001 with no new changes.
Addressed above comments, please find the attached v74 patches.
Also, corrected the type of parameter "possible_causes" to 'uint32' in
InvalidateObsoleteReplicationSlots() and
InvalidatePossiblyObsoleteSlot().
--
Thanks,
Nisha
Attachments:
v74-0001-Introduce-inactive_timeout-based-replication-slo.patch
From 4987776c0883d4a6a76bd5280ad09ae03e5a8faf Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 3 Feb 2025 15:20:40 +0530
Subject: [PATCH v74 1/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky, because the amount of WAL a database
generates and the allocated storage per instance vary
greatly in production, making it difficult to pin down a
one-size-fits-all value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not applicable
for slots that do not reserve WAL or for slots on the standby server
that are being synced from the primary server (i.e., standby slots
having 'synced' field 'true'). Synced slots are always considered to be
inactive because they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 42 +++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 7 +
src/backend/access/transam/xlog.c | 4 +-
src/backend/replication/slot.c | 251 +++++++++++++-----
src/backend/replication/slotfuncs.c | 2 +-
src/backend/utils/adt/timestamp.c | 18 ++
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 27 +-
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
src/tools/pgindent/typedefs.list | 1 +
15 files changed, 315 insertions(+), 71 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 38244409e3..a915a43625 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4423,6 +4423,48 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero (the default) disables the idle timeout
+ invalidation mechanism. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the
+ slot's <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not applicable
+ for slots that do not reserve WAL or for slots on the standby server
+ that are being synced from the primary server (i.e., standby slots
+ having <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>). Synced slots are always considered to
+ be inactive because they don't perform logical decoding to produce
+ changes. Slots that appear idle due to a disrupted connection between
+ the publisher and subscriber are also excluded, as they are managed by
+ <link linkend="guc-wal-sender-timeout"><varname>wal_sender_timeout</varname></link>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
<term><varname>wal_sender_timeout</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 613abcd28b..3d18e507bb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2390,6 +2390,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be81c2b51d..f58b9406e4 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2621,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9c270e7d46..3eaf0bf311 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7337,7 +7337,7 @@ CreateCheckPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
KeepLogSeg(recptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
@@ -7792,7 +7792,7 @@ CreateRestartPoint(int flags)
replayPtr = GetXLogReplayRecPtr(&replayTLI);
endptr = (receivePtr < replayPtr) ? replayPtr : receivePtr;
KeepLogSeg(endptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fe5acd8b1f..eccada9d4c 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -102,18 +102,16 @@ typedef struct
/*
* Lookup table for slot invalidation causes.
*/
-const char *const SlotInvalidationCauses[] = {
- [RS_INVAL_NONE] = "none",
- [RS_INVAL_WAL_REMOVED] = "wal_removed",
- [RS_INVAL_HORIZON] = "rows_removed",
- [RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+const SlotInvalidationCauseMap InvalidationCauses[] = {
+ {RS_INVAL_NONE, "none"},
+ {RS_INVAL_WAL_REMOVED, "wal_removed"},
+ {RS_INVAL_HORIZON, "rows_removed"},
+ {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"},
+ {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"},
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
-
-StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
- "array length mismatch");
+#define RS_INVAL_MAX_CAUSES (lengthof(InvalidationCauses)-1)
/* size of version independent data */
#define ReplicationSlotOnDiskConstantSize \
@@ -141,6 +139,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = 0;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -575,7 +579,7 @@ retry:
errmsg("can no longer access replication slot \"%s\"",
NameStr(s->data.name)),
errdetail("This replication slot has been invalidated due to \"%s\".",
- SlotInvalidationCauses[s->data.invalidated]));
+ GetSlotInvalidationCauseName(s->data.invalidated)));
}
/*
@@ -1512,12 +1516,18 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since,
+ TimestampTz now)
{
+ int minutes;
+ int secs;
+ long elapsed_secs;
StringInfoData err_detail;
- bool hint = false;
+ StringInfoData err_hint;
initStringInfo(&err_detail);
+ initStringInfo(&err_hint);
switch (cause)
{
@@ -1525,13 +1535,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
{
unsigned long long ex = oldestLSN - restart_lsn;
- hint = true;
appendStringInfo(&err_detail,
ngettext("The slot's restart_lsn %X/%X exceeds the limit by %llu byte.",
"The slot's restart_lsn %X/%X exceeds the limit by %llu bytes.",
ex),
LSN_FORMAT_ARGS(restart_lsn),
ex);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "max_slot_wal_keep_size");
break;
}
case RS_INVAL_HORIZON:
@@ -1542,6 +1554,24 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0 && now > 0);
+
+ /* Calculate the idle time duration of the slot */
+ elapsed_secs = (now - inactive_since) / USECS_PER_SEC;
+ minutes = elapsed_secs / SECS_PER_MINUTE;
+ secs = elapsed_secs % SECS_PER_MINUTE;
+
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time of %d minutes and %02d seconds exceeds the configured \"%s\" duration of %d minutes."),
+ minutes, secs, "idle_replication_slot_timeout",
+ idle_replication_slot_timeout_mins);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1553,9 +1583,31 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
errmsg("invalidating obsolete replication slot \"%s\"",
NameStr(slotname)),
errdetail_internal("%s", err_detail.data),
- hint ? errhint("You might need to increase \"%s\".", "max_slot_wal_keep_size") : 0);
+ err_hint.len ? errhint("%s", err_hint.data) : 0);
pfree(err_detail.data);
+ pfree(err_hint.data);
+}
+
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has reserved WAL
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server is in
+ * recovery. This is because synced slots are always considered to be
+ * inactive because they don't perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins != 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
}
/*
@@ -1572,7 +1624,7 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
* for syscalls, so caller must restart if we return true.
*/
static bool
-InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
+InvalidatePossiblyObsoleteSlot(uint32 possible_causes,
ReplicationSlot *s,
XLogRecPtr oldestLSN,
Oid dboid, TransactionId snapshotConflictHorizon,
@@ -1585,6 +1637,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1592,6 +1645,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1602,6 +1656,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (possible_causes & RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * Assign the current time here to avoid system call overhead
+ * while holding the spinlock in subsequent code.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1629,34 +1692,49 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
initial_catalog_effective_xmin = s->effective_catalog_xmin;
}
- switch (cause)
+ if (possible_causes & RS_INVAL_WAL_REMOVED)
+ {
+ if (initial_restart_lsn != InvalidXLogRecPtr &&
+ initial_restart_lsn < oldestLSN)
+ invalidation_cause = RS_INVAL_WAL_REMOVED;
+ }
+ if (invalidation_cause == RS_INVAL_NONE &&
+ (possible_causes & RS_INVAL_HORIZON))
+ {
+ if (SlotIsLogical(s) &&
+ /* invalid DB oid signals a shared relation */
+ (dboid == InvalidOid || dboid == s->data.database) &&
+ TransactionIdIsValid(initial_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_effective_xmin,
+ snapshotConflictHorizon))
+ invalidation_cause = RS_INVAL_HORIZON;
+ else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
+ snapshotConflictHorizon))
+ invalidation_cause = RS_INVAL_HORIZON;
+ }
+ if (invalidation_cause == RS_INVAL_NONE &&
+ (possible_causes & RS_INVAL_WAL_LEVEL))
{
- case RS_INVAL_WAL_REMOVED:
- if (initial_restart_lsn != InvalidXLogRecPtr &&
- initial_restart_lsn < oldestLSN)
- invalidation_cause = cause;
- break;
- case RS_INVAL_HORIZON:
- if (!SlotIsLogical(s))
- break;
- /* invalid DB oid signals a shared relation */
- if (dboid != InvalidOid && dboid != s->data.database)
- break;
- if (TransactionIdIsValid(initial_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = cause;
- else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = cause;
- break;
- case RS_INVAL_WAL_LEVEL:
- if (SlotIsLogical(s))
- invalidation_cause = cause;
- break;
- case RS_INVAL_NONE:
- pg_unreachable();
+ if (SlotIsLogical(s))
+ invalidation_cause = RS_INVAL_WAL_LEVEL;
+ }
+ if (invalidation_cause == RS_INVAL_NONE &&
+ (possible_causes & RS_INVAL_IDLE_TIMEOUT))
+ {
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = RS_INVAL_IDLE_TIMEOUT;
+ inactive_since = s->inactive_since;
+ }
}
}
@@ -1705,9 +1783,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
+ * max_slot_wal_keep_size is set to -1 and
+ * idle_replication_slot_timeout is set to 0 during the binary
+ * upgrade. See check_old_cluster_for_valid_slots() where we ensure
+ * that no slots are invalidated before the upgrade.
*/
Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
@@ -1739,7 +1818,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since, now);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1785,7 +1865,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since, now);
/* done with this slot for now */
break;
@@ -1800,28 +1881,34 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Invalidate slots that require resources about to be removed.
*
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
+ *
+ * Note: This function attempts to invalidate the slot for multiple possible
+ * causes in a single pass, minimizing redundant iterations. The
+ * "possible_causes" parameter is a bitmask of one or more of the defined causes.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
bool
-InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
+InvalidateObsoleteReplicationSlots(uint32 possible_causes,
XLogSegNo oldestSegno, Oid dboid,
TransactionId snapshotConflictHorizon)
{
XLogRecPtr oldestLSN;
bool invalidated = false;
- Assert(cause != RS_INVAL_HORIZON || TransactionIdIsValid(snapshotConflictHorizon));
- Assert(cause != RS_INVAL_WAL_REMOVED || oldestSegno > 0);
- Assert(cause != RS_INVAL_NONE);
+ Assert(!(possible_causes & RS_INVAL_HORIZON) || TransactionIdIsValid(snapshotConflictHorizon));
+ Assert(!(possible_causes & RS_INVAL_WAL_REMOVED) || oldestSegno > 0);
+ Assert(!(possible_causes & RS_INVAL_NONE));
if (max_replication_slots == 0)
return invalidated;
@@ -1837,7 +1924,7 @@ restart:
if (!s->in_use)
continue;
- if (InvalidatePossiblyObsoleteSlot(cause, s, oldestLSN, dboid,
+ if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid,
snapshotConflictHorizon,
&invalidated))
{
@@ -2428,18 +2515,43 @@ RestoreSlotFromDisk(const char *name)
ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *invalidation_reason)
{
- ReplicationSlotInvalidationCause cause;
ReplicationSlotInvalidationCause result = RS_INVAL_NONE;
bool found PG_USED_FOR_ASSERTS_ONLY = false;
+ int cause_idx;
Assert(invalidation_reason);
- for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
+ for (cause_idx = 0; cause_idx <= RS_INVAL_MAX_CAUSES; cause_idx++)
{
- if (strcmp(SlotInvalidationCauses[cause], invalidation_reason) == 0)
+ if (strcmp(InvalidationCauses[cause_idx].cause_name, invalidation_reason) == 0)
{
found = true;
- result = cause;
+ result = InvalidationCauses[cause_idx].cause;
+ break;
+ }
+ }
+
+ Assert(found);
+ return result;
+}
+
+/*
+ * Maps a ReplicationSlotInvalidationCause to the invalidation
+ * reason for a replication slot.
+ */
+const char *
+GetSlotInvalidationCauseName(ReplicationSlotInvalidationCause cause)
+{
+ const char *result = "none";
+ bool found PG_USED_FOR_ASSERTS_ONLY = false;
+ int cause_idx;
+
+ for (cause_idx = 0; cause_idx <= RS_INVAL_MAX_CAUSES; cause_idx++)
+ {
+ if (InvalidationCauses[cause_idx].cause == cause)
+ {
+ found = true;
+ result = InvalidationCauses[cause_idx].cause_name;
break;
}
}
@@ -2802,3 +2914,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 8be4b8c65b..f652ec8a73 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -431,7 +431,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
if (cause == RS_INVAL_NONE)
nulls[i++] = true;
else
- values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ values[i++] = CStringGetTextDatum(GetSlotInvalidationCauseName(cause));
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index ce7534d4d2..758329b1c1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ 0, 0, INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index c40b7a3121..f9a5561166 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -326,6 +326,7 @@
# (change requires restart)
#wal_keep_size = 0 # in megabytes; 0 disables
#max_slot_wal_keep_size = -1 # in megabytes; -1 disables
+#idle_replication_slot_timeout = 0 # in minutes; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by the checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 000c36d30d..161784b15b 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -44,21 +44,30 @@ typedef enum ReplicationSlotPersistency
* Slots can be invalidated, e.g. due to max_slot_wal_keep_size. If so, the
* 'invalidated' field is set to a value other than _NONE.
*
- * When adding a new invalidation cause here, remember to update
- * SlotInvalidationCauses and RS_INVAL_MAX_CAUSES.
+ * When adding a new invalidation cause here, the value must be a power of 2
+ * (e.g., 1, 2, 4...) for proper bitwise operations. Also, remember to update
+ * SlotInvalidationCauseMap in slot.c.
*/
typedef enum ReplicationSlotInvalidationCause
{
- RS_INVAL_NONE,
+ RS_INVAL_NONE = 0,
/* required WAL has been removed */
- RS_INVAL_WAL_REMOVED,
+ RS_INVAL_WAL_REMOVED = (1 << 0),
/* required rows have been removed */
- RS_INVAL_HORIZON,
+ RS_INVAL_HORIZON = (1 << 1),
/* wal_level insufficient for slot */
- RS_INVAL_WAL_LEVEL,
+ RS_INVAL_WAL_LEVEL = (1 << 2),
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT = (1 << 3),
} ReplicationSlotInvalidationCause;
-extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
+typedef struct SlotInvalidationCauseMap
+{
+ int cause;
+ const char *cause_name;
+} SlotInvalidationCauseMap;
+
+extern PGDLLIMPORT const SlotInvalidationCauseMap InvalidationCauses[];
/*
* On-Disk data of a replication slot, preserved across restarts.
@@ -254,6 +263,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -286,7 +296,7 @@ extern void ReplicationSlotsComputeRequiredLSN(void);
extern XLogRecPtr ReplicationSlotsComputeLogicalRestartLSN(void);
extern bool ReplicationSlotsCountDBSlots(Oid dboid, int *nslots, int *nactive);
extern void ReplicationSlotsDropDBSlots(Oid dboid);
-extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
+extern bool InvalidateObsoleteReplicationSlots(uint32 possible_causes,
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
@@ -303,6 +313,7 @@ extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *invalidation_reason);
+extern const char *GetSlotInvalidationCauseName(ReplicationSlotInvalidationCause cause);
extern bool SlotExistsInSyncStandbySlots(const char *slot_name);
extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..9963bddc0e 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -107,6 +107,9 @@ extern long TimestampDifferenceMilliseconds(TimestampTz start_time,
extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 9a3bee93de..af8eda5c85 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2687,6 +2687,7 @@ SkipPages
SlabBlock
SlabContext
SlabSlot
+SlotInvalidationCauseMap
SlotNumber
SlotSyncCtxStruct
SlruCtl
--
2.34.1
v74-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patch
From 809c29efdc2836f1d9c3c9db87c94dd60796a3b8 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 10 Feb 2025 16:26:17 +0530
Subject: [PATCH v74 2/2] Add TAP test for slot invalidation based on inactive
timeout.
This test uses injection points to bypass the time overhead caused by the
idle_replication_slot_timeout GUC, which has a minimum value of one minute.
---
src/backend/replication/slot.c | 29 +++--
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 110 ++++++++++++++++++
3 files changed, 131 insertions(+), 9 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index eccada9d4c..8e260272c4 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/builtins.h"
+#include "utils/injection_point.h"
#include "utils/guc_hooks.h"
#include "utils/varlena.h"
@@ -1724,16 +1725,26 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes,
{
Assert(now > 0);
- /*
- * Check if the slot needs to be invalidated due to
- * idle_replication_slot_timeout GUC.
- */
- if (CanInvalidateIdleSlot(s) &&
- TimestampDifferenceExceedsSeconds(s->inactive_since, now,
- idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ if (CanInvalidateIdleSlot(s))
{
- invalidation_cause = RS_INVAL_IDLE_TIMEOUT;
- inactive_since = s->inactive_since;
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ *
+ * To test idle timeout slot invalidation, if the
+ * "slot-timeout-inval" injection point is attached,
+ * immediately invalidate the slot.
+ */
+ if (
+#ifdef USE_INJECTION_POINTS
+ IS_INJECTION_POINT_ATTACHED("slot-timeout-inval") ||
+#endif
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ invalidation_cause = RS_INVAL_IDLE_TIMEOUT;
+ inactive_since = s->inactive_since;
+ }
}
}
}
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..2392f24711
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,110 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation due to idle_timeout
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# This test depends on an injection point that forces slot invalidation
+# due to idle_timeout. Enabling injection points requires
+# --enable-injection-points with configure or
+# -Dinjection_points=true with Meson.
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $node_name = $node->name;
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(
+ qr/invalidating obsolete replication slot \"$slot_name\"/, $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot_name to be set on node $node_name";
+}
+
+# ========================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical slot due to idle
+# timeout.
+
+# Initialize the node
+my $node = PostgreSQL::Test::Cluster->new('node');
+$node->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$node->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+idle_replication_slot_timeout = 1min
+});
+$node->start;
+
+# Create both streaming standby and logical slot
+$node->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'physical_slot', immediately_reserve := true);
+]);
+$node->safe_psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('logical_slot', 'test_decoding');}
+);
+
+my $log_offset = -s $node->logfile;
+
+# Register an injection point on the node to forcibly cause a slot
+# invalidation due to idle_timeout
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Check if the 'injection_points' extension is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+$node->safe_psql('postgres',
+ "SELECT injection_points_attach('slot-timeout-inval', 'error');");
+
+# Idle timeout slot invalidation occurs during a checkpoint, so run a
+# checkpoint to invalidate the slots.
+$node->safe_psql('postgres', "CHECKPOINT");
+
+# Wait for slots to become inactive. Note that since nobody has acquired the
+# slot yet, then if it has been invalidated that can only be due to the idle
+# timeout mechanism.
+wait_for_slot_invalidation($node, 'physical_slot', $log_offset);
+wait_for_slot_invalidation($node, 'logical_slot', $log_offset);
+
+# Check that the invalidated slot cannot be acquired
+my $node_name = $node->name;
+my ($result, $stdout, $stderr);
+($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('logical_slot', '0/1');
+]);
+ok( $stderr =~ /can no longer access replication slot "logical_slot"/,
+ "detected error upon trying to acquire invalidated slot on node")
+ or die
+ "could not detect error upon trying to acquire invalidated slot \"logical_slot\" on node";
+
+# Testcase end
+# =============================================================================
+
+done_testing();
--
2.34.1
Hi Nisha.
Some review comments about v74-0001
======
src/backend/replication/slot.c
1.
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
-
-StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
- "array length mismatch");
+#define RS_INVAL_MAX_CAUSES (lengthof(InvalidationCauses)-1)
The static assert was here to protect against dev mistakes in keeping
the lookup table up-to-date with the enum of slot.h. So it's not a
good idea to remove it...
IMO the RS_INVAL_MAX_CAUSES should be relocated to slot.h where the
enum is defined and where the devs know exactly how many invalidation
types there are. Then this static assert can be put back in to do its
job of ensuring the integrity properly again for this lookup table.
~~~
InvalidatePossiblyObsoleteSlot:
2.
+ if (possible_causes & RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * Assign the current time here to avoid system call overhead
+ * while holding the spinlock in subsequent code.
+ */
+ now = GetCurrentTimestamp();
+ }
+
I felt that any minuscule benefit gained from having this conditional
'now' assignment is outweighed by the subsequent confusion/doubt
caused by passing around a 'now' to other functions where it may or
may not still be zero depending on different processing. IMO we should
just remove all doubts and always assign it so that "now always means
now".
~~~
3.
+ if (possible_causes & RS_INVAL_IDLE_TIMEOUT)
IMO every bitwise check like this should also be checking
(invalidation_cause == RS_INVAL_NONE). Maybe you omitted it here
because this is the first but I think it will be safer anyhow in case
the code gets shuffled around in future and the extra condition gets
overlooked.
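That is, something like this (sketch only, following the pattern the patch
already uses for the other causes):

	if (invalidation_cause == RS_INVAL_NONE &&
		(possible_causes & RS_INVAL_IDLE_TIMEOUT))
	{
		/* idle-timeout check as in the v74 patch */
	}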
~~~
4.
+ if (possible_causes & RS_INVAL_WAL_REMOVED)
+ {
+ if (initial_restart_lsn != InvalidXLogRecPtr &&
+ initial_restart_lsn < oldestLSN)
+ invalidation_cause = RS_INVAL_WAL_REMOVED;
+ }
+ if (invalidation_cause == RS_INVAL_NONE &&
+ (possible_causes & RS_INVAL_HORIZON))
+ {
+ if (SlotIsLogical(s) &&
+ /* invalid DB oid signals a shared relation */
+ (dboid == InvalidOid || dboid == s->data.database) &&
+ TransactionIdIsValid(initial_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_effective_xmin,
+ snapshotConflictHorizon))
+ invalidation_cause = RS_INVAL_HORIZON;
+ else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
+ snapshotConflictHorizon))
+ invalidation_cause = RS_INVAL_HORIZON;
+ }
+ if (invalidation_cause == RS_INVAL_NONE &&
I suggest adding blank lines where the bit conditions are to delineate
each of the different invalidation checks.
~~~
InvalidateObsoleteReplicationSlots:
5
- Assert(cause != RS_INVAL_HORIZON ||
TransactionIdIsValid(snapshotConflictHorizon));
- Assert(cause != RS_INVAL_WAL_REMOVED || oldestSegno > 0);
- Assert(cause != RS_INVAL_NONE);
+ Assert(!(possible_causes & RS_INVAL_HORIZON) ||
TransactionIdIsValid(snapshotConflictHorizon));
+ Assert(!(possible_causes & RS_INVAL_WAL_REMOVED) || oldestSegno > 0);
+ Assert(!(possible_causes & RS_INVAL_NONE));
AFAIK the RS_INVAL_NONE is defined as 0, so doing bit-wise operations
on 0 seems bogus.
Do you mean just Assert(possible_causes != NONE);
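In other words, keep the first two asserts from the hunk above and simplify
only the last one (sketch):

	Assert(!(possible_causes & RS_INVAL_HORIZON) ||
		   TransactionIdIsValid(snapshotConflictHorizon));
	Assert(!(possible_causes & RS_INVAL_WAL_REMOVED) || oldestSegno > 0);
	Assert(possible_causes != RS_INVAL_NONE);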
======
src/include/replication/slot.h
6.
-extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
+typedef struct SlotInvalidationCauseMap
+{
+ int cause;
+ const char *cause_name;
+} SlotInvalidationCauseMap;
+
+extern PGDLLIMPORT const SlotInvalidationCauseMap InvalidationCauses[];
6a.
AFAIK, there is no longer any external access to this lookup table, so
why do you need this extern? Similarly, why is this typedef even here
instead of being declared in the slot.c module?
~
6b.
Why is the field 'cause' declared as int instead of
ReplicationSlotInvalidationCause?
======
Please see the attached top-up patch as a code example of some of my
suggestions above -- in particular the relocating of
RS_INVAL_MAX_CAUSES and the typedef, and the reinstating of the static
insert for the lookup table.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
PS_topup_for_v740001.txt
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index eccada9..2452eac 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -99,9 +99,16 @@ typedef struct
char slot_names[FLEXIBLE_ARRAY_MEMBER];
} SyncStandbySlotsConfigData;
+
/*
* Lookup table for slot invalidation causes.
*/
+typedef struct SlotInvalidationCauseMap
+{
+ ReplicationSlotInvalidationCause cause;
+ const char *cause_name;
+} SlotInvalidationCauseMap;
+
const SlotInvalidationCauseMap InvalidationCauses[] = {
{RS_INVAL_NONE, "none"},
{RS_INVAL_WAL_REMOVED, "wal_removed"},
@@ -110,8 +117,8 @@ const SlotInvalidationCauseMap InvalidationCauses[] = {
{RS_INVAL_IDLE_TIMEOUT, "idle_timeout"},
};
-/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES (lengthof(InvalidationCauses)-1)
+StaticAssertDecl(lengthof(InvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
+ "array length mismatch");
/* size of version independent data */
#define ReplicationSlotOnDiskConstantSize \
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 161784b..56ba48e 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -61,14 +61,8 @@ typedef enum ReplicationSlotInvalidationCause
RS_INVAL_IDLE_TIMEOUT = (1 << 3),
} ReplicationSlotInvalidationCause;
-typedef struct SlotInvalidationCauseMap
-{
- int cause;
- const char *cause_name;
-} SlotInvalidationCauseMap;
-
-extern PGDLLIMPORT const SlotInvalidationCauseMap InvalidationCauses[];
-
+/* Maximum number of invalidation causes */
+#define RS_INVAL_MAX_CAUSES 4
/*
* On-Disk data of a replication slot, preserved across restarts.
*/
On Monday, February 10, 2025 8:03 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
On Sat, Feb 8, 2025 at 12:28 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
wrote:

3.
+ if (cause & RS_INVAL_HORIZON)
+ {
+ if (!SlotIsLogical(s))
+ goto invalidation_marked;

I am not sure if this logic is correct. Even if the slot would not be
invalidated due to RS_INVAL_HORIZON, we should continue to check other
causes.
Used goto here since we do not expect RS_INVAL_HORIZON to be combined
with any other "cause" and to keep the pgHead behavior.
However, with the bitflag approach, the code should be future-safe, so
replacing goto in v73 should handle this now.
I think the following logic needs some adjustments.
+ if (invalidation_cause == RS_INVAL_NONE &&
+ (possible_causes & RS_INVAL_HORIZON))
+ {
+ if (SlotIsLogical(s) &&
+ /* invalid DB oid signals a shared relation */
+ (dboid == InvalidOid || dboid == s->data.database) &&
+ TransactionIdIsValid(initial_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_effective_xmin,
+ snapshotConflictHorizon))
+ invalidation_cause = RS_INVAL_HORIZON;
+ else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
+ snapshotConflictHorizon))
+ invalidation_cause = RS_INVAL_HORIZON;
+ }
I think we assign RS_INVAL_HORIZON to invalidation_cause only when the slot is
logical and the dboid is valid, but it is not guaranteed in the second if
condition ("else if (TransactionIdIsValid(initial_catalog_effective_xmin)").
Here is a top-up patch to fix this.
Best Regards,
Hou zj
Attachments:
0001-fix-if-condition.patchapplication/octet-stream; name=0001-fix-if-condition.patchDownload
From 186b0b8876efe6bfa4aab0368ec5f9082e57d99e Mon Sep 17 00:00:00 2001
From: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Date: Tue, 11 Feb 2025 12:01:36 +0800
Subject: [PATCH] fix if condition
---
src/backend/replication/slot.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index eccada9d4c5..28d69f916eb 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -1703,15 +1703,17 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes,
{
if (SlotIsLogical(s) &&
/* invalid DB oid signals a shared relation */
- (dboid == InvalidOid || dboid == s->data.database) &&
- TransactionIdIsValid(initial_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = RS_INVAL_HORIZON;
- else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = RS_INVAL_HORIZON;
+ (dboid == InvalidOid || dboid == s->data.database))
+ {
+ if (TransactionIdIsValid(initial_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_effective_xmin,
+ snapshotConflictHorizon))
+ invalidation_cause = RS_INVAL_HORIZON;
+ else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
+ snapshotConflictHorizon))
+ invalidation_cause = RS_INVAL_HORIZON;
+ }
}
if (invalidation_cause == RS_INVAL_NONE &&
(possible_causes & RS_INVAL_WAL_LEVEL))
--
2.30.0.windows.2
On Tue, Feb 11, 2025 at 8:49 AM Peter Smith <smithpb2250@gmail.com> wrote:
InvalidatePossiblyObsoleteSlot:
2.
+ if (possible_causes & RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * Assign the current time here to avoid system call overhead
+ * while holding the spinlock in subsequent code.
+ */
+ now = GetCurrentTimestamp();
+ }
+
I felt that any minuscule benefit gained from having this conditional
'now' assignment is outweighed by the subsequent confusion/doubt
caused by passing around a 'now' to other functions where it may or
may not still be zero depending on different processing. IMO we should
just remove all doubts and always assign it so that "now always means
now".
I think we shouldn't pass now to another function, but rather do all
required computations in the caller (probably in a separate inline
function). And keep the above code as is.
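To make that concrete, here is a minimal sketch of a caller-side inline
helper (the helper name and its exact placement are illustrative
assumptions, not taken from any posted patch):

/*
 * Sketch only: compute a slot's idle duration in the caller, so that
 * "now" never has to be handed to other functions.  USECS_PER_SEC and
 * SECS_PER_MINUTE are the existing timestamp macros.
 */
static inline void
GetSlotIdleDuration(TimestampTz inactive_since, TimestampTz now,
                    int *minutes, int *secs)
{
    long        elapsed_secs = (now - inactive_since) / USECS_PER_SEC;

    *minutes = elapsed_secs / SECS_PER_MINUTE;
    *secs = elapsed_secs % SECS_PER_MINUTE;
}

The v75 patch below follows this shape with its CalculateTimeDuration()
helper.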
--
With Regards,
Amit Kapila.
On Tue, Feb 11, 2025 at 8:49 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha.
Some review comments about v74-0001
======
src/backend/replication/slot.c
1.
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
-
-StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
- "array length mismatch");
+#define RS_INVAL_MAX_CAUSES (lengthof(InvalidationCauses)-1)
The static assert was here to protect against dev mistakes in keeping
the lookup table up-to-date with the enum of slot.h. So it's not a
good idea to remove it...
IMO the RS_INVAL_MAX_CAUSES should be relocated to slot.h where the
enum is defined and where the devs know exactly how many invalidation
types there are. Then this static assert can be put back in to do its
job of ensuring the integrity properly again for this lookup table.
How about keeping RS_INVAL_MAX_CAUSES dynamic in slot.c (as it was)
and updating the static assert to ensure the lookup table stays
up-to-date with the enums?
The change has been implemented in v75.
~~~
======
Here are v75 patches addressing comments in [1], [2] and [3].
- A new function, "EvaluateSlotInvalidationCause()", has been
introduced to separate the invalidation_cause evaluation logic from
InvalidatePossiblyObsoleteSlot().
- Also, another new inline function "CalculateTimeDuration()" added
as suggested in [3].
[1]: /messages/by-id/CAHut+PvsvHWoiEkGTP4NfVNsADsy-Jan3Dvp+_GW3gmPDHf5Qw@mail.gmail.com
[2]: /messages/by-id/OS0PR01MB57163889BE5F9F30DD3318A394FD2@OS0PR01MB5716.jpnprd01.prod.outlook.com
[3]: /messages/by-id/CAA4eK1LuvXa6sVj3xuLoe2X=0xjbJXrnJePbpXQZaTMws8pZqg@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v75-0001-Introduce-inactive_timeout-based-replication-slo.patchapplication/octet-stream; name=v75-0001-Introduce-inactive_timeout-based-replication-slo.patchDownload
From 879cbd8a9ceb9e82566b97daa1b2879c94493c6d Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 3 Feb 2025 15:20:40 +0530
Subject: [PATCH v75 1/2] Introduce inactive_timeout based replication slot
invalidation
Till now, postgres has the ability to invalidate inactive
replication slots based on the amount of WAL (set via
max_slot_wal_keep_size GUC) that will be needed for the slots in
case they become active. However, choosing a default value for
this GUC is a bit tricky, because the amount of WAL a database
generates and the allocated storage per instance vary greatly in
production, making it difficult to pin down a one-size-fits-all
value.
It is often easy for users to set a timeout of say 1 or 2 or n
days, after which all the inactive slots get invalidated. This
commit introduces a GUC named idle_replication_slot_timeout.
When set, postgres invalidates slots (during non-shutdown
checkpoints) that are idle for longer than this amount of
time.
Note that the idle timeout invalidation mechanism is not applicable
for slots that do not reserve WAL or for slots on the standby server
that are being synced from the primary server (i.e., standby slots
having 'synced' field 'true'). Synced slots are always considered to be
inactive because they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 42 +++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 7 +
src/backend/access/transam/xlog.c | 4 +-
src/backend/replication/slot.c | 318 ++++++++++++++----
src/backend/replication/slotfuncs.c | 2 +-
src/backend/utils/adt/timestamp.c | 18 +
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 21 +-
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
src/tools/pgindent/typedefs.list | 1 +
15 files changed, 375 insertions(+), 72 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 38244409e3..a915a43625 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4423,6 +4423,48 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero (the default) disables the idle timeout
+ invalidation mechanism. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the
+ slot's <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not applicable
+ for slots that do not reserve WAL or for slots on the standby server
+ that are being synced from the primary server (i.e., standby slots
+ having <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>). Synced slots are always considered to
+ be inactive because they don't perform logical decoding to produce
+ changes. Slots that appear idle due to a disrupted connection between
+ the publisher and subscriber are also excluded, as they are managed by
+ <link linkend="guc-wal-sender-timeout"><varname>wal_sender_timeout</varname></link>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
<term><varname>wal_sender_timeout</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 613abcd28b..3d18e507bb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2390,6 +2390,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be81c2b51d..f58b9406e4 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2621,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9c270e7d46..3eaf0bf311 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7337,7 +7337,7 @@ CreateCheckPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
KeepLogSeg(recptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
@@ -7792,7 +7792,7 @@ CreateRestartPoint(int flags)
replayPtr = GetXLogReplayRecPtr(&replayTLI);
endptr = (receivePtr < replayPtr) ? replayPtr : receivePtr;
KeepLogSeg(endptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fe5acd8b1f..7fbe9092e2 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -102,17 +102,30 @@ typedef struct
/*
* Lookup table for slot invalidation causes.
*/
-const char *const SlotInvalidationCauses[] = {
- [RS_INVAL_NONE] = "none",
- [RS_INVAL_WAL_REMOVED] = "wal_removed",
- [RS_INVAL_HORIZON] = "rows_removed",
- [RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+typedef struct SlotInvalidationCauseMap
+{
+ ReplicationSlotInvalidationCause cause;
+ const char *cause_name;
+} SlotInvalidationCauseMap;
+
+static const SlotInvalidationCauseMap InvalidationCauses[] = {
+ {RS_INVAL_NONE, "none"},
+ {RS_INVAL_WAL_REMOVED, "wal_removed"},
+ {RS_INVAL_HORIZON, "rows_removed"},
+ {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"},
+ {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"},
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES (lengthof(InvalidationCauses)-1)
-StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
+/*
+ * Ensure that the lookup table is up-to-date with the enums defined in
+ * ReplicationSlotInvalidationCause. Shifting 1 left by
+ * (RS_INVAL_MAX_CAUSES - 1) should give the highest defined value in
+ * the enum.
+ */
+StaticAssertDecl(RS_INVAL_IDLE_TIMEOUT == (1 << (RS_INVAL_MAX_CAUSES - 1)),
"array length mismatch");
/* size of version independent data */
@@ -141,6 +154,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = 0;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -575,7 +594,7 @@ retry:
errmsg("can no longer access replication slot \"%s\"",
NameStr(s->data.name)),
errdetail("This replication slot has been invalidated due to \"%s\".",
- SlotInvalidationCauses[s->data.invalidated]));
+ GetSlotInvalidationCauseName(s->data.invalidated)));
}
/*
@@ -1512,12 +1531,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ TimestampTz inactive_since,
+ int minutes, int secs)
{
StringInfoData err_detail;
- bool hint = false;
+ StringInfoData err_hint;
initStringInfo(&err_detail);
+ initStringInfo(&err_hint);
switch (cause)
{
@@ -1525,13 +1547,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
{
unsigned long long ex = oldestLSN - restart_lsn;
- hint = true;
appendStringInfo(&err_detail,
ngettext("The slot's restart_lsn %X/%X exceeds the limit by %llu byte.",
"The slot's restart_lsn %X/%X exceeds the limit by %llu bytes.",
ex),
LSN_FORMAT_ARGS(restart_lsn),
ex);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "max_slot_wal_keep_size");
break;
}
case RS_INVAL_HORIZON:
@@ -1542,6 +1566,19 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time of %d minutes and %02d seconds exceeds the configured \"%s\" duration of %d minutes."),
+ minutes, secs, "idle_replication_slot_timeout",
+ idle_replication_slot_timeout_mins);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1553,9 +1590,117 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
errmsg("invalidating obsolete replication slot \"%s\"",
NameStr(slotname)),
errdetail_internal("%s", err_detail.data),
- hint ? errhint("You might need to increase \"%s\".", "max_slot_wal_keep_size") : 0);
+ err_hint.len ? errhint("%s", err_hint.data) : 0);
pfree(err_detail.data);
+ pfree(err_hint.data);
+}
+
+/*
+ * Calculate time duration between two timestamps in minutes and seconds.
+ */
+static inline int
+CalculateTimeDuration(TimestampTz ts1, TimestampTz ts2, int *secs)
+{
+ int minutes;
+ long elapsed_secs;
+
+ elapsed_secs = (ts1 - ts2) / USECS_PER_SEC;
+ minutes = elapsed_secs / SECS_PER_MINUTE;
+ *secs = elapsed_secs % SECS_PER_MINUTE;
+
+ return minutes;
+}
+
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has reserved WAL
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server is in
+ * recovery. This is because synced slots are always considered to be
+ * inactive because they don't perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins != 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
+/*
+ * EvaluateSlotInvalidationCause - Evaluate the cause for which a slot becomes
+ * invalid among the given possible causes.
+ *
+ * This function sequentially checks all possible invalidation causes and
+ * returns the first one for which the slot is eligible for invalidation.
+ */
+static ReplicationSlotInvalidationCause
+EvaluateSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s,
+ XLogRecPtr oldestLSN, Oid dboid,
+ TransactionId snapshotConflictHorizon,
+ TransactionId initial_effective_xmin,
+ TransactionId initial_catalog_effective_xmin,
+ XLogRecPtr initial_restart_lsn,
+ TimestampTz *inactive_since, TimestampTz now)
+{
+ Assert(possible_causes != RS_INVAL_NONE);
+
+ if (possible_causes & RS_INVAL_WAL_REMOVED)
+ {
+ if (initial_restart_lsn != InvalidXLogRecPtr &&
+ initial_restart_lsn < oldestLSN)
+ return RS_INVAL_WAL_REMOVED;
+ }
+
+ if (possible_causes & RS_INVAL_HORIZON)
+ {
+ if (!SlotIsLogical(s))
+ return RS_INVAL_NONE;
+
+ /* invalid DB oid signals a shared relation */
+ if (dboid != InvalidOid && dboid != s->data.database)
+ return RS_INVAL_NONE;
+
+ if (TransactionIdIsValid(initial_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_effective_xmin,
+ snapshotConflictHorizon))
+ return RS_INVAL_HORIZON;
+ else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
+ snapshotConflictHorizon))
+ return RS_INVAL_HORIZON;
+ }
+
+ if (possible_causes & RS_INVAL_WAL_LEVEL)
+ {
+ if (SlotIsLogical(s))
+ return RS_INVAL_WAL_LEVEL;
+ }
+
+ if (possible_causes & RS_INVAL_IDLE_TIMEOUT)
+ {
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ *inactive_since = s->inactive_since;
+ return RS_INVAL_IDLE_TIMEOUT;
+ }
+ }
+
+ return RS_INVAL_NONE;
}
/*
@@ -1572,7 +1717,7 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
* for syscalls, so caller must restart if we return true.
*/
static bool
-InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
+InvalidatePossiblyObsoleteSlot(uint32 possible_causes,
ReplicationSlot *s,
XLogRecPtr oldestLSN,
Oid dboid, TransactionId snapshotConflictHorizon,
@@ -1585,6 +1730,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1592,6 +1738,9 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
+ int minutes = 0;
+ int secs = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1602,6 +1751,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (possible_causes & RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * Assign the current time here to avoid system call overhead
+ * while holding the spinlock in subsequent code.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1629,35 +1787,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
initial_catalog_effective_xmin = s->effective_catalog_xmin;
}
- switch (cause)
- {
- case RS_INVAL_WAL_REMOVED:
- if (initial_restart_lsn != InvalidXLogRecPtr &&
- initial_restart_lsn < oldestLSN)
- invalidation_cause = cause;
- break;
- case RS_INVAL_HORIZON:
- if (!SlotIsLogical(s))
- break;
- /* invalid DB oid signals a shared relation */
- if (dboid != InvalidOid && dboid != s->data.database)
- break;
- if (TransactionIdIsValid(initial_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = cause;
- else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = cause;
- break;
- case RS_INVAL_WAL_LEVEL:
- if (SlotIsLogical(s))
- invalidation_cause = cause;
- break;
- case RS_INVAL_NONE:
- pg_unreachable();
- }
+ invalidation_cause = EvaluateSlotInvalidationCause(possible_causes,
+ s, oldestLSN,
+ dboid,
+ snapshotConflictHorizon,
+ initial_effective_xmin,
+ initial_catalog_effective_xmin,
+ initial_restart_lsn,
+ &inactive_since,
+ now);
}
/*
@@ -1679,6 +1817,13 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
slotname = s->data.name;
active_pid = s->active_pid;
+ /*
+ * Calculate the idle time duration of the slot if slot is marked
+ * invalidated with RS_INVAL_IDLE_TIMEOUT.
+ */
+ if (invalidation_cause == RS_INVAL_IDLE_TIMEOUT && now != 0)
+ minutes = CalculateTimeDuration(now, inactive_since, &secs);
+
/*
* If the slot can be acquired, do so and mark it invalidated
* immediately. Otherwise we'll signal the owning process, below, and
@@ -1705,9 +1850,10 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
+ * max_slot_wal_keep_size is set to -1 and
+ * idle_replication_slot_timeout is set to 0 during the binary
+ * upgrade. See check_old_cluster_for_valid_slots() where we ensure
+ * that no invalidated before the upgrade.
*/
Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
@@ -1739,7 +1885,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since, minutes, secs);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1785,7 +1932,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ inactive_since, minutes, secs);
/* done with this slot for now */
break;
@@ -1800,28 +1948,34 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Invalidate slots that require resources about to be removed.
*
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
+ *
+ * Note: This function attempts to invalidate the slot for multiple possible
+ * causes in a single pass, minimizing redundant iterations. The
+ * "possible_causes" parameter can be a mask of one or more of the defined causes.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
bool
-InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
+InvalidateObsoleteReplicationSlots(uint32 possible_causes,
XLogSegNo oldestSegno, Oid dboid,
TransactionId snapshotConflictHorizon)
{
XLogRecPtr oldestLSN;
bool invalidated = false;
- Assert(cause != RS_INVAL_HORIZON || TransactionIdIsValid(snapshotConflictHorizon));
- Assert(cause != RS_INVAL_WAL_REMOVED || oldestSegno > 0);
- Assert(cause != RS_INVAL_NONE);
+ Assert(!(possible_causes & RS_INVAL_HORIZON) || TransactionIdIsValid(snapshotConflictHorizon));
+ Assert(!(possible_causes & RS_INVAL_WAL_REMOVED) || oldestSegno > 0);
+ Assert(possible_causes != RS_INVAL_NONE);
if (max_replication_slots == 0)
return invalidated;
@@ -1837,7 +1991,7 @@ restart:
if (!s->in_use)
continue;
- if (InvalidatePossiblyObsoleteSlot(cause, s, oldestLSN, dboid,
+ if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid,
snapshotConflictHorizon,
&invalidated))
{
@@ -2428,18 +2582,43 @@ RestoreSlotFromDisk(const char *name)
ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *invalidation_reason)
{
- ReplicationSlotInvalidationCause cause;
ReplicationSlotInvalidationCause result = RS_INVAL_NONE;
bool found PG_USED_FOR_ASSERTS_ONLY = false;
+ int cause_idx;
Assert(invalidation_reason);
- for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
+ for (cause_idx = 0; cause_idx <= RS_INVAL_MAX_CAUSES; cause_idx++)
{
- if (strcmp(SlotInvalidationCauses[cause], invalidation_reason) == 0)
+ if (strcmp(InvalidationCauses[cause_idx].cause_name, invalidation_reason) == 0)
{
found = true;
- result = cause;
+ result = InvalidationCauses[cause_idx].cause;
+ break;
+ }
+ }
+
+ Assert(found);
+ return result;
+}
+
+/*
+ * Maps a ReplicationSlotInvalidationCause to the invalidation
+ * reason for a replication slot.
+ */
+const char *
+GetSlotInvalidationCauseName(ReplicationSlotInvalidationCause cause)
+{
+ const char *result = "none";
+ bool found PG_USED_FOR_ASSERTS_ONLY = false;
+ int cause_idx;
+
+ for (cause_idx = 0; cause_idx <= RS_INVAL_MAX_CAUSES; cause_idx++)
+ {
+ if (InvalidationCauses[cause_idx].cause == cause)
+ {
+ found = true;
+ result = InvalidationCauses[cause_idx].cause_name;
break;
}
}
@@ -2802,3 +2981,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 8be4b8c65b..f652ec8a73 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -431,7 +431,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
if (cause == RS_INVAL_NONE)
nulls[i++] = true;
else
- values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ values[i++] = CStringGetTextDatum(GetSlotInvalidationCauseName(cause));
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index ce7534d4d2..758329b1c1 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3048,6 +3048,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ 0, 0, INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index c40b7a3121..f9a5561166 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -326,6 +326,7 @@
# (change requires restart)
#wal_keep_size = 0 # in megabytes; 0 disables
#max_slot_wal_keep_size = -1 # in megabytes; -1 disables
+#idle_replication_slot_timeout = 0 # in minutes; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index faf18ccf13..f2ef8d5ccc 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1438,6 +1438,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by the checkpointer process during the upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 000c36d30d..69eedddf2c 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -44,22 +44,23 @@ typedef enum ReplicationSlotPersistency
* Slots can be invalidated, e.g. due to max_slot_wal_keep_size. If so, the
* 'invalidated' field is set to a value other than _NONE.
*
- * When adding a new invalidation cause here, remember to update
- * SlotInvalidationCauses and RS_INVAL_MAX_CAUSES.
+ * When adding a new invalidation cause here, the value must be powers of 2
+ * (e.g., 1, 2, 4...) for proper bitwise operations. Also, remember to update
+ * SlotInvalidationCauseMap in slot.c.
*/
typedef enum ReplicationSlotInvalidationCause
{
- RS_INVAL_NONE,
+ RS_INVAL_NONE = 0,
/* required WAL has been removed */
- RS_INVAL_WAL_REMOVED,
+ RS_INVAL_WAL_REMOVED = (1 << 0),
/* required rows have been removed */
- RS_INVAL_HORIZON,
+ RS_INVAL_HORIZON = (1 << 1),
/* wal_level insufficient for slot */
- RS_INVAL_WAL_LEVEL,
+ RS_INVAL_WAL_LEVEL = (1 << 2),
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT = (1 << 3),
} ReplicationSlotInvalidationCause;
-extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
-
/*
* On-Disk data of a replication slot, preserved across restarts.
*/
@@ -254,6 +255,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -286,7 +288,7 @@ extern void ReplicationSlotsComputeRequiredLSN(void);
extern XLogRecPtr ReplicationSlotsComputeLogicalRestartLSN(void);
extern bool ReplicationSlotsCountDBSlots(Oid dboid, int *nslots, int *nactive);
extern void ReplicationSlotsDropDBSlots(Oid dboid);
-extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
+extern bool InvalidateObsoleteReplicationSlots(uint32 possible_causes,
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
@@ -303,6 +305,7 @@ extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *invalidation_reason);
+extern const char *GetSlotInvalidationCauseName(ReplicationSlotInvalidationCause cause);
extern bool SlotExistsInSyncStandbySlots(const char *slot_name);
extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..9963bddc0e 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -107,6 +107,9 @@ extern long TimestampDifferenceMilliseconds(TimestampTz start_time,
extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 656ecd919d..51f9a681b6 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2687,6 +2687,7 @@ SkipPages
SlabBlock
SlabContext
SlabSlot
+SlotInvalidationCauseMap
SlotNumber
SlotSyncCtxStruct
SlruCtl
--
2.34.1
v75-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patchapplication/octet-stream; name=v75-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patchDownload
From a76926e41d66c09df46fe2932286f95b8027aa94 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Tue, 11 Feb 2025 17:26:15 +0530
Subject: [PATCH v75 2/2] Add TAP test for slot invalidation based on inactive
timeout.
This test uses injection points to bypass the time overhead caused by the
idle_replication_slot_timeout GUC, which has a minimum value of one minute.
---
src/backend/replication/slot.c | 29 +++--
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 110 ++++++++++++++++++
3 files changed, 131 insertions(+), 9 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 7fbe9092e2..b03b4696a6 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/builtins.h"
+#include "utils/injection_point.h"
#include "utils/guc_hooks.h"
#include "utils/varlena.h"
@@ -1687,16 +1688,26 @@ EvaluateSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s,
{
Assert(now > 0);
- /*
- * Check if the slot needs to be invalidated due to
- * idle_replication_slot_timeout GUC.
- */
- if (CanInvalidateIdleSlot(s) &&
- TimestampDifferenceExceedsSeconds(s->inactive_since, now,
- idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ if (CanInvalidateIdleSlot(s))
{
- *inactive_since = s->inactive_since;
- return RS_INVAL_IDLE_TIMEOUT;
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ *
+ * To test idle timeout slot invalidation, if the
+ * "slot-timeout-inval" injection point is attached, immediately
+ * invalidate the slot.
+ */
+ if (
+#ifdef USE_INJECTION_POINTS
+ IS_INJECTION_POINT_ATTACHED("slot-timeout-inval") ||
+#endif
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ *inactive_since = s->inactive_since;
+ return RS_INVAL_IDLE_TIMEOUT;
+ }
}
}
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..2392f24711
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,110 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation due to idle_timeout
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# This test depends on an injection point that forces slot invalidation
+# due to idle_timeout. Enabling injection points requires
+# --enable-injection-points with configure or
+# -Dinjection_points=true with Meson.
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $node_name = $node->name;
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(
+ qr/invalidating obsolete replication slot \"$slot_name\"/, $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot_name to be set on node $node_name";
+}
+
+# ========================================================================
+# Testcase start
+#
+# Test invalidation of streaming standby slot and logical slot due to idle
+# timeout.
+
+# Initialize the node
+my $node = PostgreSQL::Test::Cluster->new('node');
+$node->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$node->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+idle_replication_slot_timeout = 1min
+});
+$node->start;
+
+# Create both streaming standby and logical slot
+$node->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'physical_slot', immediately_reserve := true);
+]);
+$node->safe_psql('postgres',
+ q{SELECT pg_create_logical_replication_slot('logical_slot', 'test_decoding');}
+);
+
+my $log_offset = -s $node->logfile;
+
+# Register an injection point on the node to forcibly cause a slot
+# invalidation due to idle_timeout
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Check if the 'injection_points' extension is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+$node->safe_psql('postgres',
+ "SELECT injection_points_attach('slot-timeout-inval', 'error');");
+
+# Idle timeout slot invalidation occurs during a checkpoint, so run a
+# checkpoint to invalidate the slots.
+$node->safe_psql('postgres', "CHECKPOINT");
+
+# Wait for the slots to be invalidated. Note that since nobody has acquired
+# these slots yet, any invalidation can only be due to the idle timeout
+# mechanism.
+wait_for_slot_invalidation($node, 'physical_slot', $log_offset);
+wait_for_slot_invalidation($node, 'logical_slot', $log_offset);
+
+# Check that the invalidated slot cannot be acquired
+my $node_name = $node->name;
+my ($result, $stdout, $stderr);
+($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('logical_slot', '0/1');
+]);
+ok( $stderr =~ /can no longer access replication slot "logical_slot"/,
+ "detected error upon trying to acquire invalidated slot on node")
+ or die
+ "could not detect error upon trying to acquire invalidated slot \"logical_slot\" on node";
+
+# Testcase end
+# =============================================================================
+
+done_testing();
--
2.34.1
On Tue, Feb 11, 2025 at 11:42 AM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
On Monday, February 10, 2025 8:03 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
On Sat, Feb 8, 2025 at 12:28 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
wrote:
3.
+ if (cause & RS_INVAL_HORIZON)
+ {
+ if (!SlotIsLogical(s))
+ goto invalidation_marked;
I am not sure if this logic is correct. Even if the slot would not be
invalidated due to RS_INVAL_HORIZON, we should continue to check other
causes.
Used goto here since we do not expect RS_INVAL_HORIZON to be combined
with any other "cause" and to keep the pgHead behavior.
However, with the bitflag approach, the code should be future-safe, so
replacing goto in v73 should handle this now.
I think the following logic needs some adjustments.
+ if (invalidation_cause == RS_INVAL_NONE &&
+ (possible_causes & RS_INVAL_HORIZON))
+ {
+ if (SlotIsLogical(s) &&
+ /* invalid DB oid signals a shared relation */
+ (dboid == InvalidOid || dboid == s->data.database) &&
+ TransactionIdIsValid(initial_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_effective_xmin,
+ snapshotConflictHorizon))
+ invalidation_cause = RS_INVAL_HORIZON;
+ else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
+ snapshotConflictHorizon))
+ invalidation_cause = RS_INVAL_HORIZON;
+ }
I think we assign RS_INVAL_HORIZON to invalidation_cause only when the slot is
logical and the dboid is valid, but it is not guaranteed in the second if
condition ("else if (TransactionIdIsValid(initial_catalog_effective_xmin)").Here is a top-up patch to fix this.
Thank you for reviewing and providing the fix! v75 addresses this bug
with a slightly different approach after introducing the new function
EvaluateSlotInvalidationCause().
--
Thanks,
Nisha
Hello,
I find this proposed patch a bit strange and I feel it needs more
explanation.
When this thread started, Bharath justified his patches saying that a
slot that's inactive for a very long time could be problematic because
of XID wraparound. Fine, that sounds a reasonable feature. If you
wanted to invalidate slots whose xmins were too old, I would support
that. He submitted that as his 0004 patch then.
However, he also chose to submit 0003 with invalidation based on a
timeout. This is far less convincing a feature to me. The
justification for the time out seems to be that ... it's difficult to
have a one-size-fits-all value because size of disks vary. (???)
Or something like that. Really? I mean -- yes, this will prevent
problems in toy databases when run in developer's laptops. It will not
prevent any problems in production databases. Do we really want a
setting that is only useful for toy situations rather than production?
Anyway, the thread is way too long, but after some initial pieces were
committed, Nisha took over and submitted patches derived from Bharath's
0003, and at some point the initial 0004 was dropped. But 0004 was the
more useful one, I thought, so what's going on?
I'm baffled.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Officer Krupke, what are we to do?
Gee, officer Krupke, Krup you! (West Side Story, "Gee, Officer Krupke")
On Tue, Feb 11, 2025 at 03:22:49PM +0100, Álvaro Herrera wrote:
I find this proposed patch a bit strange and I feel it needs more
explanation.
When this thread started, Bharath justified his patches saying that a
slot that's inactive for a very long time could be problematic because
of XID wraparound. Fine, that sounds a reasonable feature. If you
wanted to invalidate slots whose xmins were too old, I would support
that. He submitted that as his 0004 patch then.
However, he also chose to submit 0003 with invalidation based on a
timeout. This is far less convincing a feature to me. The
justification for the time out seems to be that ... it's difficult to
have a one-size-fits-all value because size of disks vary. (???)
Or something like that. Really? I mean -- yes, this will prevent
problems in toy databases when run in developer's laptops. It will not
prevent any problems in production databases. Do we really want a
setting that is only useful for toy situations rather than production?
Anyway, the thread is way too long, but after some initial pieces were
committed, Nisha took over and submitting patches derived from Bharath's
0003, and at some point the initial 0004 was dropped. But 0004 was the
more useful one, I thought, so what's going on?
I'm baffled.
I agree, and I am also baffled because I think this discussion has happened
at least once already on this thread. I still feel like the XID-based
parameter makes more sense. For replication slots, two primary concerns
are 1) storage, for which we have max_slot_wal_keep_size and 2) XID
wraparound, for which we don't really have anything today. A timeout might
be useful in some contexts, but if the goal is to prevent wraparound, why
not target that directly?
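As a rough illustration of targeting the XID age directly, the check
could look something like the sketch below. This is only an
illustration of the idea, not Bharath's original 0004 patch; the
function name, the max_xid_age parameter, and the simplified wraparound
handling are all assumptions.

/*
 * Sketch only: has this slot's xmin or catalog_xmin aged beyond a
 * hypothetical max_xid_age limit?  A value of 0 disables the check.
 */
static bool
SlotXidAgeExceeded(ReplicationSlot *s, int max_xid_age)
{
    TransactionId cutoff;

    if (max_xid_age == 0)
        return false;

    /* Step back max_xid_age XIDs from the next XID, skipping special XIDs */
    cutoff = ReadNextTransactionId() - max_xid_age;
    if (!TransactionIdIsNormal(cutoff))
        cutoff = FirstNormalTransactionId;

    return (TransactionIdIsNormal(s->data.xmin) &&
            TransactionIdPrecedes(s->data.xmin, cutoff)) ||
           (TransactionIdIsNormal(s->data.catalog_xmin) &&
            TransactionIdPrecedes(s->data.catalog_xmin, cutoff));
}

Something of this shape could be called from the same checkpoint-time
invalidation path that the timeout patch already extends.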
--
nathan
On Wed, Feb 12, 2025 at 12:36 AM Nisha Moond <nisha.moond412@gmail.com> wrote:
On Tue, Feb 11, 2025 at 8:49 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Nisha.
Some review comments about v74-0001
======
src/backend/replication/slot.c
1.
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
-
-StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
- "array length mismatch");
+#define RS_INVAL_MAX_CAUSES (lengthof(InvalidationCauses)-1)
The static assert was here to protect against dev mistakes in keeping
the lookup table up-to-date with the enum of slot.h. So it's not a
good idea to remove it...
IMO the RS_INVAL_MAX_CAUSES should be relocated to slot.h where the
enum is defined and where the devs know exactly how many invalidation
types there are. Then this static assert can be put back in to do its
job of ensuring the integrity properly again for this lookup table.
How about keeping RS_INVAL_MAX_CAUSES dynamic in slot.c (as it was)
and updating the static assert to ensure the lookup table stays
up-to-date with the enums?
The change has been implemented in v75.
Latest v75-0001 patch code looks like:
+static const SlotInvalidationCauseMap InvalidationCauses[] = {
+ {RS_INVAL_NONE, "none"},
+ {RS_INVAL_WAL_REMOVED, "wal_removed"},
+ {RS_INVAL_HORIZON, "rows_removed"},
+ {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"},
+ {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"},
};
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
+#define RS_INVAL_MAX_CAUSES (lengthof(InvalidationCauses)-1)
-StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
+/*
+ * Ensure that the lookup table is up-to-date with the enums defined in
+ * ReplicationSlotInvalidationCause. Shifting 1 left by
+ * (RS_INVAL_MAX_CAUSES - 1) should give the highest defined value in
+ * the enum.
+ */
+StaticAssertDecl(RS_INVAL_IDLE_TIMEOUT == (1 << (RS_INVAL_MAX_CAUSES - 1)),
"array length mismatch");
Where:
1. RS_INVAL_MAX_CAUSES is based on the length of lookup table so it is 4
2. the StaticAssert then confirms that the enum RS_INVAL_IDLE_TIMEOUT
is the 4th enum entry
AFAICT that is not useful. The purpose of the static assert is (like
your comment says) to "Ensure that the lookup table is up-to-date with
the enums". Imagine if I added another (5th cause) enum called
RS_INVAL_BANANA but accidentally overlooked updating the lookup table.
The code above isn't going to detect that -- RS_INVAL_MAX_CAUSES derived
from the lookup table is still 4 (instead of 5), but RS_INVAL_IDLE_TIMEOUT
is still the 4th cause enum, so the assert is happy. Hence my original
suggestion to define
RS_INVAL_MAX_CAUSES adjacent to the enum in slot.h.
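For illustration, a minimal sketch of the arrangement being suggested
(essentially what the earlier PS_topup_for_v740001.txt top-up does):

/* slot.h: keep the count next to the enum, so adding a cause forces an update */
#define RS_INVAL_MAX_CAUSES 4

/* slot.c: beside the lookup table */
StaticAssertDecl(lengthof(InvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
                 "array length mismatch");

With this arrangement, adding a hypothetical RS_INVAL_BANANA and bumping
RS_INVAL_MAX_CAUSES to 5 without touching the lookup table trips the
assert at compile time, which is exactly the protection the v75 form
gives up.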
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Feb 11, 2025 at 9:39 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Tue, Feb 11, 2025 at 03:22:49PM +0100, Álvaro Herrera wrote:
I find this proposed patch a bit strange and I feel it needs more
explanation.
When this thread started, Bharath justified his patches saying that a
slot that's inactive for a very long time could be problematic because
of XID wraparound. Fine, that sounds a reasonable feature. If you
wanted to invalidate slots whose xmins were too old, I would support
that. He submitted that as his 0004 patch then.
However, he also chose to submit 0003 with invalidation based on a
timeout. This is far less convincing a feature to me. The
justification for the time out seems to be that ... it's difficult to
have a one-size-fits-all value because size of disks vary. (???)
Or something like that. Really? I mean -- yes, this will prevent
problems in toy databases when run in developer's laptops. It will not
prevent any problems in production databases. Do we really want a
setting that is only useful for toy situations rather than production?
...
I'm baffled.
I agree, and I am also baffled because I think this discussion has happened
at least once already on this thread.
Yes, we previously discussed this topic and Robert seems to prefer a
time-based parameter for invalidating the slot (1)(2) as it is easier
to reason in terms of time. The other points discussed previously were
that there are tools that create a lot of slots and sometimes forget
to clean up slots. Bharath has seen this in production and we now have
the tool pg_createsubscriber that creates a slot-per-database, so if
for some reason, such slots are not cleaned on the tool's exit, such a
parameter could save the cluster. See (3)(4).
Also, we previously didn't have a good experience with XID-based
threshold parameters like vacuum_defer_cleanup_age as mentioned by
Robert (1). AFAICU from the previous discussion we need a time-based
parameter and we didn't rule out xid_age based parameter as another
parameter.
(1) - /messages/by-id/CA+TgmoZTbaaEjSZUG1FL0mzxAdN3qmXksO3O9_PZhEuXTkVnRQ@mail.gmail.com
(2) - /messages/by-id/CA+TgmoaRECcnyqxAxUhP5dk2S4HX=pGh-p-PkA3uc+jG_9hiMw@mail.gmail.com
(3) - /messages/by-id/CALj2ACVFV=yUa3DXXfJLOtJxUM8qzC_mEECMJ2iekDGPeQLkTw@mail.gmail.com
(4) - /messages/by-id/CAA4eK1L3awyzWMuymLJUm8SoFEQe=Da9KUwCcAfC31RNJ1xdJA@mail.gmail.com
--
With Regards,
Amit Kapila.
On Wednesday, February 12, 2025 11:56 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Feb 11, 2025 at 9:39 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:
On Tue, Feb 11, 2025 at 03:22:49PM +0100, Álvaro Herrera wrote:
I find this proposed patch a bit strange and I feel it needs more
explanation.
When this thread started, Bharath justified his patches saying that
a slot that's inactive for a very long time could be problematic
because of XID wraparound. Fine, that sounds a reasonable feature.
If you wanted to invalidate slots whose xmins were too old, I would
support that. He submitted that as his 0004 patch then.
However, he also chose to submit 0003 with invalidation based on a
timeout. This is far less convincing a feature to me. The
justification for the time out seems to be that ... it's difficult
to have a one-size-fits-all value because size of disks vary. (???)
Or something like that. Really? I mean -- yes, this will prevent
problems in toy databases when run in developer's laptops. It will
not prevent any problems in production databases. Do we really want
a setting that is only useful for toy situations rather than production?
...
I'm baffled.
I agree, and I am also baffled because I think this discussion has
happened at least once already on this thread.
Yes, we previously discussed this topic and Robert seems to prefer a
time-based parameter for invalidating the slot (1)(2) as it is easier to reason in
terms of time. The other points discussed previously were that there are tools
that create a lot of slots and sometimes forget to clean up slots. Bharath has
seen this in production and we now have the tool pg_createsubscriber that
creates a slot-per-database, so if for some reason, such slots are not cleaned
on the tool's exit, such a parameter could save the cluster. See (3)(4).
Also, we previously didn't have a good experience with XID-based threshold
parameters like vacuum_defer_cleanup_age as mentioned by Robert (1).
AFAICU from the previous discussion we need a time-based parameter and we
didn't rule out xid_age based parameter as another parameter.
Yeah, I think the primary purpose of this time-based option is to invalidate dormant
replication slots that have been inactive for a long period, in which case the
slots are no longer useful.
Such slots can remain if a subscriber is down due to a system error or
inaccessible because of network issues. If this situation persists, it might be
more practical to recreate the subscriber rather than attempt to recover the
node and wait for it to catch up, which could be time-consuming.
Parameters like max_slot_wal_keep_size and max_slot_xid_age do not
differentiate between active and inactive replication slots. Some customers I
met are hesitant about using these settings, as they can sometimes invalidate
a slot unnecessarily and break the replication.
(1) - /messages/by-id/CA+TgmoZTbaaEjSZUG1FL0mzxAdN3qmXksO3O9_PZhEuXTkVnRQ@mail.gmail.com
(2) - /messages/by-id/CA+TgmoaRECcnyqxAxUhP5dk2S4HX=pGh-p-PkA3uc+jG_9hiMw@mail.gmail.com
(3) - /messages/by-id/CALj2ACVFV=yUa3DXXfJLOtJxUM8qzC_mEECMJ2iekDGPeQLkTw@mail.gmail.com
(4) - /messages/by-id/CAA4eK1L3awyzWMuymLJUm8SoFEQe=Da9KUwCcAfC31RNJ1xdJA@mail.gmail.com
Best Regards,
Hou zj
Please find the updated v78 patches after a few off-list review rounds.
Here is a summary of changes in v78:
patch-001:
- Fixed bugs reported by Hou-san and Peter in [1] and [2].
- Fixed a race condition reported by Hou-san off-list, which could
lead to an assert failure.
This failure happens when the checkpointer determines the invalidation
cause to be idle_timeout on the first attempt, but then finds another
process's pid active for the slot and retries after terminating that
process. By then, inactive_since may have been updated, so it
determines the invalidation_cause as RS_INVAL_NONE and the assert
below fails:
```
Assert(!(invalidation_cause_prev != RS_INVAL_NONE && terminated &&
invalidation_cause_prev != invalidation_cause));
```
- Moved the slot's idle_time calculation to the caller of
ReportSlotInvalidation().
- Improved the patch commit message for better clarity.
patch-002:
- Fixed a bug reported by Kuroda-san - "check_extension() must be done
before the CREATE EXTENSION".
- Addressed a few other comments by Peter and Kuroda-san to optimize
code and improve comments.
[1]: /messages/by-id/CABdArM7eeejXEgd6t4wtBiK=aWc++gt1__WwAWm-Y_5xMVskWg@mail.gmail.com
[2]: /messages/by-id/CAHut+PtnWyOMvxb6mZHWFxqD-NdHuYL8Zp=-QasAQ3VvxauiMA@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v78-0001-Introduce-inactive_timeout-based-replication-slo.patchapplication/octet-stream; name=v78-0001-Introduce-inactive_timeout-based-replication-slo.patchDownload
From f8b08428ba2851b05aaffcdb15f199a4f44b1700 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 3 Feb 2025 15:20:40 +0530
Subject: [PATCH v78 1/2] Introduce inactive_timeout based replication slot
invalidation
Tools that create replication slots (e.g., for migrations or upgrades) may
fail to remove them if an error occurs, leaving behind unused slots that
take up space and resources. Manually cleaning them up can be tedious and
error-prone, and without intervention, these lingering slots can cause
unnecessary WAL retention and system bloat.
Till now, postgres has the ability to invalidate inactive replication slots
based on the amount of WAL (set via max_slot_wal_keep_size GUC) that will
be needed for the slots in case they become active. However, setting an
optimal value for this is tricky since the amount of WAL a database
generates, and the allocated storage per instance will vary greatly in
production. A high value may allow orphaned slots to persist longer than
necessary, leading to system bloat by retaining WAL unnecessarily.
This commit introduces idle_replication_slot_timeout, a simpler and more
intuitive way to manage inactive slots. Instead of relying on WAL size,
users can set a time limit (e.g., 1 or 2 or n days), after which slots that
remain idle for longer than this amount of time are automatically
invalidated during checkpoints.
Note that the idle timeout invalidation mechanism is not applicable
for slots that do not reserve WAL or for slots on the standby server
that are being synced from the primary server (i.e., standby slots
having 'synced' field 'true'). Synced slots are always considered to be
inactive because they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 42 +++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 7 +
src/backend/access/transam/xlog.c | 4 +-
src/backend/replication/slot.c | 312 ++++++++++++++----
src/backend/replication/slotfuncs.c | 2 +-
src/backend/utils/adt/timestamp.c | 18 +
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 +-
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
src/tools/pgindent/typedefs.list | 1 +
15 files changed, 364 insertions(+), 78 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5e4f201e09..507e7e2eec 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4429,6 +4429,48 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero (the default) disables the idle timeout
+ invalidation mechanism. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the
+ slot's <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not applicable
+ for slots that do not reserve WAL or for slots on the standby server
+ that are being synced from the primary server (i.e., standby slots
+ having <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>). Synced slots are always considered to
+ be inactive because they don't perform logical decoding to produce
+ changes. Slots that appear idle due to a disrupted connection between
+ the publisher and subscriber are also excluded, as they are managed by
+ <link linkend="guc-wal-sender-timeout"><varname>wal_sender_timeout</varname></link>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
<term><varname>wal_sender_timeout</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 613abcd28b..3d18e507bb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2390,6 +2390,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be81c2b51d..f58b9406e4 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2621,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a50fd99d9e..9391705664 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7337,7 +7337,7 @@ CreateCheckPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
KeepLogSeg(recptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
@@ -7792,7 +7792,7 @@ CreateRestartPoint(int flags)
replayPtr = GetXLogReplayRecPtr(&replayTLI);
endptr = (receivePtr < replayPtr) ? replayPtr : receivePtr;
KeepLogSeg(endptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fe5acd8b1f..4ff66f0dd5 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -102,16 +102,24 @@ typedef struct
/*
* Lookup table for slot invalidation causes.
*/
-const char *const SlotInvalidationCauses[] = {
- [RS_INVAL_NONE] = "none",
- [RS_INVAL_WAL_REMOVED] = "wal_removed",
- [RS_INVAL_HORIZON] = "rows_removed",
- [RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+typedef struct SlotInvalidationCauseMap
+{
+ ReplicationSlotInvalidationCause cause;
+ const char *cause_name;
+} SlotInvalidationCauseMap;
+
+static const SlotInvalidationCauseMap SlotInvalidationCauses[] = {
+ {RS_INVAL_NONE, "none"},
+ {RS_INVAL_WAL_REMOVED, "wal_removed"},
+ {RS_INVAL_HORIZON, "rows_removed"},
+ {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"},
+ {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"},
};
-/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
-
+/*
+ * Ensure that the lookup table is up-to-date with the enums defined in
+ * ReplicationSlotInvalidationCause.
+ */
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +149,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = 0;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -575,7 +589,7 @@ retry:
errmsg("can no longer access replication slot \"%s\"",
NameStr(s->data.name)),
errdetail("This replication slot has been invalidated due to \"%s\".",
- SlotInvalidationCauses[s->data.invalidated]));
+ GetSlotInvalidationCauseName(s->data.invalidated)));
}
/*
@@ -596,10 +610,14 @@ retry:
if (s->active_pid == 0)
s->active_pid = MyProcPid;
active_pid = s->active_pid;
+ ReplicationSlotSetInactiveSince(s, 0, false);
SpinLockRelease(&s->mutex);
}
else
+ {
active_pid = MyProcPid;
+ ReplicationSlotSetInactiveSince(s, 0, true);
+ }
LWLockRelease(ReplicationSlotControlLock);
/*
@@ -640,11 +658,6 @@ retry:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
- /*
- * Reset the time since the slot has become inactive as the slot is active
- * now.
- */
- ReplicationSlotSetInactiveSince(s, 0, true);
if (am_walsender)
{
@@ -1512,12 +1525,16 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ long slot_idle_seconds)
{
+ int minutes;
+ int secs;
StringInfoData err_detail;
- bool hint = false;
+ StringInfoData err_hint;
initStringInfo(&err_detail);
+ initStringInfo(&err_hint);
switch (cause)
{
@@ -1525,13 +1542,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
{
unsigned long long ex = oldestLSN - restart_lsn;
- hint = true;
appendStringInfo(&err_detail,
ngettext("The slot's restart_lsn %X/%X exceeds the limit by %llu byte.",
"The slot's restart_lsn %X/%X exceeds the limit by %llu bytes.",
ex),
LSN_FORMAT_ARGS(restart_lsn),
ex);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "max_slot_wal_keep_size");
break;
}
case RS_INVAL_HORIZON:
@@ -1542,6 +1561,21 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+
+ minutes = slot_idle_seconds / SECS_PER_MINUTE;
+ secs = slot_idle_seconds % SECS_PER_MINUTE;
+
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time of %dmin %02ds exceeds the configured \"%s\" duration of %dmin."),
+ minutes, secs, "idle_replication_slot_timeout",
+ idle_replication_slot_timeout_mins);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "idle_replication_slot_timeout");
+ break;
+
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1553,9 +1587,100 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
errmsg("invalidating obsolete replication slot \"%s\"",
NameStr(slotname)),
errdetail_internal("%s", err_detail.data),
- hint ? errhint("You might need to increase \"%s\".", "max_slot_wal_keep_size") : 0);
+ err_hint.len ? errhint("%s", err_hint.data) : 0);
pfree(err_detail.data);
+ pfree(err_hint.data);
+}
+
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has reserved WAL
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server is in
+ * recovery. This is because synced slots are always considered to be
+ * inactive because they don't perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins != 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
+/*
+ * DetermineSlotInvalidationCause - Determine the cause for which a slot
+ * becomes invalid among the given possible causes.
+ *
+ * This function sequentially checks all possible invalidation causes and
+ * returns the first one for which the slot is eligible for invalidation.
+ */
+static ReplicationSlotInvalidationCause
+DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s,
+ XLogRecPtr oldestLSN, Oid dboid,
+ TransactionId snapshotConflictHorizon,
+ TransactionId initial_effective_xmin,
+ TransactionId initial_catalog_effective_xmin,
+ XLogRecPtr initial_restart_lsn,
+ TimestampTz *inactive_since, TimestampTz now)
+{
+ Assert(possible_causes != RS_INVAL_NONE);
+
+ if (possible_causes & RS_INVAL_WAL_REMOVED)
+ {
+ if (initial_restart_lsn != InvalidXLogRecPtr &&
+ initial_restart_lsn < oldestLSN)
+ return RS_INVAL_WAL_REMOVED;
+ }
+
+ if (possible_causes & RS_INVAL_HORIZON)
+ {
+ if (SlotIsLogical(s) &&
+ /* invalid DB oid signals a shared relation */
+ (dboid == InvalidOid || dboid == s->data.database))
+ {
+
+ if (TransactionIdIsValid(initial_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_effective_xmin,
+ snapshotConflictHorizon))
+ return RS_INVAL_HORIZON;
+ else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
+ snapshotConflictHorizon))
+ return RS_INVAL_HORIZON;
+ }
+ }
+
+ if (possible_causes & RS_INVAL_WAL_LEVEL)
+ {
+ if (SlotIsLogical(s))
+ return RS_INVAL_WAL_LEVEL;
+ }
+
+ if (possible_causes & RS_INVAL_IDLE_TIMEOUT)
+ {
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ *inactive_since = s->inactive_since;
+ return RS_INVAL_IDLE_TIMEOUT;
+ }
+ }
+
+ return RS_INVAL_NONE;
}
/*
@@ -1572,7 +1697,7 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
* for syscalls, so caller must restart if we return true.
*/
static bool
-InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
+InvalidatePossiblyObsoleteSlot(uint32 possible_causes,
ReplicationSlot *s,
XLogRecPtr oldestLSN,
Oid dboid, TransactionId snapshotConflictHorizon,
@@ -1585,6 +1710,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1592,6 +1718,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
+ long slot_idle_secs = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1602,6 +1730,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (possible_causes & RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * Assign the current time here to avoid system call overhead
+ * while holding the spinlock in subsequent code.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1620,7 +1757,11 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* The slot's mutex will be released soon, and it is possible that
* those values change since the process holding the slot has been
* terminated (if any), so record them here to ensure that we
- * would report the correct invalidation cause.
+ * would report the correct invalidation cause. No need to record
+ * inactive_since for the idle_timeout case here, as an already
+ * inactive slot's inactive_since can only be reset under a mutex
+ * in ReplicationSlotAcquire(), and an inactive slot can be
+ * invalidated immediately without releasing the spinlock.
*/
if (!terminated)
{
@@ -1629,35 +1770,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
initial_catalog_effective_xmin = s->effective_catalog_xmin;
}
- switch (cause)
- {
- case RS_INVAL_WAL_REMOVED:
- if (initial_restart_lsn != InvalidXLogRecPtr &&
- initial_restart_lsn < oldestLSN)
- invalidation_cause = cause;
- break;
- case RS_INVAL_HORIZON:
- if (!SlotIsLogical(s))
- break;
- /* invalid DB oid signals a shared relation */
- if (dboid != InvalidOid && dboid != s->data.database)
- break;
- if (TransactionIdIsValid(initial_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = cause;
- else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = cause;
- break;
- case RS_INVAL_WAL_LEVEL:
- if (SlotIsLogical(s))
- invalidation_cause = cause;
- break;
- case RS_INVAL_NONE:
- pg_unreachable();
- }
+ invalidation_cause = DetermineSlotInvalidationCause(possible_causes,
+ s, oldestLSN,
+ dboid,
+ snapshotConflictHorizon,
+ initial_effective_xmin,
+ initial_catalog_effective_xmin,
+ initial_restart_lsn,
+ &inactive_since,
+ now);
}
/*
@@ -1705,12 +1826,25 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
+ * max_slot_wal_keep_size is set to -1 and
+ * idle_replication_slot_timeout is set to 0 during the binary
+ * upgrade. See check_old_cluster_for_valid_slots() where we ensure
+ * that no invalidated before the upgrade.
*/
Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
+ /*
+ * Calculate the idle time duration of the slot if slot is marked
+ * invalidated with RS_INVAL_IDLE_TIMEOUT.
+ */
+ if (invalidation_cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ int slot_idle_usecs;
+
+ TimestampDifference(inactive_since, now, &slot_idle_secs,
+ &slot_idle_usecs);
+ }
+
if (active_pid != 0)
{
/*
@@ -1739,7 +1873,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ slot_idle_secs);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1785,7 +1920,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ slot_idle_secs);
/* done with this slot for now */
break;
@@ -1800,28 +1936,34 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* Invalidate slots that require resources about to be removed.
*
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
+ *
+ * Note: This function attempts to invalidate the slot for multiple possible
+ * causes in a single pass, minimizing redundant iterations. The "cause"
+ * parameter can be a MASK representing one or more of the defined causes.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
bool
-InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
+InvalidateObsoleteReplicationSlots(uint32 possible_causes,
XLogSegNo oldestSegno, Oid dboid,
TransactionId snapshotConflictHorizon)
{
XLogRecPtr oldestLSN;
bool invalidated = false;
- Assert(cause != RS_INVAL_HORIZON || TransactionIdIsValid(snapshotConflictHorizon));
- Assert(cause != RS_INVAL_WAL_REMOVED || oldestSegno > 0);
- Assert(cause != RS_INVAL_NONE);
+ Assert(!(possible_causes & RS_INVAL_HORIZON) || TransactionIdIsValid(snapshotConflictHorizon));
+ Assert(!(possible_causes & RS_INVAL_WAL_REMOVED) || oldestSegno > 0);
+ Assert(possible_causes != RS_INVAL_NONE);
if (max_replication_slots == 0)
return invalidated;
@@ -1837,7 +1979,7 @@ restart:
if (!s->in_use)
continue;
- if (InvalidatePossiblyObsoleteSlot(cause, s, oldestLSN, dboid,
+ if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid,
snapshotConflictHorizon,
&invalidated))
{
@@ -2428,18 +2570,17 @@ RestoreSlotFromDisk(const char *name)
ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *invalidation_reason)
{
- ReplicationSlotInvalidationCause cause;
ReplicationSlotInvalidationCause result = RS_INVAL_NONE;
bool found PG_USED_FOR_ASSERTS_ONLY = false;
Assert(invalidation_reason);
- for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
+ for (int i = 0; i <= RS_INVAL_MAX_CAUSES; i++)
{
- if (strcmp(SlotInvalidationCauses[cause], invalidation_reason) == 0)
+ if (strcmp(SlotInvalidationCauses[i].cause_name, invalidation_reason) == 0)
{
found = true;
- result = cause;
+ result = SlotInvalidationCauses[i].cause;
break;
}
}
@@ -2448,6 +2589,24 @@ GetSlotInvalidationCause(const char *invalidation_reason)
return result;
}
+/*
+ * Maps an ReplicationSlotInvalidationCause to the invalidation
+ * reason for a replication slot.
+ */
+const char *
+GetSlotInvalidationCauseName(ReplicationSlotInvalidationCause cause)
+{
+ /* Search lookup table for the name of this cause */
+ for (int i = 0; i <= RS_INVAL_MAX_CAUSES; i++)
+ {
+ if (SlotInvalidationCauses[i].cause == cause)
+ return SlotInvalidationCauses[i].cause_name;
+ }
+
+ Assert(false);
+ return "none"; /* to keep compiler quiet */
+}
+
/*
* A helper function to validate slots specified in GUC synchronized_standby_slots.
*
@@ -2802,3 +2961,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 8be4b8c65b..f652ec8a73 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -431,7 +431,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
if (cause == RS_INVAL_NONE)
nulls[i++] = true;
else
- values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ values[i++] = CStringGetTextDatum(GetSlotInvalidationCauseName(cause));
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 226af43fe2..ded7bd844c 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3058,6 +3058,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ 0, 0, INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index d472987ed4..415f253096 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -329,6 +329,7 @@
# (change requires restart)
#wal_keep_size = 0 # in megabytes; 0 disables
#max_slot_wal_keep_size = -1 # in megabytes; -1 disables
+#idle_replication_slot_timeout = 0 # in minutes; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index 2d881d54f5..4968474d8c 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1439,6 +1439,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by checkpointer process during upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 000c36d30d..f5a24ccfbf 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -44,21 +44,25 @@ typedef enum ReplicationSlotPersistency
* Slots can be invalidated, e.g. due to max_slot_wal_keep_size. If so, the
* 'invalidated' field is set to a value other than _NONE.
*
- * When adding a new invalidation cause here, remember to update
- * SlotInvalidationCauses and RS_INVAL_MAX_CAUSES.
+ * When adding a new invalidation cause here, the value must be powers of 2
+ * (e.g., 1, 2, 4...) for proper bitwise operations. Also, remember to update
+ * RS_INVAL_MAX_CAUSES below, and SlotInvalidationCauses in slot.c.
*/
typedef enum ReplicationSlotInvalidationCause
{
- RS_INVAL_NONE,
+ RS_INVAL_NONE = 0,
/* required WAL has been removed */
- RS_INVAL_WAL_REMOVED,
+ RS_INVAL_WAL_REMOVED = (1 << 0),
/* required rows have been removed */
- RS_INVAL_HORIZON,
+ RS_INVAL_HORIZON = (1 << 1),
/* wal_level insufficient for slot */
- RS_INVAL_WAL_LEVEL,
+ RS_INVAL_WAL_LEVEL = (1 << 2),
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT = (1 << 3),
} ReplicationSlotInvalidationCause;
-extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
+/* Maximum number of invalidation causes */
+#define RS_INVAL_MAX_CAUSES 4
/*
* On-Disk data of a replication slot, preserved across restarts.
@@ -254,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -286,7 +291,7 @@ extern void ReplicationSlotsComputeRequiredLSN(void);
extern XLogRecPtr ReplicationSlotsComputeLogicalRestartLSN(void);
extern bool ReplicationSlotsCountDBSlots(Oid dboid, int *nslots, int *nactive);
extern void ReplicationSlotsDropDBSlots(Oid dboid);
-extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
+extern bool InvalidateObsoleteReplicationSlots(uint32 possible_causes,
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
@@ -303,6 +308,7 @@ extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *invalidation_reason);
+extern const char *GetSlotInvalidationCauseName(ReplicationSlotInvalidationCause cause);
extern bool SlotExistsInSyncStandbySlots(const char *slot_name);
extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..9963bddc0e 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -107,6 +107,9 @@ extern long TimestampDifferenceMilliseconds(TimestampTz start_time,
extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b6c170ac24..55fcfe482a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2688,6 +2688,7 @@ SkipPages
SlabBlock
SlabContext
SlabSlot
+SlotInvalidationCauseMap
SlotNumber
SlotSyncCtxStruct
SlruCtl
--
2.34.1
v78-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patchapplication/octet-stream; name=v78-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patchDownload
From ec13c28b5baa219a91cba17a6af17f1964aae698 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Tue, 11 Feb 2025 17:26:15 +0530
Subject: [PATCH v78 2/2] Add TAP test for slot invalidation based on inactive
timeout.
This test uses injection points to bypass the time overhead caused by the
idle_replication_slot_timeout GUC, which has a minimum value of one minute.
---
src/backend/replication/slot.c | 29 +++--
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 107 ++++++++++++++++++
3 files changed, 128 insertions(+), 9 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 4ff66f0dd5..49e23a48a6 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/builtins.h"
+#include "utils/injection_point.h"
#include "utils/guc_hooks.h"
#include "utils/varlena.h"
@@ -1667,16 +1668,26 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s,
{
Assert(now > 0);
- /*
- * Check if the slot needs to be invalidated due to
- * idle_replication_slot_timeout GUC.
- */
- if (CanInvalidateIdleSlot(s) &&
- TimestampDifferenceExceedsSeconds(s->inactive_since, now,
- idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ if (CanInvalidateIdleSlot(s))
{
- *inactive_since = s->inactive_since;
- return RS_INVAL_IDLE_TIMEOUT;
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ *
+ * To test idle timeout slot invalidation, if the
+ * "slot-timeout-inval" injection point is attached, immediately
+ * invalidate the slot.
+ */
+ if (
+#ifdef USE_INJECTION_POINTS
+ IS_INJECTION_POINT_ATTACHED("slot-timeout-inval") ||
+#endif
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ *inactive_since = s->inactive_since;
+ return RS_INVAL_IDLE_TIMEOUT;
+ }
}
}
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..7d3fe75426
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,107 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation due to idle_timeout
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# This test depends on injection point that forces slot invalidation
+# due to idle_timeout.
+# https://www.postgresql.org/docs/current/xfunc-c.html#XFUNC-ADDIN-INJECTION-POINTS
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $node_name = $node->name;
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(
+ qr/invalidating obsolete replication slot \"$slot_name\"/, $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot_name to be set on node $node_name";
+}
+
+# ========================================================================
+# Testcase start
+#
+# Test invalidation of physical replication slot and logical replication slot
+# due to idle timeout.
+
+# Initialize the node
+my $node = PostgreSQL::Test::Cluster->new('node');
+$node->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$node->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+idle_replication_slot_timeout = 1min
+});
+$node->start;
+
+# Check if the 'injection_points' extension is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create both physical and logical replication slots
+$node->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'physical_slot', immediately_reserve := true);
+ SELECT pg_create_logical_replication_slot('logical_slot', 'test_decoding');
+]);
+
+my $log_offset = -s $node->logfile;
+
+# Register an injection point on the node to forcibly cause a slot
+# invalidation due to idle_timeout
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+$node->safe_psql('postgres',
+ "SELECT injection_points_attach('slot-timeout-inval', 'error');");
+
+# Idle timeout slot invalidation occurs during a checkpoint, so run a
+# checkpoint to invalidate the slots.
+$node->safe_psql('postgres', "CHECKPOINT");
+
+# Wait for slots to become inactive. Note that since nobody has acquired the
+# slot yet, then if it has been invalidated that can only be due to the idle
+# timeout mechanism.
+wait_for_slot_invalidation($node, 'physical_slot', $log_offset);
+wait_for_slot_invalidation($node, 'logical_slot', $log_offset);
+
+# Check that the invalidated slot cannot be acquired
+my $node_name = $node->name;
+my ($result, $stdout, $stderr);
+($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('logical_slot', '0/1');
+]);
+ok( $stderr =~ /can no longer access replication slot "logical_slot"/,
+ "detected error upon trying to acquire invalidated slot on node")
+ or die
+ "could not detect error upon trying to acquire invalidated slot \"logical_slot\" on node";
+
+# Testcase end
+# =============================================================================
+
+done_testing();
--
2.34.1
Some review comments for v78-0001.
======
src/backend/replication/slot.c
ReportSlotInvalidation:
1.
+ int minutes;
+ int secs;
The variables 'minutes' and 'secs' are only used by the
RS_INVAL_IDLE_TIMEOUT case, so I think it would be better to make a new
code block for that case where you can declare and initialise them in
one go at that scope.
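For illustration, the suggestion amounts to something like this (a rough
sketch based on the v78 code; the exact message wording is whatever the
patch settles on):
```
        case RS_INVAL_IDLE_TIMEOUT:
            {
                /* declared and initialised only at this case's scope */
                int         minutes = slot_idle_seconds / SECS_PER_MINUTE;
                int         secs = slot_idle_seconds % SECS_PER_MINUTE;

                /* translator: %s is a GUC variable name */
                appendStringInfo(&err_detail, _("The slot's idle time of %dmin %02ds exceeds the configured \"%s\" duration of %dmin."),
                                 minutes, secs, "idle_replication_slot_timeout",
                                 idle_replication_slot_timeout_mins);
                /* translator: %s is a GUC variable name */
                appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
                                 "idle_replication_slot_timeout");
                break;
            }
```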
~~~
DetermineSlotInvalidationCause:
2.
+ if (SlotIsLogical(s) &&
+ /* invalid DB oid signals a shared relation */
+ (dboid == InvalidOid || dboid == s->data.database))
+ {
The comment placement was fine in the master code because the checks
were separate statements there, but now that this patch combines the
conditions into one, the comment seems strangely placed.
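To make that concrete, one option is simply to move the comment above the
combined condition, e.g. (a sketch of that part of
DetermineSlotInvalidationCause, not a full hunk):
```
    if (possible_causes & RS_INVAL_HORIZON)
    {
        /* invalid DB oid signals a shared relation */
        if (SlotIsLogical(s) &&
            (dboid == InvalidOid || dboid == s->data.database))
        {
            if (TransactionIdIsValid(initial_effective_xmin) &&
                TransactionIdPrecedesOrEquals(initial_effective_xmin,
                                              snapshotConflictHorizon))
                return RS_INVAL_HORIZON;
            else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
                     TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
                                                   snapshotConflictHorizon))
                return RS_INVAL_HORIZON;
        }
    }
```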
~~~
GetSlotInvalidationCause:
3.
I understand your argument "let's not change anything unless
absolutely necessary for our patch", but in this case, since half the
function is changing anyway, it seems a missed opportunity not to
simplify the rest of the code "in passing" and make it consistent with
the newly added partner function GetSlotInvalidationCauseName. My
question is "if not now, then when?", because a future patch that does
only this would likely be rejected as being too trivial, so by not
changing it now these functions are probably doomed to stay
inconsistent forever. Anyway, it is just my opinion -- leave it as-is
if you wish.
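In case it helps, the sort of consistency being suggested might look like
this (a hypothetical rewrite mirroring GetSlotInvalidationCauseName, not
part of any posted patch):
```
ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *invalidation_reason)
{
    Assert(invalidation_reason);

    /* Search the lookup table for the cause matching this reason name */
    for (int i = 0; i <= RS_INVAL_MAX_CAUSES; i++)
    {
        if (strcmp(SlotInvalidationCauses[i].cause_name, invalidation_reason) == 0)
            return SlotInvalidationCauses[i].cause;
    }

    Assert(false);
    return RS_INVAL_NONE;       /* to keep compiler quiet */
}
```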
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Feb 12, 2025 at 1:16 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
On Wednesday, February 12, 2025 11:56 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Feb 11, 2025 at 9:39 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:
On Tue, Feb 11, 2025 at 03:22:49PM +0100, Álvaro Herrera wrote:
I find this proposed patch a bit strange and I feel it needs more
explanation.
When this thread started, Bharath justified his patches saying that
a slot that's inactive for a very long time could be problematic
because of XID wraparound. Fine, that sounds a reasonable feature.
If you wanted to invalidate slots whose xmins were too old, I would
support that. He submitted that as his 0004 patch then.
However, he also chose to submit 0003 with invalidation based on a
timeout. This is far less convincing a feature to me. The
justification for the time out seems to be that ... it's difficult
to have a one-size-fits-all value because size of disks vary. (???)
Or something like that. Really? I mean -- yes, this will prevent
problems in toy databases when run in developer's laptops. It will
not prevent any problems in production databases. Do we really want
a setting that is only useful for toy situations rather than production?...
I'm baffled.
I agree, and I am also baffled because I think this discussion has
happened at least once already on this thread.
Yes, we previously discussed this topic and Robert seems to prefer a
time-based parameter for invalidating the slot (1)(2) as it is easier to reason in
terms of time. The other points discussed previously were that there are tools
that create a lot of slots and sometimes forget to clean up slots. Bharath has
seen this in production and we now have the tool pg_createsubscriber that
creates a slot-per-database, so if for some reason, such slots are not cleaned
on the tool's exit, such a parameter could save the cluster. See (3)(4).
Also, we previously didn't have a good experience with XID-based threshold
parameters like vacuum_defer_cleanup_age as mentioned by Robert (1).
AFAICU from the previous discussion we need a time-based parameter and we
didn't rule out xid_age based parameter as another parameter.
Yeah, I think the primary purpose of this time-based option is to invalidate dormant
replication slots that have been inactive for a long period, in which case the
slots are no longer useful.
Such slots can remain if a subscriber is down due to a system error or
inaccessible because of network issues. If this situation persists, it might be
more practical to recreate the subscriber rather than attempt to recover the
node and wait for it to catch up, which could be time-consuming.
Parameters like max_slot_wal_keep_size and max_slot_xid_age do not
differentiate between active and inactive replication slots. Some customers I
met are hesitant about using these settings, as they can sometimes invalidate
a slot unnecessarily and break the replication.
Alvaro, Nathan, do let us know if you would like to discuss more on
the use case for this new GUC idle_replication_slot_timeout?
Otherwise, we can proceed with this patch.
--
With Regards,
Amit Kapila.
On Fri, Feb 14, 2025 at 5:30 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
Here is a summary of changes in v78:
A few minor comments:
1.
Slots that appear idle due to a disrupted connection between
+ the publisher and subscriber are also excluded, as they are managed by
+ <link linkend="guc-wal-sender-timeout"><varname>wal_sender_timeout</varname></link>.
...
How do we exclude the above kind of slots? I think it is trying to
cover the case where the walsender has not exited even after the
connection between the publisher and subscriber is broken. The point
is quite confusing and adds little value. So, we can remove it.
2.
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
...
I find the existing comment more suitable for this function and easier to follow.
Apart from the above, I have changed a few other comments and minor
cosmetic cleanup.
--
With Regards,
Amit Kapila.
Attachments:
v78_amit.1.patch.txttext/plain; charset=US-ASCII; name=v78_amit.1.patch.txtDownload
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 4ff66f0dd5..476b8e5355 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -606,6 +606,11 @@ retry:
if (!nowait)
ConditionVariablePrepareToSleep(&s->active_cv);
+ /*
+ * It is important to reset the inactive_since under spinlock here to
+ * avoid race conditions with slot invalidation. See comments related
+ * to inactive_since in InvalidatePossiblyObsoleteSlot.
+ */
SpinLockAcquire(&s->mutex);
if (s->active_pid == 0)
s->active_pid = MyProcPid;
@@ -1641,11 +1646,10 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s,
if (possible_causes & RS_INVAL_HORIZON)
{
- if (SlotIsLogical(s) &&
/* invalid DB oid signals a shared relation */
+ if (SlotIsLogical(s) &&
(dboid == InvalidOid || dboid == s->data.database))
{
-
if (TransactionIdIsValid(initial_effective_xmin) &&
TransactionIdPrecedesOrEquals(initial_effective_xmin,
snapshotConflictHorizon))
@@ -1757,11 +1761,13 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes,
* The slot's mutex will be released soon, and it is possible that
* those values change since the process holding the slot has been
* terminated (if any), so record them here to ensure that we
- * would report the correct invalidation cause. No need to record
- * inactive_since for the idle_timeout case here, as an already
- * inactive slot's inactive_since can only be reset under a mutex
- * in ReplicationSlotAcquire(), and an inactive slot can be
- * invalidated immediately without releasing the spinlock.
+ * would report the correct invalidation cause.
+ *
+ * Unlike others slot's inactive_since can't be changed once it is
+ * acquired till it gets released or the process owning it gets
+ * terminated. The slot remains active till some process owns it.
+ * So, the inactive slot can only be invalidated immediately
+ * without being terminated.
*/
if (!terminated)
{
On Mon, Feb 17, 2025 at 11:29 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Feb 14, 2025 at 5:30 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
Here is a summary of changes in v78:
A few minor comments:
1.
Slots that appear idle due to a disrupted connection between
+ the publisher and subscriber are also excluded, as they are managed by
+ <link linkend="guc-wal-sender-timeout"><varname>wal_sender_timeout</varname></link>.
...
How do we exclude the above kind of slots? I think it is trying to
cover the case where the walsender has not exited even after the
connection between the publisher and subscriber is broken. The point
is quite confusing and adds little value. So, we can remove it.
2.
- * Returns true when any slot have got invalidated.
+ * Returns true if there are any invalidated slots.
...
I find the existing comment more suitable for this function and easier to follow.
Apart from the above, I have changed a few other comments and minor
cosmetic cleanup.
Here are the v79 patches with the above changes and the comments from [1]
incorporated.
[1]: /messages/by-id/CAHut+Putqw=79SPh+EJZoS+98cJJvRRBmp-v6zqSRwngHey_ow@mail.gmail.com
--
Thanks,
Nisha
Attachments:
v79-0001-Introduce-inactive_timeout-based-replication-slo.patchapplication/octet-stream; name=v79-0001-Introduce-inactive_timeout-based-replication-slo.patchDownload
From 714bbd0a6c3f303bbe9b5cecf394f67417d96c55 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Mon, 3 Feb 2025 15:20:40 +0530
Subject: [PATCH v79 1/2] Introduce inactive_timeout based replication slot
invalidation
Tools that create replication slots (e.g., for migrations or upgrades) may
fail to remove them if an error occurs, leaving behind unused slots that
take up space and resources. Manually cleaning them up can be tedious and
error-prone, and without intervention, these lingering slots can cause
unnecessary WAL retention and system bloat.
Till now, postgres has the ability to invalidate inactive replication slots
based on the amount of WAL (set via max_slot_wal_keep_size GUC) that will
be needed for the slots in case they become active. However, setting an
optimal value for this is tricky since the amount of WAL a database
generates, and the allocated storage per instance will vary greatly in
production. A high value may allow orphaned slots to persist longer than
necessary, leading to system bloat by retaining WAL unnecessarily.
This commit introduces idle_replication_slot_timeout, a simpler and more
intuitive way to manage inactive slots. Instead of relying on WAL size,
users can set a time limit (e.g., 1 or 2 or n days), after which slots that
remain idle for longer than this amount of time are automatically
invalidated during checkpoints.
Note that the idle timeout invalidation mechanism is not applicable
for slots that do not reserve WAL or for slots on the standby server
that are being synced from the primary server (i.e., standby slots
having 'synced' field 'true'). Synced slots are always considered to be
inactive because they don't perform logical decoding to produce changes.
---
doc/src/sgml/config.sgml | 40 +++
doc/src/sgml/logical-replication.sgml | 5 +
doc/src/sgml/system-views.sgml | 7 +
src/backend/access/transam/xlog.c | 4 +-
src/backend/replication/slot.c | 326 ++++++++++++++----
src/backend/replication/slotfuncs.c | 2 +-
src/backend/utils/adt/timestamp.c | 18 +
src/backend/utils/misc/guc_tables.c | 12 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/bin/pg_basebackup/pg_createsubscriber.c | 4 +
src/bin/pg_upgrade/server.c | 7 +
src/include/replication/slot.h | 22 +-
src/include/utils/guc_hooks.h | 2 +
src/include/utils/timestamp.h | 3 +
src/tools/pgindent/typedefs.list | 1 +
15 files changed, 368 insertions(+), 86 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 60829b79d8..7b4b426fa7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4429,6 +4429,46 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero (the default) disables the idle timeout
+ invalidation mechanism. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the
+ slot's <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not applicable
+ for slots that do not reserve WAL or for slots on the standby server
+ that are being synced from the primary server (i.e., standby slots
+ having <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>). Synced slots are always considered to
+ be inactive because they don't perform logical decoding to produce
+ changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
<term><varname>wal_sender_timeout</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 613abcd28b..3d18e507bb 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2390,6 +2390,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
plus some reserve for table synchronization.
</para>
+ <para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
<para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be81c2b51d..f58b9406e4 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2621,6 +2621,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 75d5554c77..5524276bc3 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7421,7 +7421,7 @@ CreateCheckPoint(int flags)
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
KeepLogSeg(recptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
@@ -7876,7 +7876,7 @@ CreateRestartPoint(int flags)
replayPtr = GetXLogReplayRecPtr(&replayTLI);
endptr = (receivePtr < replayPtr) ? replayPtr : receivePtr;
KeepLogSeg(endptr, &_logSegNo);
- if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED,
+ if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
_logSegNo, InvalidOid,
InvalidTransactionId))
{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index fe5acd8b1f..674998b4e9 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -102,16 +102,24 @@ typedef struct
/*
* Lookup table for slot invalidation causes.
*/
-const char *const SlotInvalidationCauses[] = {
- [RS_INVAL_NONE] = "none",
- [RS_INVAL_WAL_REMOVED] = "wal_removed",
- [RS_INVAL_HORIZON] = "rows_removed",
- [RS_INVAL_WAL_LEVEL] = "wal_level_insufficient",
+typedef struct SlotInvalidationCauseMap
+{
+ ReplicationSlotInvalidationCause cause;
+ const char *cause_name;
+} SlotInvalidationCauseMap;
+
+static const SlotInvalidationCauseMap SlotInvalidationCauses[] = {
+ {RS_INVAL_NONE, "none"},
+ {RS_INVAL_WAL_REMOVED, "wal_removed"},
+ {RS_INVAL_HORIZON, "rows_removed"},
+ {RS_INVAL_WAL_LEVEL, "wal_level_insufficient"},
+ {RS_INVAL_IDLE_TIMEOUT, "idle_timeout"},
};
-/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
-
+/*
+ * Ensure that the lookup table is up-to-date with the enums defined in
+ * ReplicationSlotInvalidationCause.
+ */
StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
"array length mismatch");
@@ -141,6 +149,12 @@ ReplicationSlot *MyReplicationSlot = NULL;
int max_replication_slots = 10; /* the maximum number of replication
* slots */
+/*
+ * Invalidate replication slots that have remained idle longer than this
+ * duration; '0' disables it.
+ */
+int idle_replication_slot_timeout_mins = 0;
+
/*
* This GUC lists streaming replication standby server slot names that
* logical WAL sender processes will wait for.
@@ -575,7 +589,7 @@ retry:
errmsg("can no longer access replication slot \"%s\"",
NameStr(s->data.name)),
errdetail("This replication slot has been invalidated due to \"%s\".",
- SlotInvalidationCauses[s->data.invalidated]));
+ GetSlotInvalidationCauseName(s->data.invalidated)));
}
/*
@@ -592,14 +606,23 @@ retry:
if (!nowait)
ConditionVariablePrepareToSleep(&s->active_cv);
+ /*
+ * It is important to reset the inactive_since under spinlock here to
+ * avoid race conditions with slot invalidation. See comments related
+ * to inactive_since in InvalidatePossiblyObsoleteSlot.
+ */
SpinLockAcquire(&s->mutex);
if (s->active_pid == 0)
s->active_pid = MyProcPid;
active_pid = s->active_pid;
+ ReplicationSlotSetInactiveSince(s, 0, false);
SpinLockRelease(&s->mutex);
}
else
+ {
active_pid = MyProcPid;
+ ReplicationSlotSetInactiveSince(s, 0, true);
+ }
LWLockRelease(ReplicationSlotControlLock);
/*
@@ -640,11 +663,6 @@ retry:
if (SlotIsLogical(s))
pgstat_acquire_replslot(s);
- /*
- * Reset the time since the slot has become inactive as the slot is active
- * now.
- */
- ReplicationSlotSetInactiveSince(s, 0, true);
if (am_walsender)
{
@@ -1512,12 +1530,14 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
NameData slotname,
XLogRecPtr restart_lsn,
XLogRecPtr oldestLSN,
- TransactionId snapshotConflictHorizon)
+ TransactionId snapshotConflictHorizon,
+ long slot_idle_seconds)
{
StringInfoData err_detail;
- bool hint = false;
+ StringInfoData err_hint;
initStringInfo(&err_detail);
+ initStringInfo(&err_hint);
switch (cause)
{
@@ -1525,13 +1545,15 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
{
unsigned long long ex = oldestLSN - restart_lsn;
- hint = true;
appendStringInfo(&err_detail,
ngettext("The slot's restart_lsn %X/%X exceeds the limit by %llu byte.",
"The slot's restart_lsn %X/%X exceeds the limit by %llu bytes.",
ex),
LSN_FORMAT_ARGS(restart_lsn),
ex);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "max_slot_wal_keep_size");
break;
}
case RS_INVAL_HORIZON:
@@ -1542,6 +1564,21 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
case RS_INVAL_WAL_LEVEL:
appendStringInfoString(&err_detail, _("Logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary server."));
break;
+
+ case RS_INVAL_IDLE_TIMEOUT:
+ {
+ int minutes = slot_idle_seconds / SECS_PER_MINUTE;
+ int secs = slot_idle_seconds % SECS_PER_MINUTE;
+
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_detail, _("The slot's idle time of %dmin %02ds exceeds the configured \"%s\" duration of %dmin."),
+ minutes, secs, "idle_replication_slot_timeout",
+ idle_replication_slot_timeout_mins);
+ /* translator: %s is a GUC variable name */
+ appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+ "idle_replication_slot_timeout");
+ break;
+ }
case RS_INVAL_NONE:
pg_unreachable();
}
@@ -1553,9 +1590,99 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
errmsg("invalidating obsolete replication slot \"%s\"",
NameStr(slotname)),
errdetail_internal("%s", err_detail.data),
- hint ? errhint("You might need to increase \"%s\".", "max_slot_wal_keep_size") : 0);
+ err_hint.len ? errhint("%s", err_hint.data) : 0);
pfree(err_detail.data);
+ pfree(err_hint.data);
+}
+
+/*
+ * Can we invalidate an idle replication slot?
+ *
+ * Idle timeout invalidation is allowed only when:
+ *
+ * 1. Idle timeout is set
+ * 2. Slot has reserved WAL
+ * 3. Slot is inactive
+ * 4. The slot is not being synced from the primary while the server is in
+ * recovery. This is because synced slots are always considered to be
+ *    inactive, as they don't perform logical decoding to produce changes.
+ */
+static inline bool
+CanInvalidateIdleSlot(ReplicationSlot *s)
+{
+ return (idle_replication_slot_timeout_mins != 0 &&
+ !XLogRecPtrIsInvalid(s->data.restart_lsn) &&
+ s->inactive_since > 0 &&
+ !(RecoveryInProgress() && s->data.synced));
+}
+
+/*
+ * DetermineSlotInvalidationCause - Determine the cause for which a slot
+ * becomes invalid among the given possible causes.
+ *
+ * This function sequentially checks all possible invalidation causes and
+ * returns the first one for which the slot is eligible for invalidation.
+ */
+static ReplicationSlotInvalidationCause
+DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s,
+ XLogRecPtr oldestLSN, Oid dboid,
+ TransactionId snapshotConflictHorizon,
+ TransactionId initial_effective_xmin,
+ TransactionId initial_catalog_effective_xmin,
+ XLogRecPtr initial_restart_lsn,
+ TimestampTz *inactive_since, TimestampTz now)
+{
+ Assert(possible_causes != RS_INVAL_NONE);
+
+ if (possible_causes & RS_INVAL_WAL_REMOVED)
+ {
+ if (initial_restart_lsn != InvalidXLogRecPtr &&
+ initial_restart_lsn < oldestLSN)
+ return RS_INVAL_WAL_REMOVED;
+ }
+
+ if (possible_causes & RS_INVAL_HORIZON)
+ {
+ /* invalid DB oid signals a shared relation */
+ if (SlotIsLogical(s) &&
+ (dboid == InvalidOid || dboid == s->data.database))
+ {
+ if (TransactionIdIsValid(initial_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_effective_xmin,
+ snapshotConflictHorizon))
+ return RS_INVAL_HORIZON;
+ else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
+ snapshotConflictHorizon))
+ return RS_INVAL_HORIZON;
+ }
+ }
+
+ if (possible_causes & RS_INVAL_WAL_LEVEL)
+ {
+ if (SlotIsLogical(s))
+ return RS_INVAL_WAL_LEVEL;
+ }
+
+ if (possible_causes & RS_INVAL_IDLE_TIMEOUT)
+ {
+ Assert(now > 0);
+
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ */
+ if (CanInvalidateIdleSlot(s) &&
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ *inactive_since = s->inactive_since;
+ return RS_INVAL_IDLE_TIMEOUT;
+ }
+ }
+
+ return RS_INVAL_NONE;
}
/*
@@ -1572,7 +1699,7 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
* for syscalls, so caller must restart if we return true.
*/
static bool
-InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
+InvalidatePossiblyObsoleteSlot(uint32 possible_causes,
ReplicationSlot *s,
XLogRecPtr oldestLSN,
Oid dboid, TransactionId snapshotConflictHorizon,
@@ -1585,6 +1712,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
XLogRecPtr initial_restart_lsn = InvalidXLogRecPtr;
ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
+ TimestampTz inactive_since = 0;
for (;;)
{
@@ -1592,6 +1720,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
NameData slotname;
int active_pid = 0;
ReplicationSlotInvalidationCause invalidation_cause = RS_INVAL_NONE;
+ TimestampTz now = 0;
+ long slot_idle_secs = 0;
Assert(LWLockHeldByMeInMode(ReplicationSlotControlLock, LW_SHARED));
@@ -1602,6 +1732,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
break;
}
+ if (possible_causes & RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * Assign the current time here to avoid system call overhead
+ * while holding the spinlock in subsequent code.
+ */
+ now = GetCurrentTimestamp();
+ }
+
/*
* Check if the slot needs to be invalidated. If it needs to be
* invalidated, and is not currently acquired, acquire it and mark it
@@ -1621,6 +1760,12 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
* those values change since the process holding the slot has been
* terminated (if any), so record them here to ensure that we
* would report the correct invalidation cause.
+ *
+ * Unlike the others, a slot's inactive_since can't change once the
+ * slot is acquired, until it is released or the process owning it
+ * gets terminated. The slot remains active as long as some process
+ * owns it, so an inactive slot can only be invalidated immediately,
+ * without terminating any process.
*/
if (!terminated)
{
@@ -1629,35 +1774,15 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
initial_catalog_effective_xmin = s->effective_catalog_xmin;
}
- switch (cause)
- {
- case RS_INVAL_WAL_REMOVED:
- if (initial_restart_lsn != InvalidXLogRecPtr &&
- initial_restart_lsn < oldestLSN)
- invalidation_cause = cause;
- break;
- case RS_INVAL_HORIZON:
- if (!SlotIsLogical(s))
- break;
- /* invalid DB oid signals a shared relation */
- if (dboid != InvalidOid && dboid != s->data.database)
- break;
- if (TransactionIdIsValid(initial_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = cause;
- else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
- TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
- snapshotConflictHorizon))
- invalidation_cause = cause;
- break;
- case RS_INVAL_WAL_LEVEL:
- if (SlotIsLogical(s))
- invalidation_cause = cause;
- break;
- case RS_INVAL_NONE:
- pg_unreachable();
- }
+ invalidation_cause = DetermineSlotInvalidationCause(possible_causes,
+ s, oldestLSN,
+ dboid,
+ snapshotConflictHorizon,
+ initial_effective_xmin,
+ initial_catalog_effective_xmin,
+ initial_restart_lsn,
+ &inactive_since,
+ now);
}
/*
@@ -1705,12 +1830,25 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
/*
* The logical replication slots shouldn't be invalidated as GUC
- * max_slot_wal_keep_size is set to -1 during the binary upgrade. See
- * check_old_cluster_for_valid_slots() where we ensure that no
- * invalidated before the upgrade.
+ * max_slot_wal_keep_size is set to -1 and
+ * idle_replication_slot_timeout is set to 0 during the binary
+ * upgrade. See check_old_cluster_for_valid_slots() where we ensure
+ * that no slots have been invalidated before the upgrade.
*/
Assert(!(*invalidated && SlotIsLogical(s) && IsBinaryUpgrade));
+ /*
+ * Calculate the slot's idle duration if it is marked as invalidated
+ * with RS_INVAL_IDLE_TIMEOUT.
+ */
+ if (invalidation_cause == RS_INVAL_IDLE_TIMEOUT)
+ {
+ int slot_idle_usecs;
+
+ TimestampDifference(inactive_since, now, &slot_idle_secs,
+ &slot_idle_usecs);
+ }
+
if (active_pid != 0)
{
/*
@@ -1739,7 +1877,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
{
ReportSlotInvalidation(invalidation_cause, true, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ slot_idle_secs);
if (MyBackendType == B_STARTUP)
(void) SendProcSignal(active_pid,
@@ -1785,7 +1924,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
ReportSlotInvalidation(invalidation_cause, false, active_pid,
slotname, restart_lsn,
- oldestLSN, snapshotConflictHorizon);
+ oldestLSN, snapshotConflictHorizon,
+ slot_idle_secs);
/* done with this slot for now */
break;
@@ -1802,26 +1942,32 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
*
* Returns true when any slot has been invalidated.
*
- * Whether a slot needs to be invalidated depends on the cause. A slot is
- * removed if it:
+ * Whether a slot needs to be invalidated depends on the invalidation cause.
+ * A slot is invalidated if it:
* - RS_INVAL_WAL_REMOVED: requires a LSN older than the given segment
* - RS_INVAL_HORIZON: requires a snapshot <= the given horizon in the given
* db; dboid may be InvalidOid for shared relations
- * - RS_INVAL_WAL_LEVEL: is logical
+ * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient
+ * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
+ * "idle_replication_slot_timeout" duration.
+ *
+ * Note: This function attempts to invalidate the slot for multiple possible
+ * causes in a single pass, minimizing redundant iterations. The
+ * "possible_causes" parameter is a bitmask of one or more of the defined
+ * causes.
*
* NB - this runs as part of checkpoint, so avoid raising errors if possible.
*/
bool
-InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
+InvalidateObsoleteReplicationSlots(uint32 possible_causes,
XLogSegNo oldestSegno, Oid dboid,
TransactionId snapshotConflictHorizon)
{
XLogRecPtr oldestLSN;
bool invalidated = false;
- Assert(cause != RS_INVAL_HORIZON || TransactionIdIsValid(snapshotConflictHorizon));
- Assert(cause != RS_INVAL_WAL_REMOVED || oldestSegno > 0);
- Assert(cause != RS_INVAL_NONE);
+ Assert(!(possible_causes & RS_INVAL_HORIZON) || TransactionIdIsValid(snapshotConflictHorizon));
+ Assert(!(possible_causes & RS_INVAL_WAL_REMOVED) || oldestSegno > 0);
+ Assert(possible_causes != RS_INVAL_NONE);
if (max_replication_slots == 0)
return invalidated;
@@ -1837,7 +1983,7 @@ restart:
if (!s->in_use)
continue;
- if (InvalidatePossiblyObsoleteSlot(cause, s, oldestLSN, dboid,
+ if (InvalidatePossiblyObsoleteSlot(possible_causes, s, oldestLSN, dboid,
snapshotConflictHorizon,
&invalidated))
{
@@ -2426,26 +2572,37 @@ RestoreSlotFromDisk(const char *name)
* ReplicationSlotInvalidationCause.
*/
ReplicationSlotInvalidationCause
-GetSlotInvalidationCause(const char *invalidation_reason)
+GetSlotInvalidationCause(const char *cause_name)
{
- ReplicationSlotInvalidationCause cause;
- ReplicationSlotInvalidationCause result = RS_INVAL_NONE;
- bool found PG_USED_FOR_ASSERTS_ONLY = false;
+ Assert(cause_name);
- Assert(invalidation_reason);
+ /* Search lookup table for the cause having this name */
+ for (int i = 0; i <= RS_INVAL_MAX_CAUSES; i++)
+ {
+ if (strcmp(SlotInvalidationCauses[i].cause_name, cause_name) == 0)
+ return SlotInvalidationCauses[i].cause;
+ }
+
+ Assert(false);
+ return RS_INVAL_NONE; /* to keep compiler quiet */
+}
- for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++)
+/*
+ * Maps a ReplicationSlotInvalidationCause to the invalidation
+ * reason for a replication slot.
+ */
+const char *
+GetSlotInvalidationCauseName(ReplicationSlotInvalidationCause cause)
+{
+ /* Search lookup table for the name of this cause */
+ for (int i = 0; i <= RS_INVAL_MAX_CAUSES; i++)
{
- if (strcmp(SlotInvalidationCauses[cause], invalidation_reason) == 0)
- {
- found = true;
- result = cause;
- break;
- }
+ if (SlotInvalidationCauses[i].cause == cause)
+ return SlotInvalidationCauses[i].cause_name;
}
- Assert(found);
- return result;
+ Assert(false);
+ return "none"; /* to keep compiler quiet */
}
/*
@@ -2802,3 +2959,22 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
ConditionVariableCancelSleep();
}
+
+/*
+ * GUC check_hook for idle_replication_slot_timeout
+ *
+ * The value of idle_replication_slot_timeout must be set to 0 during
+ * a binary upgrade. See start_postmaster() in pg_upgrade for more details.
+ */
+bool
+check_idle_replication_slot_timeout(int *newval, void **extra, GucSource source)
+{
+ if (IsBinaryUpgrade && *newval != 0)
+ {
+ GUC_check_errdetail("The value of \"%s\" must be set to 0 during binary upgrade mode.",
+ "idle_replication_slot_timeout");
+ return false;
+ }
+
+ return true;
+}
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 8be4b8c65b..f652ec8a73 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -431,7 +431,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
if (cause == RS_INVAL_NONE)
nulls[i++] = true;
else
- values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
+ values[i++] = CStringGetTextDatum(GetSlotInvalidationCauseName(cause));
values[i++] = BoolGetDatum(slot_contents.data.failover);
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae0506..9682f9dbdc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1786,6 +1786,24 @@ TimestampDifferenceExceeds(TimestampTz start_time,
return (diff >= msec * INT64CONST(1000));
}
+/*
+ * Check if the difference between two timestamps is >= a given
+ * threshold (expressed in seconds).
+ */
+bool
+TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec)
+{
+ long secs;
+ int usecs;
+
+ /* Calculate the difference in seconds */
+ TimestampDifference(start_time, stop_time, &secs, &usecs);
+
+ return (secs >= threshold_sec);
+}
+
/*
* Convert a time_t to TimestampTz.
*
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 4272818932..8d2ae66c20 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3058,6 +3058,18 @@ struct config_int ConfigureNamesInt[] =
NULL, NULL, NULL
},
+ {
+ {"idle_replication_slot_timeout", PGC_SIGHUP, REPLICATION_SENDING,
+ gettext_noop("Sets the duration a replication slot can remain idle before "
+ "it is invalidated."),
+ NULL,
+ GUC_UNIT_MIN
+ },
+ &idle_replication_slot_timeout_mins,
+ 0, 0, INT_MAX / SECS_PER_MINUTE,
+ check_idle_replication_slot_timeout, NULL, NULL
+ },
+
{
{"commit_delay", PGC_SUSET, WAL_SETTINGS,
gettext_noop("Sets the delay in microseconds between transaction commit and "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index d472987ed4..415f253096 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -329,6 +329,7 @@
# (change requires restart)
#wal_keep_size = 0 # in megabytes; 0 disables
#max_slot_wal_keep_size = -1 # in megabytes; -1 disables
+#idle_replication_slot_timeout = 0 # in minutes; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)
diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index 2d881d54f5..4968474d8c 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1439,6 +1439,10 @@ start_standby_server(const struct CreateSubscriberOptions *opt, bool restricted_
appendPQExpBuffer(pg_ctl_cmd, "\"%s\" start -D ", pg_ctl_path);
appendShellString(pg_ctl_cmd, subscriber_dir);
appendPQExpBuffer(pg_ctl_cmd, " -s -o \"-c sync_replication_slots=off\"");
+
+ /* Prevent unintended slot invalidation */
+ appendPQExpBuffer(pg_ctl_cmd, " -o \"-c idle_replication_slot_timeout=0\"");
+
if (restricted_access)
{
appendPQExpBuffer(pg_ctl_cmd, " -o \"-p %s\"", opt->sub_port);
diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index de6971cde6..873e5b5117 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -252,6 +252,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
appendPQExpBufferStr(&pgoptions, " -c max_slot_wal_keep_size=-1");
+ /*
+ * Use idle_replication_slot_timeout=0 to prevent slot invalidation due to
+ * idle_timeout by the checkpointer process during the upgrade.
+ */
+ if (GET_MAJOR_VERSION(cluster->major_version) >= 1800)
+ appendPQExpBufferStr(&pgoptions, " -c idle_replication_slot_timeout=0");
+
/*
* Use -b to disable autovacuum and logical replication launcher
* (effective in PG17 or later for the latter).
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 000c36d30d..f5a24ccfbf 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -44,21 +44,25 @@ typedef enum ReplicationSlotPersistency
* Slots can be invalidated, e.g. due to max_slot_wal_keep_size. If so, the
* 'invalidated' field is set to a value other than _NONE.
*
- * When adding a new invalidation cause here, remember to update
- * SlotInvalidationCauses and RS_INVAL_MAX_CAUSES.
+ * When adding a new invalidation cause here, the value must be a power of 2
+ * (e.g., 1, 2, 4...) for proper bitwise operations. Also, remember to update
+ * RS_INVAL_MAX_CAUSES below, and SlotInvalidationCauses in slot.c.
*/
typedef enum ReplicationSlotInvalidationCause
{
- RS_INVAL_NONE,
+ RS_INVAL_NONE = 0,
/* required WAL has been removed */
- RS_INVAL_WAL_REMOVED,
+ RS_INVAL_WAL_REMOVED = (1 << 0),
/* required rows have been removed */
- RS_INVAL_HORIZON,
+ RS_INVAL_HORIZON = (1 << 1),
/* wal_level insufficient for slot */
- RS_INVAL_WAL_LEVEL,
+ RS_INVAL_WAL_LEVEL = (1 << 2),
+ /* idle slot timeout has occurred */
+ RS_INVAL_IDLE_TIMEOUT = (1 << 3),
} ReplicationSlotInvalidationCause;
-extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
+/* Maximum number of invalidation causes */
+#define RS_INVAL_MAX_CAUSES 4
/*
* On-Disk data of a replication slot, preserved across restarts.
@@ -254,6 +258,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
/* GUCs */
extern PGDLLIMPORT int max_replication_slots;
extern PGDLLIMPORT char *synchronized_standby_slots;
+extern PGDLLIMPORT int idle_replication_slot_timeout_mins;
/* shmem initialization functions */
extern Size ReplicationSlotsShmemSize(void);
@@ -286,7 +291,7 @@ extern void ReplicationSlotsComputeRequiredLSN(void);
extern XLogRecPtr ReplicationSlotsComputeLogicalRestartLSN(void);
extern bool ReplicationSlotsCountDBSlots(Oid dboid, int *nslots, int *nactive);
extern void ReplicationSlotsDropDBSlots(Oid dboid);
-extern bool InvalidateObsoleteReplicationSlots(ReplicationSlotInvalidationCause cause,
+extern bool InvalidateObsoleteReplicationSlots(uint32 possible_causes,
XLogSegNo oldestSegno,
Oid dboid,
TransactionId snapshotConflictHorizon);
@@ -303,6 +308,7 @@ extern void CheckSlotRequirements(void);
extern void CheckSlotPermissions(void);
extern ReplicationSlotInvalidationCause
GetSlotInvalidationCause(const char *invalidation_reason);
+extern const char *GetSlotInvalidationCauseName(ReplicationSlotInvalidationCause cause);
extern bool SlotExistsInSyncStandbySlots(const char *slot_name);
extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 87999218d6..951451a976 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -174,5 +174,7 @@ extern void assign_wal_sync_method(int new_wal_sync_method, void *extra);
extern bool check_synchronized_standby_slots(char **newval, void **extra,
GucSource source);
extern void assign_synchronized_standby_slots(const char *newval, void *extra);
+extern bool check_idle_replication_slot_timeout(int *newval, void **extra,
+ GucSource source);
#endif /* GUC_HOOKS_H */
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb8..9963bddc0e 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -107,6 +107,9 @@ extern long TimestampDifferenceMilliseconds(TimestampTz start_time,
extern bool TimestampDifferenceExceeds(TimestampTz start_time,
TimestampTz stop_time,
int msec);
+extern bool TimestampDifferenceExceedsSeconds(TimestampTz start_time,
+ TimestampTz stop_time,
+ int threshold_sec);
extern TimestampTz time_t_to_timestamptz(pg_time_t tm);
extern pg_time_t timestamptz_to_time_t(TimestampTz t);
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b6c170ac24..55fcfe482a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2688,6 +2688,7 @@ SkipPages
SlabBlock
SlabContext
SlabSlot
+SlotInvalidationCauseMap
SlotNumber
SlotSyncCtxStruct
SlruCtl
--
2.34.1
Attachment: v79-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patch (application/octet-stream)
From c3cbafdd302fb4e284a9ce04b725089675c935fc Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Tue, 11 Feb 2025 17:26:15 +0530
Subject: [PATCH v79 2/2] Add TAP test for slot invalidation based on inactive
timeout.
This test uses injection points to bypass the time overhead caused by the
idle_replication_slot_timeout GUC, which has a minimum value of one minute.
---
src/backend/replication/slot.c | 29 +++--
src/test/recovery/meson.build | 1 +
.../t/044_invalidate_inactive_slots.pl | 106 ++++++++++++++++++
3 files changed, 127 insertions(+), 9 deletions(-)
create mode 100644 src/test/recovery/t/044_invalidate_inactive_slots.pl
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 674998b4e9..68d6f006f5 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -55,6 +55,7 @@
#include "storage/proc.h"
#include "storage/procarray.h"
#include "utils/builtins.h"
+#include "utils/injection_point.h"
#include "utils/guc_hooks.h"
#include "utils/varlena.h"
@@ -1669,16 +1670,26 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s,
{
Assert(now > 0);
- /*
- * Check if the slot needs to be invalidated due to
- * idle_replication_slot_timeout GUC.
- */
- if (CanInvalidateIdleSlot(s) &&
- TimestampDifferenceExceedsSeconds(s->inactive_since, now,
- idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ if (CanInvalidateIdleSlot(s))
{
- *inactive_since = s->inactive_since;
- return RS_INVAL_IDLE_TIMEOUT;
+ /*
+ * Check if the slot needs to be invalidated due to
+ * idle_replication_slot_timeout GUC.
+ *
+ * To test idle timeout slot invalidation, if the
+ * "slot-timeout-inval" injection point is attached, immediately
+ * invalidate the slot.
+ */
+ if (
+#ifdef USE_INJECTION_POINTS
+ IS_INJECTION_POINT_ATTACHED("slot-timeout-inval") ||
+#endif
+ TimestampDifferenceExceedsSeconds(s->inactive_since, now,
+ idle_replication_slot_timeout_mins * SECS_PER_MINUTE))
+ {
+ *inactive_since = s->inactive_since;
+ return RS_INVAL_IDLE_TIMEOUT;
+ }
}
}
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 0428704dbf..057bcde143 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -52,6 +52,7 @@ tests += {
't/041_checkpoint_at_promote.pl',
't/042_low_level_backup.pl',
't/043_no_contrecord_switch.pl',
+ 't/044_invalidate_inactive_slots.pl',
],
},
}
diff --git a/src/test/recovery/t/044_invalidate_inactive_slots.pl b/src/test/recovery/t/044_invalidate_inactive_slots.pl
new file mode 100644
index 0000000000..949b0aa7be
--- /dev/null
+++ b/src/test/recovery/t/044_invalidate_inactive_slots.pl
@@ -0,0 +1,106 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation due to idle_timeout
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# This test depends on an injection point that forces slot invalidation
+# due to idle_timeout.
+# https://www.postgresql.org/docs/current/xfunc-c.html#XFUNC-ADDIN-INJECTION-POINTS
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+# Wait for slot to first become idle and then get invalidated
+sub wait_for_slot_invalidation
+{
+ my ($node, $slot_name, $offset) = @_;
+ my $node_name = $node->name;
+
+ # The slot's invalidation should be logged
+ $node->wait_for_log(
+ qr/invalidating obsolete replication slot \"$slot_name\"/, $offset);
+
+ # Check that the invalidation reason is 'idle_timeout'
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'idle_timeout';
+ ])
+ or die
+ "Timed out while waiting for invalidation reason of slot $slot_name to be set on node $node_name";
+}
+
+# ========================================================================
+# Testcase start
+#
+# Test invalidation of physical replication slot and logical replication slot
+# due to idle timeout.
+
+# Initialize the node
+my $node = PostgreSQL::Test::Cluster->new('node');
+$node->init(allows_streaming => 'logical');
+
+# Avoid unpredictability
+$node->append_conf(
+ 'postgresql.conf', qq{
+checkpoint_timeout = 1h
+idle_replication_slot_timeout = 1min
+});
+$node->start;
+
+# Check if the 'injection_points' extension is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create both physical and logical replication slots
+$node->safe_psql(
+ 'postgres', qq[
+ SELECT pg_create_physical_replication_slot(slot_name := 'physical_slot', immediately_reserve := true);
+ SELECT pg_create_logical_replication_slot('logical_slot', 'test_decoding');
+]);
+
+my $log_offset = -s $node->logfile;
+
+# Register an injection point on the node to forcibly cause a slot
+# invalidation due to idle_timeout
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+$node->safe_psql('postgres',
+ "SELECT injection_points_attach('slot-timeout-inval', 'error');");
+
+# Idle timeout slot invalidation occurs during a checkpoint, so run a
+# checkpoint to invalidate the slots.
+$node->safe_psql('postgres', "CHECKPOINT");
+
+# Wait for the slots to be invalidated. Note that since nobody has acquired
+# either slot yet, any invalidation can only be due to the idle timeout
+# mechanism.
+wait_for_slot_invalidation($node, 'physical_slot', $log_offset);
+wait_for_slot_invalidation($node, 'logical_slot', $log_offset);
+
+# Check that the invalidated slot cannot be acquired
+my ($result, $stdout, $stderr);
+($result, $stdout, $stderr) = $node->psql(
+ 'postgres', qq[
+ SELECT pg_replication_slot_advance('logical_slot', '0/1');
+]);
+ok( $stderr =~ /can no longer access replication slot "logical_slot"/,
+ "detected error upon trying to acquire invalidated slot on node")
+ or die
+ "could not detect error upon trying to acquire invalidated slot \"logical_slot\" on node";
+
+# Testcase end
+# =============================================================================
+
+done_testing();
--
2.34.1
On Mon, Feb 17, 2025 at 07:57:22AM +0530, Amit Kapila wrote:
On Wed, Feb 12, 2025 at 1:16 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
On Wednesday, February 12, 2025 11:56 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Also, we previously didn't have a good experience with XID-based threshold
parameters like vacuum_defer_cleanup_age as mentioned by Robert (1).
AFAICU from the previous discussion, we need a time-based parameter, and we
didn't rule out an xid_age-based parameter as an additional one.
I am not sure I buy the comparison with vacuum_defer_cleanup_age. That is
a very different feature than max_slot_xid_age, and we still have a number
of XID-based parameters (vacuum_freeze_table_age, vacuum_freeze_min_age,
vacuum_failsafe_age, the multixact versions of those parameters, and the
autovacuum versions).
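
For reference, the XID horizon that a slot holds back can already be watched
from pg_replication_slots, independently of any new GUC; a minimal monitoring
sketch (the 1.5 billion threshold is purely illustrative, not a value proposed
in this thread):

    -- Report slots whose xmin or catalog_xmin has grown dangerously old.
    SELECT slot_name,
           active,
           age(xmin)         AS xmin_age,
           age(catalog_xmin) AS catalog_xmin_age
    FROM pg_replication_slots
    WHERE GREATEST(age(xmin), age(catalog_xmin)) > 1500000000;
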
Yeah, I think the primary purpose of this time-based option is to invalidate
dormant replication slots that have been inactive for a long period, in which
case the slots are no longer useful.

Such slots can remain if a subscriber is down due to a system error or is
inaccessible because of network issues. If this situation persists, it might
be more practical to recreate the subscriber rather than attempt to recover
the node and wait for it to catch up, which could be time-consuming.

Parameters like max_slot_wal_keep_size and max_slot_xid_age do not
differentiate between active and inactive replication slots. Some customers I
met are hesitant about using these settings, as they can sometimes invalidate
a slot unnecessarily and break the replication.
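
To make the intended usage concrete, here is a rough sketch combining the new
GUC with the existing pg_replication_slots columns (the one-day value is only
an example, and invalidation itself only happens when the checkpointer runs):

    -- postgresql.conf (or ALTER SYSTEM): invalidate slots that stay idle for
    -- more than a day; 0, the default, disables the timeout entirely.
    -- idle_replication_slot_timeout = '1d'

    -- Slots invalidated by the timeout can then be spotted with:
    SELECT slot_name, active, inactive_since, invalidation_reason
    FROM pg_replication_slots
    WHERE invalidation_reason = 'idle_timeout';
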
Sure, an inactive-timeout feature won't break replication, but it's also
not going to be terribly effective against wraparound-related issues. It
seems weird to me to allow an active replication slot to take priority over
imminent storage/XID issues it causes.
Alvaro, Nathan, do let us know if you would like to discuss more on
the use case for this new GUC idle_replication_slot_timeout?
Otherwise, we can proceed with this patch.
I guess I'm not mortally opposed to it. I just think we really need
proper backstops against the storage/XID issues more than we need this one,
and I don't want it to be mistaken for a solution to those problems.
--
nathan
On Mon, Feb 17, 2025 at 10:18 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:
On Mon, Feb 17, 2025 at 07:57:22AM +0530, Amit Kapila wrote:
Alvaro, Nathan, do let us know if you would like to discuss more on
the use case for this new GUC idle_replication_slot_timeout?
Otherwise, we can proceed with this patch.
I guess I'm not mortally opposed to it. I just think we really need
proper backstops against the storage/XID issues more than we need this one,
and I don't want it to be mistaken for a solution to those problems.
Fair enough. I see your point and would like to discuss the other
parameter in a separate thread. I plan to push the 0001 tomorrow after
some more review/testing unless I see any further arguments or
comments.
--
With Regards,
Amit Kapila.
On Tue, Feb 18, 2025 at 8:42 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Feb 17, 2025 at 10:18 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:
On Mon, Feb 17, 2025 at 07:57:22AM +0530, Amit Kapila wrote:
Alvaro, Nathan, do let us know if you would like to discuss more on
the use case for this new GUC idle_replication_slot_timeout?
Otherwise, we can proceed with this patch.
I guess I'm not mortally opposed to it. I just think we really need
proper backstops against the storage/XID issues more than we need this one,
and I don't want it to be mistaken for a solution to those problems.
Fair enough. I see your point and would like to discuss the other
parameter in a separate thread. I plan to push the 0001 tomorrow after
some more review/testing unless I see any further arguments or
comments.
Pushed after minor modifications.
--
With Regards,
Amit Kapila.
On Wed, Feb 19, 2025 at 3:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Pushed after minor modifications.
I have closed the corresponding CF entry. Please feel free to start a
new thread for the xid_age-based parameter.
--
With Regards,
Amit Kapila.