pgsql: Track last_inactive_time in pg_replication_slots.

Started by Amit Kapila, almost 2 years ago (33 messages)
#1Amit Kapila
akapila@postgresql.org

Track last_inactive_time in pg_replication_slots.

This commit adds a new property called last_inactive_time for slots. It is
set to 0 whenever a slot is made active/acquired and set to the current
timestamp whenever the slot is inactive/released or restored from the disk.
Note that we don't set the last_inactive_time for the slots currently being
synced from the primary to the standby because such slots are typically
inactive as decoding is not allowed on those.

The 'last_inactive_time' will be useful on production servers to debug and
analyze inactive replication slots. It will also help to know the lifetime
of a replication slot - one can know how long a streaming standby, logical
subscriber, or replication slot consumer is down.
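
For example, the downtime of an inactive slot's consumer can be read with
a query along these lines (illustrative only):

    SELECT slot_name, now() - last_inactive_time AS downtime
    FROM pg_replication_slots
    WHERE last_inactive_time IS NOT NULL;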

The 'last_inactive_time' will also be useful to implement inactive
timeout-based replication slot invalidation in a future commit.

Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: /messages/by-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/a11f330b5584f2430371d68871e00f5c63735299

Modified Files
--------------
doc/src/sgml/system-views.sgml | 10 ++
src/backend/catalog/system_views.sql | 1 +
src/backend/replication/slot.c | 35 +++++++
src/backend/replication/slotfuncs.c | 7 +-
src/include/catalog/catversion.h | 2 +-
src/include/catalog/pg_proc.dat | 6 +-
src/include/replication/slot.h | 3 +
src/test/recovery/t/019_replslot_limit.pl | 152 ++++++++++++++++++++++++++++++
src/test/regress/expected/rules.out | 3 +-
9 files changed, 213 insertions(+), 6 deletions(-)

#2Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#1)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Mon, Mar 25, 2024 at 7:11 AM Amit Kapila <akapila@postgresql.org> wrote:

Track last_inactive_time in pg_replication_slots.

This commit adds a new property called last_inactive_time for slots. It is
set to 0 whenever a slot is made active/acquired and set to the current
timestamp whenever the slot is inactive/released or restored from the disk.
Note that we don't set the last_inactive_time for the slots currently being
synced from the primary to the standby because such slots are typically
inactive as decoding is not allowed on those.

So the field name is last_inactive_time, but if I'm reading this
description right, it's actually the last time the slot was active,
except for the weird exception for slots being synced. I'm wondering
if this field needs to be renamed.

And I'm suspicious that having an exception for slots being synced is
a bad idea. That makes too much of a judgement about how the user will
use this field. It's usually better to just expose the data, and if
the user needs help to make sense of that data, then give them that
help separately. In this case, that would mean removing the exception,
but making it easy to tell the difference between slots that are
inactive because they're being synced and slots that are inactive for
some other reason.

--
Robert Haas
EDB: http://www.enterprisedb.com

#3Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#2)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Mon, Mar 25, 2024 at 6:57 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Mar 25, 2024 at 7:11 AM Amit Kapila <akapila@postgresql.org> wrote:

Track last_inactive_time in pg_replication_slots.

This commit adds a new property called last_inactive_time for slots. It is
set to 0 whenever a slot is made active/acquired and set to the current
timestamp whenever the slot is inactive/released or restored from the disk.
Note that we don't set the last_inactive_time for the slots currently being
synced from the primary to the standby because such slots are typically
inactive as decoding is not allowed on those.

So the field name is last_inactive_time, but if I'm reading this
description right, it's actually the last time the slot was active,
except for the weird exception for slots being synced. I'm wondering
if this field needs to be renamed.

We considered the other two names as last_inactive_at and
last_active_time. For the first (last_inactive_at), there was an
argument that most other fields that display time end with _time. For
the second (last_active_time), there was an argument that it could be
misleading as one could think that it should be updated each time WAL
record decoding is happening [1]. The other possibility is to name it
last_used_time but I think it won't be much different from
last_active_time.

And I'm suspicious that having an exception for slots being synced is
a bad idea. That makes too much of a judgement about how the user will
use this field. It's usually better to just expose the data, and if
the user needs help to make sense of that data, then give them that
help separately.

The reason we didn't set this for sync slots is that they won't be
usable (one can't use them to decode WAL) unless standby is promoted
[2]. But I see your point as well. So, I have copied the others
involved in this discussion to see what they think.

In this case, that would mean removing the exception,
but making it easy to tell the difference between slots that are
inactive because they're being synced and slots that are inactive for
some other reason.

I think this can be differentiated with the help of 'synced' column in
pg_replication_slots.
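
For instance, a query like the following (illustrative only) would separate
the two cases:

    SELECT slot_name, synced, last_inactive_time
    FROM pg_replication_slots
    WHERE active = false;

Slots with synced = true are inactive because they are being synced from the
primary; the rest are inactive for some other reason.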

[1]: /messages/by-id/Zf1yx9QMbhgJ/Lfy@ip-10-97-1-34.eu-west-3.compute.internal
[2]: /messages/by-id/CAJpy0uBGv85dFiWMnNLm6NuEs3eTVicsJCyRvMGbR8H+fOVBnA@mail.gmail.com

--
With Regards,
Amit Kapila.

#4Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#3)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Mon, Mar 25, 2024 at 10:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

We considered the other two names as last_inactive_at and
last_active_time. For the first (last_inactive_at), there was an
argument that most other fields that display time end with _time. For
the second (last_active_time), there was an argument that it could be
misleading as one could think that it should be updated each time WAL
record decoding is happening [1]. The other possibility is to name it
last_used_time but I think it won't be much different from
last_active_time.

I don't understand the bit about updating it each time WAL record
decoding is happening. If it's the last active time, and the slot is
currently active, then the answer is either "right now" or "currently
undefined." I'd expect to see NULL in the system view in such a case.
And if that's so, then there's nothing to update each time a record is
decoded, because it's just still going to show NULL.

Why does this field get set to the current time when the slot is
restored from disk?

--
Robert Haas
EDB: http://www.enterprisedb.com

#5Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#4)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Mon, Mar 25, 2024 at 7:51 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Mar 25, 2024 at 10:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

We considered the other two names as last_inactive_at and
last_active_time. For the first (last_inactive_at), there was an
argument that most other fields that display time end with _time. For
the second (last_active_time), there was an argument that it could be
misleading as one could think that it should be updated each time WAL
record decoding is happening [1]. The other possibility is to name it
last_used_time but I think it won't be much different from
last_active_time.

I don't understand the bit about updating it each time WAL record
decoding is happening. If it's the last active time, and the slot is
currently active, then the answer is either "right now" or "currently
undefined." I'd expect to see NULL in the system view in such a case.
And if that's so, then there's nothing to update each time a record is
decoded, because it's just still going to show NULL.

IIUC, Bertrand's point was that users can interpret last_active_time
as a value that gets updated each time they decode a change which is
not what we are doing. So, this can confuse users. Your expectation of
answer (NULL) when the slot is active is correct and that is what will
happen.

Why does this field get set to the current time when the slot is
restored from disk?

It is because we don't want to include the time the server is down in
the last_inactive_time. Say, if we are shutting down the server at
time X and the server remains down for another two hours, we don't
want to include those two hours as the slot inactive time. The related
theory is that this field will be used to invalidate inactive slots
based on a threshold (say inactive_timeout). Say, before the shutdown,
we release the slot and set the current_time for last_inactive_time
for each slot and persist that information as well. Now, if the server
is down for a long time, we may invalidate the slots as soon as the
server comes up. So, instead, we just set this field at the time we
read slots from disk and then reset it to 0/NULL as soon as the slot
becomes active.
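
To make that concrete, the invalidation check would conceptually look like
the query below (a sketch only; the one-day threshold stands in for the
yet-to-be-designed inactive_timeout):

    SELECT slot_name
    FROM pg_replication_slots
    WHERE last_inactive_time IS NOT NULL
      AND last_inactive_time < now() - interval '1 day';

Setting the field at the time we read slots from disk, rather than persisting
the pre-shutdown value, keeps the server-down window out of this comparison.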

--
With Regards,
Amit Kapila.

#6Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Amit Kapila (#5)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

Hi,

On Mon, Mar 25, 2024 at 08:38:16PM +0530, Amit Kapila wrote:

On Mon, Mar 25, 2024 at 7:51 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Mar 25, 2024 at 10:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

We considered the other two names as last_inactive_at and
last_active_time. For the first (last_inactive_at), there was an
argument that most other fields that display time end with _time. For
the second (last_active_time), there was an argument that it could be
misleading as one could think that it should be updated each time WAL
record decoding is happening [1]. The other possibility is to name it
last_used_time but I think it won't be much different from
last_active_time.

I don't understand the bit about updating it each time WAL record
decoding is happening. If it's the last active time, and the slot is
currently active, then the answer is either "right now" or "currently
undefined." I'd expect to see NULL in the system view in such a case.
And if that's so, then there's nothing to update each time a record is
decoded, because it's just still going to show NULL.

IIUC, Bertrand's point was that users can interpret last_active_time
as a value that gets updated each time they decode a change which is
not what we are doing. So, this can confuse users. Your expectation of
answer (NULL) when the slot is active is correct and that is what will
happen.

Yeah, and so would be the confusion: why is last_active_time NULL while one is
using the slot?

Why does this field get set to the current time when the slot is
restored from disk?

It is because we don't want to include the time the server is down in
the last_inactive_time. Say, if we are shutting down the server at
time X and the server remains down for another two hours, we don't
want to include those two hours as the slot inactive time. The related
theory is that this field will be used to invalidate inactive slots
based on a threshold (say inactive_timeout). Say, before the shutdown,
we release the slot and set the current_time for last_inactive_time
for each slot and persist that information as well. Now, if the server
is down for a long time, we may invalidate the slots as soon as the
server comes up. So, instead, we just set this field at the time we
read slots from disk and then reset it to 0/NULL as soon as the slot
becomes active.

Right, and we also want to invalidate the slot if its unused duration exceeds
the timeout, so setting the field to zero when the slot is restored from disk
is not an option either.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#7Amit Kapila
amit.kapila16@gmail.com
In reply to: Bertrand Drouvot (#6)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Mon, Mar 25, 2024 at 8:46 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

On Mon, Mar 25, 2024 at 08:38:16PM +0530, Amit Kapila wrote:

On Mon, Mar 25, 2024 at 7:51 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Mar 25, 2024 at 10:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

We considered the other two names as last_inactive_at and
last_active_time. For the first (last_inactive_at), there was an
argument that most other fields that display time end with _time. For
the second (last_active_time), there was an argument that it could be
misleading as one could think that it should be updated each time WAL
record decoding is happening [1]. The other possibility is to name it
last_used_time but I think it won't be much different from
last_active_time.

I don't understand the bit about updating it each time WAL record
decoding is happening. If it's the last active time, and the slot is
currently active, then the answer is either "right now" or "currently
undefined." I'd expect to see NULL in the system view in such a case.
And if that's so, then there's nothing to update each time a record is
decoded, because it's just still going to show NULL.

IIUC, Bertrand's point was that users can interpret last_active_time
as a value that gets updated each time they decode a change which is
not what we are doing. So, this can confuse users. Your expectation of
answer (NULL) when the slot is active is correct and that is what will
happen.

Yeah, and so would be the confusion: why is last_active_time NULL while one is
using the slot?

It is because we set it to zero when we acquire the slot and that
value will remain the same as long as the slot is active. I am not sure if I
understood your question, so what I am saying might not make sense.

--
With Regards,
Amit Kapila.

#8Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Amit Kapila (#7)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

Hi,

On Mon, Mar 25, 2024 at 08:59:55PM +0530, Amit Kapila wrote:

On Mon, Mar 25, 2024 at 8:46 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

On Mon, Mar 25, 2024 at 08:38:16PM +0530, Amit Kapila wrote:

On Mon, Mar 25, 2024 at 7:51 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Mar 25, 2024 at 10:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

We considered the other two names as last_inactive_at and
last_active_time. For the first (last_inactive_at), there was an
argument that most other fields that display time end with _time. For
the second (last_active_time), there was an argument that it could be
misleading as one could think that it should be updated each time WAL
record decoding is happening [1]. The other possibility is to name it
last_used_time but I think it won't be much different from
last_active_time.

I don't understand the bit about updating it each time WAL record
decoding is happening. If it's the last active time, and the slot is
currently active, then the answer is either "right now" or "currently
undefined." I'd expect to see NULL in the system view in such a case.
And if that's so, then there's nothing to update each time a record is
decoded, because it's just still going to show NULL.

IIUC, Bertrand's point was that users can interpret last_active_time
as a value that gets updated each time they decode a change which is
not what we are doing. So, this can confuse users. Your expectation of
answer (NULL) when the slot is active is correct and that is what will
happen.

Yeah, and so would be the confusion: why is last_active_time NULL while one is
using the slot?

It is because we set it to zero when we acquire the slot and that
value will remain the same as long as the slot is active. I am not sure if I
understood your question, so what I am saying might not make sense.

There is no "real" question, I was just highlighting the confusion in case we
name the field "last_active_time".

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#9Robert Haas
robertmhaas@gmail.com
In reply to: Bertrand Drouvot (#6)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Mon, Mar 25, 2024 at 11:16 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

IIUC, Bertrand's point was that users can interpret last_active_time
as a value that gets updated each time they decode a change which is
not what we are doing. So, this can confuse users. Your expectation of
answer (NULL) when the slot is active is correct and that is what will
happen.

Yeah, and so would be the confusion: why is last_active_time NULL while one is
using the slot?

I agree that users could get confused here, but the solution to that
shouldn't be to give the field a name that is the opposite of what it
actually does. I expect a field called last_inactive_time to tell me
the last time that the slot was inactive. Here, it tells me the last
time that a currently-inactive slot previously *WAS* active. How can
you justify calling that the last *INACTIVE* time?

AFAICS, the user who has the confusion that you mention here is simply
wrong. If they are looking at a field called "last active time" and
the slot is active, then the correct answer is "right now" or
"undefined" and that is what they will see. Sure, they might not
understand that. But flipping the name of the field on its head cannot
be the right way to help them.

With the current naming, I expect to have the exact opposite confusion
as your hypothetical confused user. I'm going to be looking at a slot
that's currently inactive, and it's going to tell me that the
last_inactive_time was at some time in the past. And I'm going to say
"what the heck is going on here, the slot is inactive *right now*!"

Half of me wonders whether we should avoid this whole problem by
renaming it to something like last_state_change or
last_state_change_time, or maybe just state_change like we do in
pg_stat_activity, and making it mean the last time the slot flipped
between active and inactive in either direction. I'm not sure if this
is better, but unless I'm misunderstanding something, the current
situation is terrible.

--
Robert Haas
EDB: http://www.enterprisedb.com

#10Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Robert Haas (#9)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

Hi,

On Mon, Mar 25, 2024 at 11:49:00AM -0400, Robert Haas wrote:

On Mon, Mar 25, 2024 at 11:16 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

IIUC, Bertrand's point was that users can interpret last_active_time
as a value that gets updated each time they decode a change which is
not what we are doing. So, this can confuse users. Your expectation of
answer (NULL) when the slot is active is correct and that is what will
happen.

Yeah, and so would be the confusion: why is last_active_time NULL while one is
using the slot?

I agree that users could get confused here, but the solution to that
shouldn't be to give the field a name that is the opposite of what it
actually does. I expect a field called last_inactive_time to tell me
the last time that the slot was inactive. Here, it tells me the last
time that a currently-inactive slot previously *WAS* active. How can
you justify calling that the last *INACTIVE* time?

AFAICS, the user who has the confusion that you mention here is simply
wrong. If they are looking at a field called "last active time" and
the slot is active, then the correct answer is "right now" or
"undefined" and that is what they will see. Sure, they might not
understand that. But flipping the name of the field on its head cannot
be the right way to help them.

With the current naming, I expect to have the exact opposite confusion
as your hypothetical confused user. I'm going to be looking at a slot
that's currently inactive, and it's going to tell me that the
last_inactive_time was at some time in the past. And I'm going to say
"what the heck is going on here, the slot is inactive *right now*!"

Half of me wonders whether we should avoid this whole problem by
renaming it to something like last_state_change or
last_state_change_time, or maybe just state_change like we do in
pg_stat_activity, and making it mean the last time the slot flipped
between active and inactive in either direction. I'm not sure if this
is better, but unless I'm misunderstanding something, the current
situation is terrible.

Now that I read your arguments I think that last_<active|inactive>_time could
both be misleading because in the end they rely on users' expectations.

Would "released_time" sound better? (In the end this is exactly what it
represents, except for the case where the slot is restored from disk, where
the meaning would still make sense to me.) It seems to me that released_time
does not lead to any expectation, thus removing any confusion.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#11Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Amit Kapila (#3)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

Hi,

On Mon, Mar 25, 2024 at 07:32:11PM +0530, Amit Kapila wrote:

On Mon, Mar 25, 2024 at 6:57 PM Robert Haas <robertmhaas@gmail.com> wrote:

And I'm suspicious that having an exception for slots being synced is
a bad idea. That makes too much of a judgement about how the user will
use this field. It's usually better to just expose the data, and if
the user needs help to make sense of that data, then give them that
help separately.

The reason we didn't set this for sync slots is that they won't be
usable (one can't use them to decode WAL) unless standby is promoted
[2]. But I see your point as well. So, I have copied the others
involved in this discussion to see what they think.

Yeah I also see Robert's point. If we also sync the "last inactive time" field then
we would need to take care of the corner case mentioned by Shveta in [1] during
promotion.

[1]: /messages/by-id/CAJpy0uCLu+mqAwAMum=pXE9YYsy0BE7hOSw_Wno5vjwpFY=63g@mail.gmail.com

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#12Robert Haas
robertmhaas@gmail.com
In reply to: Bertrand Drouvot (#10)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Mon, Mar 25, 2024 at 12:12 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Now that I read your arguments I think that last_<active|inactive>_time could
both be misleading because in the end they rely on users' expectations.

Well, the user is always going to expect *something* -- that's just
how language works.

Would "released_time" sounds better? (at the end this is exactly what it does
represent unless for the case where it is restored from disk for which the meaning
would still makes sense to me though). It seems to me that released_time does not
lead to any expectation then removing any confusion.

Yeah, that's not bad. I mean, I don't agree that released_time doesn't
lead to any expectation, but what it leads me to expect is that you're
going to tell me the time at which the slot was released. So if it's
currently active, then I see NULL, because it's not released; but if
it's inactive, then I see the time at which it became so.

In the same vein, I think deactivated_at or inactive_since might be
good names to consider. I think they get at the same thing as
released_time, but they avoid introducing a completely new word
(release, as opposed to active/inactive).

--
Robert Haas
EDB: http://www.enterprisedb.com

#13Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Robert Haas (#12)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

Hi,

On Mon, Mar 25, 2024 at 12:25:37PM -0400, Robert Haas wrote:

On Mon, Mar 25, 2024 at 12:12 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Would "released_time" sounds better? (at the end this is exactly what it does
represent unless for the case where it is restored from disk for which the meaning
would still makes sense to me though). It seems to me that released_time does not
lead to any expectation then removing any confusion.

Yeah, that's not bad. I mean, I don't agree that released_time doesn't
lead to any expectation,
but what it leads me to expect is that you're
going to tell me the time at which the slot was released. So if it's
currently active, then I see NULL, because it's not released; but if
it's inactive, then I see the time at which it became so.

In the same vein, I think deactivated_at or inactive_since might be
good names to consider. I think they get at the same thing as
released_time, but they avoid introducing a completely new word
(release, as opposed to active/inactive).

Yeah, I'd vote for inactive_since then.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#14Nathan Bossart
nathandbossart@gmail.com
In reply to: Bertrand Drouvot (#13)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Mon, Mar 25, 2024 at 04:49:12PM +0000, Bertrand Drouvot wrote:

On Mon, Mar 25, 2024 at 12:25:37PM -0400, Robert Haas wrote:

In the same vein, I think deactivated_at or inactive_since might be
good names to consider. I think they get at the same thing as
released_time, but they avoid introducing a completely new word
(release, as opposed to active/inactive).

Yeah, I'd vote for inactive_since then.

Having only skimmed some of the related discussions, I'm inclined to agree
that inactive_since provides the clearest description for the column.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#15Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Nathan Bossart (#14)
1 attachment(s)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Tue, Mar 26, 2024 at 1:30 AM Nathan Bossart <nathandbossart@gmail.com> wrote:

On Mon, Mar 25, 2024 at 04:49:12PM +0000, Bertrand Drouvot wrote:

On Mon, Mar 25, 2024 at 12:25:37PM -0400, Robert Haas wrote:

In the same vein, I think deactivated_at or inactive_since might be
good names to consider. I think they get at the same thing as
released_time, but they avoid introducing a completely new word
(release, as opposed to active/inactive).

Yeah, I'd vote for inactive_since then.

Having only skimmed some of the related discussions, I'm inclined to agree
that inactive_since provides the clearest description for the column.

I think we all have some agreement on inactive_since. So, I'm
attaching the patch for that change.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Use-a-less-confusing-name-for-slot-s-last_inactiv.patch (application/octet-stream)
From 03183dfec1a78a158a8b8d889b20d4332f0fc1e9 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 25 Mar 2024 19:22:13 +0000
Subject: [PATCH v1] Use a less confusing name for slot's last_inactive_time
 property

The slot's last_inactive_time property added by commit a11f330b55
seems confusing. With last_inactive_time one expects it to tell
the last time that the slot was inactive. But, it tells the last
time that a currently-inactive slot previously *WAS* active.

This commit uses a less confusing and better name for the property
called inactive_since. Other names considered were released_time,
deactivated_at but inactive_since won the race since the word
inactive is predominant as far as the replication slots are
concerned.

Reported-by: Robert Haas
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila
Discussion: https://www.postgresql.org/message-id/ZgGrCBQoktdLi1Ir%40ip-10-97-1-34.eu-west-3.compute.internal
---
 doc/src/sgml/system-views.sgml            |  4 +-
 src/backend/catalog/system_views.sql      |  2 +-
 src/backend/replication/slot.c            | 35 +++++++------
 src/backend/replication/slotfuncs.c       |  4 +-
 src/include/replication/slot.h            |  4 +-
 src/test/recovery/t/019_replslot_limit.pl | 62 +++++++++++------------
 src/test/regress/expected/rules.out       |  4 +-
 7 files changed, 59 insertions(+), 56 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 5f4165a945..19a08ca0ac 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,10 +2525,10 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>last_inactive_time</structfield> <type>timestamptz</type>
+       <structfield>inactive_since</structfield> <type>timestamptz</type>
       </para>
       <para>
-        The time at which the slot became inactive.
+        The time since the slot has become inactive.
         <literal>NULL</literal> if the slot is currently being used.
       </para></entry>
      </row>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index bc70ff193e..401fb35947 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
             L.wal_status,
             L.safe_wal_size,
             L.two_phase,
-            L.last_inactive_time,
+            L.inactive_since,
             L.conflicting,
             L.invalidation_reason,
             L.failover,
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 45f7a28f7d..e440fbe871 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -409,7 +409,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
 	slot->candidate_restart_valid = InvalidXLogRecPtr;
 	slot->candidate_restart_lsn = InvalidXLogRecPtr;
 	slot->last_saved_confirmed_flush = InvalidXLogRecPtr;
-	slot->last_inactive_time = 0;
+	slot->inactive_since = 0;
 
 	/*
 	 * Create the slot on disk.  We haven't actually marked the slot allocated
@@ -623,9 +623,12 @@ retry:
 	if (SlotIsLogical(s))
 		pgstat_acquire_replslot(s);
 
-	/* Reset the last inactive time as the slot is active now. */
+	/*
+	 * Reset the time since the slot has become inactive as the slot is active
+	 * now.
+	 */
 	SpinLockAcquire(&s->mutex);
-	s->last_inactive_time = 0;
+	s->inactive_since = 0;
 	SpinLockRelease(&s->mutex);
 
 	if (am_walsender)
@@ -687,10 +690,10 @@ ReplicationSlotRelease(void)
 	}
 
 	/*
-	 * Set the last inactive time after marking the slot inactive. We don't
-	 * set it for the slots currently being synced from the primary to the
-	 * standby because such slots are typically inactive as decoding is not
-	 * allowed on those.
+	 * Set the time since the slot has become inactive after marking it
+	 * inactive. We don't set it for the slots currently being synced from the
+	 * primary to the standby because such slots are typically inactive as
+	 * decoding is not allowed on those.
 	 */
 	if (!(RecoveryInProgress() && slot->data.synced))
 		now = GetCurrentTimestamp();
@@ -703,14 +706,14 @@ ReplicationSlotRelease(void)
 		 */
 		SpinLockAcquire(&slot->mutex);
 		slot->active_pid = 0;
-		slot->last_inactive_time = now;
+		slot->inactive_since = now;
 		SpinLockRelease(&slot->mutex);
 		ConditionVariableBroadcast(&slot->active_cv);
 	}
 	else
 	{
 		SpinLockAcquire(&slot->mutex);
-		slot->last_inactive_time = now;
+		slot->inactive_since = now;
 		SpinLockRelease(&slot->mutex);
 	}
 
@@ -2366,16 +2369,16 @@ RestoreSlotFromDisk(const char *name)
 		slot->active_pid = 0;
 
 		/*
-		 * We set the last inactive time after loading the slot from the disk
-		 * into memory. Whoever acquires the slot i.e. makes the slot active
-		 * will reset it. We don't set it for the slots currently being synced
-		 * from the primary to the standby because such slots are typically
-		 * inactive as decoding is not allowed on those.
+		 * We set the time since the slot has become inactive after loading
+		 * the slot from the disk into memory. Whoever acquires the slot i.e.
+		 * makes the slot active will reset it. We don't set it for the slots
+		 * currently being synced from the primary to the standby because such
+		 * slots are typically inactive as decoding is not allowed on those.
 		 */
 		if (!(RecoveryInProgress() && slot->data.synced))
-			slot->last_inactive_time = GetCurrentTimestamp();
+			slot->inactive_since = GetCurrentTimestamp();
 		else
-			slot->last_inactive_time = 0;
+			slot->inactive_since = 0;
 
 		restored = true;
 		break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 24f5e6d90a..da57177c25 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -410,8 +410,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 
 		values[i++] = BoolGetDatum(slot_contents.data.two_phase);
 
-		if (slot_contents.last_inactive_time > 0)
-			values[i++] = TimestampTzGetDatum(slot_contents.last_inactive_time);
+		if (slot_contents.inactive_since > 0)
+			values[i++] = TimestampTzGetDatum(slot_contents.inactive_since);
 		else
 			nulls[i++] = true;
 
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index eefd7abd39..d032ce8d28 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -202,8 +202,8 @@ typedef struct ReplicationSlot
 	 */
 	XLogRecPtr	last_saved_confirmed_flush;
 
-	/* The time at which this slot becomes inactive */
-	TimestampTz last_inactive_time;
+	/* The time since the slot has become inactive */
+	TimestampTz inactive_since;
 } ReplicationSlot;
 
 #define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid)
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 3409cf88cd..3b9a306a8b 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -411,7 +411,7 @@ $node_primary3->stop;
 $node_standby3->stop;
 
 # =============================================================================
-# Testcase start: Check last_inactive_time property of the streaming standby's slot
+# Testcase start: Check inactive_since property of the streaming standby's slot
 #
 
 # Initialize primary node
@@ -440,45 +440,45 @@ $primary4->safe_psql(
     SELECT pg_create_physical_replication_slot(slot_name := '$sb4_slot');
 ]);
 
-# Get last_inactive_time value after the slot's creation. Note that the slot
-# is still inactive till it's used by the standby below.
-my $last_inactive_time =
-	capture_and_validate_slot_last_inactive_time($primary4, $sb4_slot, $slot_creation_time);
+# Get inactive_since value after the slot's creation. Note that the slot is
+# still inactive till it's used by the standby below.
+my $inactive_since =
+	capture_and_validate_slot_inactive_since($primary4, $sb4_slot, $slot_creation_time);
 
 $standby4->start;
 
 # Wait until standby has replayed enough data
 $primary4->wait_for_catchup($standby4);
 
-# Now the slot is active so last_inactive_time value must be NULL
+# Now the slot is active so inactive_since value must be NULL
 is( $primary4->safe_psql(
 		'postgres',
-		qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$sb4_slot';]
+		qq[SELECT inactive_since IS NULL FROM pg_replication_slots WHERE slot_name = '$sb4_slot';]
 	),
 	't',
 	'last inactive time for an active physical slot is NULL');
 
-# Stop the standby to check its last_inactive_time value is updated
+# Stop the standby to check its inactive_since value is updated
 $standby4->stop;
 
-# Let's restart the primary so that the last_inactive_time is set upon
-# loading the slot from the disk.
+# Let's restart the primary so that the inactive_since is set upon loading the
+# slot from the disk.
 $primary4->restart;
 
 is( $primary4->safe_psql(
 		'postgres',
-		qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND last_inactive_time IS NOT NULL;]
+		qq[SELECT inactive_since > '$inactive_since'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND inactive_since IS NOT NULL;]
 	),
 	't',
 	'last inactive time for an inactive physical slot is updated correctly');
 
 $standby4->stop;
 
-# Testcase end: Check last_inactive_time property of the streaming standby's slot
+# Testcase end: Check inactive_since property of the streaming standby's slot
 # =============================================================================
 
 # =============================================================================
-# Testcase start: Check last_inactive_time property of the logical subscriber's slot
+# Testcase start: Check inactive_since property of the logical subscriber's slot
 my $publisher4 = $primary4;
 
 # Create subscriber node
@@ -499,10 +499,10 @@ $publisher4->safe_psql('postgres',
 	"SELECT pg_create_logical_replication_slot(slot_name := '$lsub4_slot', plugin := 'pgoutput');"
 );
 
-# Get last_inactive_time value after the slot's creation. Note that the slot
-# is still inactive till it's used by the subscriber below.
-$last_inactive_time =
-	capture_and_validate_slot_last_inactive_time($publisher4, $lsub4_slot, $slot_creation_time);
+# Get inactive_since value after the slot's creation. Note that the slot is
+# still inactive till it's used by the subscriber below.
+$inactive_since =
+	capture_and_validate_slot_inactive_since($publisher4, $lsub4_slot, $slot_creation_time);
 
 $subscriber4->start;
 $subscriber4->safe_psql('postgres',
@@ -512,54 +512,54 @@ $subscriber4->safe_psql('postgres',
 # Wait until subscriber has caught up
 $subscriber4->wait_for_subscription_sync($publisher4, 'sub');
 
-# Now the slot is active so last_inactive_time value must be NULL
+# Now the slot is active so inactive_since value must be NULL
 is( $publisher4->safe_psql(
 		'postgres',
-		qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub4_slot';]
+		qq[SELECT inactive_since IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub4_slot';]
 	),
 	't',
 	'last inactive time for an active logical slot is NULL');
 
-# Stop the subscriber to check its last_inactive_time value is updated
+# Stop the subscriber to check its inactive_since value is updated
 $subscriber4->stop;
 
-# Let's restart the publisher so that the last_inactive_time is set upon
+# Let's restart the publisher so that the inactive_since is set upon
 # loading the slot from the disk.
 $publisher4->restart;
 
 is( $publisher4->safe_psql(
 		'postgres',
-		qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND last_inactive_time IS NOT NULL;]
+		qq[SELECT inactive_since > '$inactive_since'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND inactive_since IS NOT NULL;]
 	),
 	't',
 	'last inactive time for an inactive logical slot is updated correctly');
 
-# Testcase end: Check last_inactive_time property of the logical subscriber's slot
+# Testcase end: Check inactive_since property of the logical subscriber's slot
 # =============================================================================
 
 $publisher4->stop;
 $subscriber4->stop;
 
-# Capture and validate last_inactive_time of a given slot.
-sub capture_and_validate_slot_last_inactive_time
+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
 {
 	my ($node, $slot_name, $slot_creation_time) = @_;
 
-	my $last_inactive_time = $node->safe_psql('postgres',
-		qq(SELECT last_inactive_time FROM pg_replication_slots
-			WHERE slot_name = '$slot_name' AND last_inactive_time IS NOT NULL;)
+	my $inactive_since = $node->safe_psql('postgres',
+		qq(SELECT inactive_since FROM pg_replication_slots
+			WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
 		);
 
 	# Check that the captured time is sane
 	is( $node->safe_psql(
 			'postgres',
-			qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0) AND
-				'$last_inactive_time'::timestamptz >= '$slot_creation_time'::timestamptz;]
+			qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+				'$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
 		),
 		't',
 		"last inactive time for an active slot $slot_name is sane");
 
-	return $last_inactive_time;
+	return $inactive_since;
 }
 
 done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index dfcbaec387..f53c3036a6 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,12 +1473,12 @@ pg_replication_slots| SELECT l.slot_name,
     l.wal_status,
     l.safe_wal_size,
     l.two_phase,
-    l.last_inactive_time,
+    l.inactive_since,
     l.conflicting,
     l.invalidation_reason,
     l.failover,
     l.synced
-   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, last_inactive_time, conflicting, invalidation_reason, failover, synced)
+   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, inactive_since, conflicting, invalidation_reason, failover, synced)
      LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
 pg_roles| SELECT pg_authid.rolname,
     pg_authid.rolsuper,
-- 
2.34.1

#16Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#12)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Mon, Mar 25, 2024 at 9:55 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Mar 25, 2024 at 12:12 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Would "released_time" sounds better? (at the end this is exactly what it does
represent unless for the case where it is restored from disk for which the meaning
would still makes sense to me though). It seems to me that released_time does not
lead to any expectation then removing any confusion.

Yeah, that's not bad. I mean, I don't agree that released_time doesn't
lead to any expectation, but what it leads me to expect is that you're
going to tell me the time at which the slot was released. So if it's
currently active, then I see NULL, because it's not released; but if
it's inactive, then I see the time at which it became so.

In the same vein, I think deactivated_at or inactive_since might be
good names to consider. I think they get at the same thing as
released_time, but they avoid introducing a completely new word
(release, as opposed to active/inactive).

We have a consensus on inactive_since, so I'll make that change. I
would also like to solicit your opinion on the other slot-level
parameter we are planning to introduce. This new slot-level parameter
will be named inactive_timeout. This will indicate that once the
slot is inactive for the inactive_timeout period, we will invalidate
the slot. We are also discussing having this parameter
(inactive_timeout) as a GUC [1]. We can have this new parameter both
at the slot level and as a GUC, or just one of those.
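
For illustration only (neither the parameter name nor any syntax is settled
yet, so treat this as a hypothetical sketch):

    -- hypothetical GUC form of the setting; the name is not decided
    -- ALTER SYSTEM SET replication_slot_inactive_timeout = '1d';

    -- the condition such a timeout would encode, expressed as a query
    SELECT slot_name
    FROM pg_replication_slots
    WHERE inactive_since IS NOT NULL
      AND now() - inactive_since > interval '1 day';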

[1]: /messages/by-id/20240325195443.GA2923888@nathanxps13

--
With Regards,
Amit Kapila.

#17shveta malik
shveta.malik@gmail.com
In reply to: Bertrand Drouvot (#11)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Mon, Mar 25, 2024 at 9:54 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Hi,

On Mon, Mar 25, 2024 at 07:32:11PM +0530, Amit Kapila wrote:

On Mon, Mar 25, 2024 at 6:57 PM Robert Haas <robertmhaas@gmail.com> wrote:

And I'm suspicious that having an exception for slots being synced is
a bad idea. That makes too much of a judgement about how the user will
use this field. It's usually better to just expose the data, and if
the user needs help to make sense of that data, then give them that
help separately.

The reason we didn't set this for sync slots is that they won't be
usable (one can't use them to decode WAL) unless standby is promoted
[2]. But I see your point as well. So, I have copied the others
involved in this discussion to see what they think.

Yeah I also see Robert's point. If we also sync the "last inactive time" field then
we would need to take care of the corner case mentioned by Shveta in [1] during
promotion.

I have suggested one potential solution for that in [1]. Please have a look.

[1]: /messages/by-id/CAJpy0uB-yE+Riw7JQ4hW0+igJxvPc+rq+9c7WyTa1Jz7+2gAiA@mail.gmail.com

thanks
Shveta

#18shveta malik
shveta.malik@gmail.com
In reply to: Bharath Rupireddy (#15)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Tue, Mar 26, 2024 at 1:50 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Tue, Mar 26, 2024 at 1:30 AM Nathan Bossart <nathandbossart@gmail.com> wrote:

On Mon, Mar 25, 2024 at 04:49:12PM +0000, Bertrand Drouvot wrote:

On Mon, Mar 25, 2024 at 12:25:37PM -0400, Robert Haas wrote:

In the same vein, I think deactivated_at or inactive_since might be
good names to consider. I think they get at the same thing as
released_time, but they avoid introducing a completely new word
(release, as opposed to active/inactive).

Yeah, I'd vote for inactive_since then.

Having only skimmed some of the related discussions, I'm inclined to agree
that inactive_since provides the clearest description for the column.

I think we all have some agreement on inactive_since. So, I'm
attaching the patch for that change.

pg_proc.dat needs to be changed to refer to 'inactive_since' instead
of 'last_inactive_time' in the attached patch.

thanks
Shveta

#19Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: shveta malik (#17)
1 attachment(s)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Tue, Mar 26, 2024 at 9:38 AM shveta malik <shveta.malik@gmail.com> wrote:

On Mon, Mar 25, 2024 at 9:54 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Hi,

On Mon, Mar 25, 2024 at 07:32:11PM +0530, Amit Kapila wrote:

On Mon, Mar 25, 2024 at 6:57 PM Robert Haas <robertmhaas@gmail.com> wrote:

And I'm suspicious that having an exception for slots being synced is
a bad idea. That makes too much of a judgement about how the user will
use this field. It's usually better to just expose the data, and if
the user needs help to make sense of that data, then give them that
help separately.

The reason we didn't set this for sync slots is that they won't be
usable (one can't use them to decode WAL) unless standby is promoted
[2]. But I see your point as well. So, I have copied the others
involved in this discussion to see what they think.

Yeah I also see Robert's point. If we also sync the "last inactive time" field then
we would need to take care of the corner case mentioned by Shveta in [1] during
promotion.

I have suggested one potential solution for that in [1]. Please have a look.

[1]: /messages/by-id/CAJpy0uB-yE+Riw7JQ4hW0+igJxvPc+rq+9c7WyTa1Jz7+2gAiA@mail.gmail.com

I posted the v21 patch implementing the above idea in the other thread
- /messages/by-id/CALj2ACXRFx9g7A9RFJZF7eBe=zxk7=apMRFuCgJJKYB7O=vgwg@mail.gmail.com.
For ease, I'm also attaching the patch here.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v21-0001-Fix-review-comments-for-slot-s-last_inactive_tim.patch (application/octet-stream)
From fc9e61195b19768eacc856f72c185323483d2187 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 26 Mar 2024 05:33:39 +0000
Subject: [PATCH v21] Fix review comments for slot's last_inactive_time
 property

This commit addresses review comments received for the slot's
last_inactive_time property added by commit a11f330b55. It does
the following:

1. The name last_inactive_time seems confusing. With that, one
expects it to tell the last time that the slot was inactive. But,
it tells the last time that a currently-inactive slot previously
*WAS* active.

This commit uses a less confusing name inactive_since for the
property. Other names considered were released_time,
deactivated_at but inactive_since won the race since the word
inactive is predominant as far as the replication slots are
concerned.

2. The slot's last_inactive_time isn't currently maintained for
synced slots on the standby. The commit a11f330b55 prevents
updating last_inactive_time with RecoveryInProgress() check in
RestoreSlotFromDisk(). But, the issue is that RecoveryInProgress()
always returns true in RestoreSlotFromDisk() as
'xlogctl->SharedRecoveryState' is always 'RECOVERY_STATE_CRASH' at
that time. The impact of this on a promoted standby
last_inactive_at is always NULL for all synced slots even after
server restart.

The above issue led us to the question of why we can't maintain
last_inactive_time for synced slots on the standby. There's a
use-case for having it, as it can tell the last synced time on
the standby, apart from fixing the above issue. So, this commit does
two things: a) maintains last_inactive_time for such slots,
b) ensures the value is set to current timestamp during the
shutdown to help correctly interpret the time if the standby gets
promoted without a restart.

Reported-by: Robert Haas, Shveta Malik
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/ZgGrCBQoktdLi1Ir%40ip-10-97-1-34.eu-west-3.compute.internal
Discussion: https://www.postgresql.org/message-id/CAJpy0uB-yE%2BRiw7JQ4hW0%2BigJxvPc%2Brq%2B9c7WyTa1Jz7%2B2gAiA%40mail.gmail.com
---
 doc/src/sgml/system-views.sgml                |  4 +-
 src/backend/catalog/system_views.sql          |  2 +-
 src/backend/replication/logical/slotsync.c    | 43 +++++++++++++
 src/backend/replication/slot.c                | 38 +++++-------
 src/backend/replication/slotfuncs.c           |  4 +-
 src/include/catalog/pg_proc.dat               |  2 +-
 src/include/replication/slot.h                |  4 +-
 src/test/recovery/t/019_replslot_limit.pl     | 62 +++++++++----------
 .../t/040_standby_failover_slots_sync.pl      | 34 ++++++++++
 src/test/regress/expected/rules.out           |  4 +-
 10 files changed, 135 insertions(+), 62 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 5f4165a945..19a08ca0ac 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,10 +2525,10 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>last_inactive_time</structfield> <type>timestamptz</type>
+       <structfield>inactive_since</structfield> <type>timestamptz</type>
       </para>
       <para>
-        The time at which the slot became inactive.
+        The time since the slot has become inactive.
         <literal>NULL</literal> if the slot is currently being used.
       </para></entry>
      </row>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index bc70ff193e..401fb35947 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
             L.wal_status,
             L.safe_wal_size,
             L.two_phase,
-            L.last_inactive_time,
+            L.inactive_since,
             L.conflicting,
             L.invalidation_reason,
             L.failover,
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 30480960c5..cfb5affeaa 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -140,6 +140,7 @@ typedef struct RemoteSlot
 } RemoteSlot;
 
 static void slotsync_failure_callback(int code, Datum arg);
+static void reset_synced_slots_info(void);
 
 /*
  * If necessary, update the local synced slot's metadata based on the data
@@ -1296,6 +1297,45 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
 	Assert(false);
 }
 
+/*
+ * Reset the synced slots info such as inactive_since after shutting
+ * down the slot sync machinery.
+ */
+static void
+reset_synced_slots_info(void)
+{
+	LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+	for (int i = 0; i < max_replication_slots; i++)
+	{
+		ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+		/* Check if it is a synchronized slot */
+		if (s->in_use && s->data.synced)
+		{
+			TimestampTz now;
+
+			Assert(SlotIsLogical(s));
+			Assert(s->active_pid == 0);
+
+			/*
+			 * Set the time since the slot has become inactive after shutting
+			 * down slot sync machinery. This helps correctly interpret the
+			 * time if the standby gets promoted without a restart. We get the
+			 * current time beforehand to avoid a system call while holding
+			 * the lock.
+			 */
+			now = GetCurrentTimestamp();
+
+			SpinLockAcquire(&s->mutex);
+			s->inactive_since = now;
+			SpinLockRelease(&s->mutex);
+		}
+	}
+
+	LWLockRelease(ReplicationSlotControlLock);
+}
+
 /*
  * Shut down the slot sync worker.
  */
@@ -1309,6 +1349,7 @@ ShutDownSlotSync(void)
 	if (SlotSyncCtx->pid == InvalidPid)
 	{
 		SpinLockRelease(&SlotSyncCtx->mutex);
+		reset_synced_slots_info();
 		return;
 	}
 	SpinLockRelease(&SlotSyncCtx->mutex);
@@ -1341,6 +1382,8 @@ ShutDownSlotSync(void)
 	}
 
 	SpinLockRelease(&SlotSyncCtx->mutex);
+
+	reset_synced_slots_info();
 }
 
 /*
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 45f7a28f7d..860c7fbeb0 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -409,7 +409,7 @@ ReplicationSlotCreate(const char *name, bool db_specific,
 	slot->candidate_restart_valid = InvalidXLogRecPtr;
 	slot->candidate_restart_lsn = InvalidXLogRecPtr;
 	slot->last_saved_confirmed_flush = InvalidXLogRecPtr;
-	slot->last_inactive_time = 0;
+	slot->inactive_since = 0;
 
 	/*
 	 * Create the slot on disk.  We haven't actually marked the slot allocated
@@ -623,9 +623,12 @@ retry:
 	if (SlotIsLogical(s))
 		pgstat_acquire_replslot(s);
 
-	/* Reset the last inactive time as the slot is active now. */
+	/*
+	 * Reset the time since the slot has become inactive as the slot is active
+	 * now.
+	 */
 	SpinLockAcquire(&s->mutex);
-	s->last_inactive_time = 0;
+	s->inactive_since = 0;
 	SpinLockRelease(&s->mutex);
 
 	if (am_walsender)
@@ -651,7 +654,7 @@ ReplicationSlotRelease(void)
 	ReplicationSlot *slot = MyReplicationSlot;
 	char	   *slotname = NULL;	/* keep compiler quiet */
 	bool		is_logical = false; /* keep compiler quiet */
-	TimestampTz now = 0;
+	TimestampTz now;
 
 	Assert(slot != NULL && slot->active_pid != 0);
 
@@ -687,13 +690,11 @@ ReplicationSlotRelease(void)
 	}
 
 	/*
-	 * Set the last inactive time after marking the slot inactive. We don't
-	 * set it for the slots currently being synced from the primary to the
-	 * standby because such slots are typically inactive as decoding is not
-	 * allowed on those.
+	 * Set the time since the slot has become inactive after marking it
+	 * inactive. We get the current time beforehand to avoid a system call
+	 * while holding the lock.
 	 */
-	if (!(RecoveryInProgress() && slot->data.synced))
-		now = GetCurrentTimestamp();
+	now = GetCurrentTimestamp();
 
 	if (slot->data.persistency == RS_PERSISTENT)
 	{
@@ -703,14 +704,14 @@ ReplicationSlotRelease(void)
 		 */
 		SpinLockAcquire(&slot->mutex);
 		slot->active_pid = 0;
-		slot->last_inactive_time = now;
+		slot->inactive_since = now;
 		SpinLockRelease(&slot->mutex);
 		ConditionVariableBroadcast(&slot->active_cv);
 	}
 	else
 	{
 		SpinLockAcquire(&slot->mutex);
-		slot->last_inactive_time = now;
+		slot->inactive_since = now;
 		SpinLockRelease(&slot->mutex);
 	}
 
@@ -2366,16 +2367,11 @@ RestoreSlotFromDisk(const char *name)
 		slot->active_pid = 0;
 
 		/*
-		 * We set the last inactive time after loading the slot from the disk
-		 * into memory. Whoever acquires the slot i.e. makes the slot active
-		 * will reset it. We don't set it for the slots currently being synced
-		 * from the primary to the standby because such slots are typically
-		 * inactive as decoding is not allowed on those.
+		 * We set the time since the slot became inactive after loading the
+		 * slot from the disk into memory. Whoever acquires the slot, i.e.,
+		 * makes it active, will reset it.
 		 */
-		if (!(RecoveryInProgress() && slot->data.synced))
-			slot->last_inactive_time = GetCurrentTimestamp();
-		else
-			slot->last_inactive_time = 0;
+		slot->inactive_since = GetCurrentTimestamp();
 
 		restored = true;
 		break;
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 24f5e6d90a..da57177c25 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -410,8 +410,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 
 		values[i++] = BoolGetDatum(slot_contents.data.two_phase);
 
-		if (slot_contents.last_inactive_time > 0)
-			values[i++] = TimestampTzGetDatum(slot_contents.last_inactive_time);
+		if (slot_contents.inactive_since > 0)
+			values[i++] = TimestampTzGetDatum(slot_contents.inactive_since);
 		else
 			nulls[i++] = true;
 
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 0d26e5b422..2f7cfc02c6 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11135,7 +11135,7 @@
   proargtypes => '',
   proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,timestamptz,bool,text,bool,bool}',
   proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
-  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,last_inactive_time,conflicting,invalidation_reason,failover,synced}',
+  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,inactive_since,conflicting,invalidation_reason,failover,synced}',
   prosrc => 'pg_get_replication_slots' },
 { oid => '3786', descr => 'set up a logical replication slot',
   proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index eefd7abd39..d032ce8d28 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -202,8 +202,8 @@ typedef struct ReplicationSlot
 	 */
 	XLogRecPtr	last_saved_confirmed_flush;
 
-	/* The time at which this slot becomes inactive */
-	TimestampTz last_inactive_time;
+	/* The time since the slot became inactive */
+	TimestampTz inactive_since;
 } ReplicationSlot;
 
 #define SlotIsPhysical(slot) ((slot)->data.database == InvalidOid)
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 3409cf88cd..3b9a306a8b 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -411,7 +411,7 @@ $node_primary3->stop;
 $node_standby3->stop;
 
 # =============================================================================
-# Testcase start: Check last_inactive_time property of the streaming standby's slot
+# Testcase start: Check inactive_since property of the streaming standby's slot
 #
 
 # Initialize primary node
@@ -440,45 +440,45 @@ $primary4->safe_psql(
     SELECT pg_create_physical_replication_slot(slot_name := '$sb4_slot');
 ]);
 
-# Get last_inactive_time value after the slot's creation. Note that the slot
-# is still inactive till it's used by the standby below.
-my $last_inactive_time =
-	capture_and_validate_slot_last_inactive_time($primary4, $sb4_slot, $slot_creation_time);
+# Get inactive_since value after the slot's creation. Note that the slot is
+# still inactive till it's used by the standby below.
+my $inactive_since =
+	capture_and_validate_slot_inactive_since($primary4, $sb4_slot, $slot_creation_time);
 
 $standby4->start;
 
 # Wait until standby has replayed enough data
 $primary4->wait_for_catchup($standby4);
 
-# Now the slot is active so last_inactive_time value must be NULL
+# Now the slot is active so inactive_since value must be NULL
 is( $primary4->safe_psql(
 		'postgres',
-		qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$sb4_slot';]
+		qq[SELECT inactive_since IS NULL FROM pg_replication_slots WHERE slot_name = '$sb4_slot';]
 	),
 	't',
 	'last inactive time for an active physical slot is NULL');
 
-# Stop the standby to check its last_inactive_time value is updated
+# Stop the standby to check its inactive_since value is updated
 $standby4->stop;
 
-# Let's restart the primary so that the last_inactive_time is set upon
-# loading the slot from the disk.
+# Let's restart the primary so that the inactive_since is set upon loading the
+# slot from the disk.
 $primary4->restart;
 
 is( $primary4->safe_psql(
 		'postgres',
-		qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND last_inactive_time IS NOT NULL;]
+		qq[SELECT inactive_since > '$inactive_since'::timestamptz FROM pg_replication_slots WHERE slot_name = '$sb4_slot' AND inactive_since IS NOT NULL;]
 	),
 	't',
 	'last inactive time for an inactive physical slot is updated correctly');
 
 $standby4->stop;
 
-# Testcase end: Check last_inactive_time property of the streaming standby's slot
+# Testcase end: Check inactive_since property of the streaming standby's slot
 # =============================================================================
 
 # =============================================================================
-# Testcase start: Check last_inactive_time property of the logical subscriber's slot
+# Testcase start: Check inactive_since property of the logical subscriber's slot
 my $publisher4 = $primary4;
 
 # Create subscriber node
@@ -499,10 +499,10 @@ $publisher4->safe_psql('postgres',
 	"SELECT pg_create_logical_replication_slot(slot_name := '$lsub4_slot', plugin := 'pgoutput');"
 );
 
-# Get last_inactive_time value after the slot's creation. Note that the slot
-# is still inactive till it's used by the subscriber below.
-$last_inactive_time =
-	capture_and_validate_slot_last_inactive_time($publisher4, $lsub4_slot, $slot_creation_time);
+# Get inactive_since value after the slot's creation. Note that the slot is
+# still inactive till it's used by the subscriber below.
+$inactive_since =
+	capture_and_validate_slot_inactive_since($publisher4, $lsub4_slot, $slot_creation_time);
 
 $subscriber4->start;
 $subscriber4->safe_psql('postgres',
@@ -512,54 +512,54 @@ $subscriber4->safe_psql('postgres',
 # Wait until subscriber has caught up
 $subscriber4->wait_for_subscription_sync($publisher4, 'sub');
 
-# Now the slot is active so last_inactive_time value must be NULL
+# Now the slot is active so inactive_since value must be NULL
 is( $publisher4->safe_psql(
 		'postgres',
-		qq[SELECT last_inactive_time IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub4_slot';]
+		qq[SELECT inactive_since IS NULL FROM pg_replication_slots WHERE slot_name = '$lsub4_slot';]
 	),
 	't',
 	'last inactive time for an active logical slot is NULL');
 
-# Stop the subscriber to check its last_inactive_time value is updated
+# Stop the subscriber to check its inactive_since value is updated
 $subscriber4->stop;
 
-# Let's restart the publisher so that the last_inactive_time is set upon
+# Let's restart the publisher so that the inactive_since is set upon
 # loading the slot from the disk.
 $publisher4->restart;
 
 is( $publisher4->safe_psql(
 		'postgres',
-		qq[SELECT last_inactive_time > '$last_inactive_time'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND last_inactive_time IS NOT NULL;]
+		qq[SELECT inactive_since > '$inactive_since'::timestamptz FROM pg_replication_slots WHERE slot_name = '$lsub4_slot' AND inactive_since IS NOT NULL;]
 	),
 	't',
 	'last inactive time for an inactive logical slot is updated correctly');
 
-# Testcase end: Check last_inactive_time property of the logical subscriber's slot
+# Testcase end: Check inactive_since property of the logical subscriber's slot
 # =============================================================================
 
 $publisher4->stop;
 $subscriber4->stop;
 
-# Capture and validate last_inactive_time of a given slot.
-sub capture_and_validate_slot_last_inactive_time
+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
 {
 	my ($node, $slot_name, $slot_creation_time) = @_;
 
-	my $last_inactive_time = $node->safe_psql('postgres',
-		qq(SELECT last_inactive_time FROM pg_replication_slots
-			WHERE slot_name = '$slot_name' AND last_inactive_time IS NOT NULL;)
+	my $inactive_since = $node->safe_psql('postgres',
+		qq(SELECT inactive_since FROM pg_replication_slots
+			WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
 		);
 
 	# Check that the captured time is sane
 	is( $node->safe_psql(
 			'postgres',
-			qq[SELECT '$last_inactive_time'::timestamptz > to_timestamp(0) AND
-				'$last_inactive_time'::timestamptz >= '$slot_creation_time'::timestamptz;]
+			qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+				'$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
 		),
 		't',
 		"last inactive time for an active slot $slot_name is sane");
 
-	return $last_inactive_time;
+	return $inactive_since;
 }
 
 done_testing();
diff --git a/src/test/recovery/t/040_standby_failover_slots_sync.pl b/src/test/recovery/t/040_standby_failover_slots_sync.pl
index f47bfd78eb..e64308cbf1 100644
--- a/src/test/recovery/t/040_standby_failover_slots_sync.pl
+++ b/src/test/recovery/t/040_standby_failover_slots_sync.pl
@@ -178,6 +178,13 @@ $primary->poll_query_until(
 # the subscriber.
 $primary->wait_for_replay_catchup($standby1);
 
+# Capture the time before which the logical failover slots are synced/created
+# on the standby.
+my $slots_creation_time = $standby1->safe_psql(
+	'postgres', qq[
+    SELECT current_timestamp;
+]);
+
 # Synchronize the primary server slots to the standby.
 $standby1->safe_psql('postgres', "SELECT pg_sync_replication_slots();");
 
@@ -190,6 +197,11 @@ is( $standby1->safe_psql(
 	"t",
 	'logical slots have synced as true on standby');
 
+# Confirm that the logical failover slots synced on the standby have a valid
+# inactive_since value representing the last slot sync time.
+capture_and_validate_slot_inactive_since($standby1, 'lsub1_slot', $slots_creation_time);
+capture_and_validate_slot_inactive_since($standby1, 'lsub2_slot', $slots_creation_time);
+
 ##################################################
 # Test that the synchronized slot will be dropped if the corresponding remote
 # slot on the primary server has been dropped.
@@ -773,4 +785,26 @@ is( $subscriber1->safe_psql('postgres', q{SELECT count(*) FROM tab_int;}),
 	"20",
 	'data replicated from the new primary');
 
+# Capture and validate inactive_since of a given slot.
+sub capture_and_validate_slot_inactive_since
+{
+	my ($node, $slot_name, $slot_creation_time) = @_;
+
+	my $inactive_since = $node->safe_psql('postgres',
+		qq(SELECT inactive_since FROM pg_replication_slots
+			WHERE slot_name = '$slot_name' AND inactive_since IS NOT NULL;)
+		);
+
+	# Check that the captured time is sane
+	is( $node->safe_psql(
+			'postgres',
+			qq[SELECT '$inactive_since'::timestamptz > to_timestamp(0) AND
+				'$inactive_since'::timestamptz >= '$slot_creation_time'::timestamptz;]
+		),
+		't',
+		"last inactive time for a synced slot $slot_name is sane");
+
+	return $inactive_since;
+}
+
 done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index dfcbaec387..f53c3036a6 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,12 +1473,12 @@ pg_replication_slots| SELECT l.slot_name,
     l.wal_status,
     l.safe_wal_size,
     l.two_phase,
-    l.last_inactive_time,
+    l.inactive_since,
     l.conflicting,
     l.invalidation_reason,
     l.failover,
     l.synced
-   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, last_inactive_time, conflicting, invalidation_reason, failover, synced)
+   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, inactive_since, conflicting, invalidation_reason, failover, synced)
      LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
 pg_roles| SELECT pg_authid.rolname,
     pg_authid.rolsuper,
-- 
2.34.1

#20Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Amit Kapila (#16)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On 2024-Mar-26, Amit Kapila wrote:

We have a consensus on inactive_since, so I'll make that change.

Sounds reasonable. So this is a timestamptz if the slot is inactive,
NULL if active, right? What value is it going to have for sync slots?

I would also like to solicit your opinion on the other slot-level
parameter we are planning to introduce. This new slot-level parameter
will be named inactive_timeout.

Maybe inactivity_timeout?

This will indicate that once the slot is inactive for the
inactive_timeout period, we will invalidate the slot. We are also
discussing having this parameter (inactive_timeout) as a GUC [1]. We
can have this new parameter both at the slot level and as a GUC, or
just one of those.

replication_slot_inactivity_timeout?

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Tom: There seems to be something broken here.
Teodor: I'm in sackcloth and ashes... Fixed.
/messages/by-id/482D1632.8010507@sigaev.ru

#21Amit Kapila
amit.kapila16@gmail.com
In reply to: Alvaro Herrera (#20)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Tue, Mar 26, 2024 at 1:09 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2024-Mar-26, Amit Kapila wrote:

We have a consensus on inactive_since, so I'll make that change.

Sounds reasonable. So this is a timestamptz if the slot is inactive,
NULL if active, right?

Yes.

What value is it going to have for sync slots?

The behavior will be the same as for non-sync slots. In each sync cycle,
we acquire/release the sync slots. So at the time of release,
inactive_since will be updated. See email [1].
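
For illustration only (this query is not from the patch; it just reads
the renamed column), one could watch this from SQL, where a NULL
inactive_since means the slot is currently active:

    SELECT slot_name,
           active,
           inactive_since,
           now() - inactive_since AS inactive_for
    FROM pg_replication_slots
    ORDER BY inactive_since NULLS LAST;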

I would also like to solicit your opinion on the other slot-level
parameter we are planning to introduce. This new slot-level parameter
will be named inactive_timeout.

Maybe inactivity_timeout?

This will indicate that once the slot is inactive for the
inactive_timeout period, we will invalidate the slot. We are also
discussing having this parameter (inactive_timeout) as a GUC [1]. We
can have this new parameter both at the slot level and as a GUC, or
just one of those.

replication_slot_inactivity_timeout?

So, it seems you are okay with having this parameter both at the slot
level and as a GUC. About names, let us see what others think.

Thanks for the input on the names.

[1]: /messages/by-id/CAA4eK1KrPGwfZV9LYGidjxHeW+rxJ=E2ThjXvwRGLO=iLNuo=Q@mail.gmail.com

--
With Regards,
Amit Kapila.

#22Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Amit Kapila (#21)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On 2024-Mar-26, Amit Kapila wrote:

On Tue, Mar 26, 2024 at 1:09 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2024-Mar-26, Amit Kapila wrote:

I would also like to solicit your opinion on the other slot-level
parameter we are planning to introduce. This new slot-level parameter
will be named inactive_timeout.

Maybe inactivity_timeout?

This will indicate that once the slot is inactive for the
inactive_timeout period, we will invalidate the slot. We are also
discussing having this parameter (inactive_timeout) as a GUC [1]. We
can have this new parameter both at the slot level and as a GUC, or
just one of those.

replication_slot_inactivity_timeout?

So, it seems you are okay with having this parameter both at the slot
level and as a GUC.

Well, I think a GUC is good to have regardless of the slot parameter,
because the GUC can be used as an instance-wide protection against going
out of disk space because of broken replication. However, now that I
think about it, I'm not really sure about invalidating a slot based on
time rather than on disk space, for which we already have a parameter; what's
your rationale for that? The passage of time is not a very good
measure, really, because the amount of WAL being protected has wildly
varying production rate across time.

I can only see a timeout being useful as a parameter if its default
value is not the special disable value; say, the default timeout is 3
days (to be more precise -- the period from Friday to Monday, that is,
between DBA leaving the office one week until discovering a problem when
he returns early next week). This way we have a built-in mechanism that
invalidates slots regardless of how big the WAL partition is.

I'm less sure about the slot parameter; in what situation do you need to
extend the life of one individual slot further than the life of all the
other slots? (Of course, it makes no sense to set the per-slot param to
a shorter period than the GUC: invalidating one slot ahead of the others
is completely pointless.)

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Bob [Floyd] used to say that he was planning to get a Ph.D. by the "green
stamp method," namely by saving envelopes addressed to him as 'Dr. Floyd'.
After collecting 500 such letters, he mused, a university somewhere in
Arizona would probably grant him a degree. (Don Knuth)

#23Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Amit Kapila (#21)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

Hi,

On Tue, Mar 26, 2024 at 01:45:23PM +0530, Amit Kapila wrote:

On Tue, Mar 26, 2024 at 1:09 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2024-Mar-26, Amit Kapila wrote:

We have a consensus on inactive_since, so I'll make that change.

Sounds reasonable. So this is a timestamptz if the slot is inactive,
NULL if active, right?

Yes.

What value is it going to have for sync slots?

The behavior will be the same as for non-sync slots. In each sync cycle,
we acquire/release the sync slots. So at the time of release,
inactive_since will be updated. See email [1].

I don't think we should set inactive_since to the current time at each sync cycle,
see [1] as to why. What do you think?

[1]: /messages/by-id/ZgKGIDC5lttWTdJH@ip-10-97-1-34.eu-west-3.compute.internal

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#24Amit Kapila
amit.kapila16@gmail.com
In reply to: Alvaro Herrera (#22)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Tue, Mar 26, 2024 at 2:11 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2024-Mar-26, Amit Kapila wrote:

On Tue, Mar 26, 2024 at 1:09 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2024-Mar-26, Amit Kapila wrote:

I would also like to solicit your opinion on the other slot-level
parameter we are planning to introduce. This new slot-level parameter
will be named inactive_timeout.

Maybe inactivity_timeout?

This will indicate that once the slot is inactive for the
inactive_timeout period, we will invalidate the slot. We are also
discussing having this parameter (inactive_timeout) as a GUC [1]. We
can have this new parameter both at the slot level and as a GUC, or
just one of those.

replication_slot_inactivity_timeout?

So, it seems you are okay to have this parameter both at slot level
and as a GUC.

Well, I think a GUC is good to have regardless of the slot parameter,
because the GUC can be used as an instance-wide protection against going
out of disk space because of broken replication. However, now that I
think about it, I'm not really sure about invalidating a slot based on
time rather than on disk space, for which we already have a parameter; what's
your rationale for that? The passage of time is not a very good
measure, really, because the amount of WAL being protected has wildly
varying production rate across time.

The inactive slot not only blocks WAL from being removed but prevents
vacuum from proceeding. Also, there is a risk of transaction ID
wraparound. See email [1] for more context.
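
As a rough illustration (this query is not something proposed in the
thread), the resources pinned by an inactive slot can be eyeballed
like this:

    -- Inactive slots ordered by how much WAL they currently retain.
    SELECT slot_name,
           inactive_since,
           pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(),
                                          restart_lsn)) AS retained_wal
    FROM pg_replication_slots
    WHERE NOT active
    ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;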

I can only see a timeout being useful as a parameter if its default
value is not the special disable value; say, the default timeout is 3
days (to be more precise -- the period from Friday to Monday, that is,
between the DBA leaving the office at the end of one week and discovering a problem when
he returns early next week). This way we have a built-in mechanism that
invalidates slots regardless of how big the WAL partition is.

We can have a default value for this parameter, but it has the
potential to break replication, so I am not sure what a good default
value would be.

I'm less sure about the slot parameter; in what situation do you need to
extend the life of one individual slot further than the life of all the
other slots?

I was thinking of an idle slot scenario where a slot from one
particular subscriber (or output plugin) is inactive due to some
maintenance activity. But it should be okay to have a GUC for this for
now.

[1]: /messages/by-id/20240325195443.GA2923888@nathanxps13

--
With Regards,
Amit Kapila.

#25Nathan Bossart
nathandbossart@gmail.com
In reply to: Amit Kapila (#24)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Tue, Mar 26, 2024 at 03:44:29PM +0530, Amit Kapila wrote:

On Tue, Mar 26, 2024 at 2:11 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

Well, I think a GUC is good to have regardless of the slot parameter,
because the GUC can be used as an instance-wide protection against going
out of disk space because of broken replication. However, now that I
think about it, I'm not really sure about invalidating a slot based on
time rather than on disk space, for which we already have a parameter; what's
your rationale for that? The passage of time is not a very good
measure, really, because the amount of WAL being protected has wildly
varying production rate across time.

The inactive slot not only blocks WAL from being removed but prevents
the vacuum from proceeding. Also, there is a risk of transaction Id
wraparound. See email [1] for more context.

FWIW I'd really prefer to have something like max_slot_xid_age for this. A
time-based parameter would likely help with most cases, but transaction ID
usage will vary widely from server to server, so it'd be nice to have
something to protect against wraparound more directly. I don't object to a
time-based setting as well, but that won't always work as well for this
particular use-case, especially if we are relying on users to set a
slot-level parameter.
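
Until something like that exists, a manual backstop is possible. Just
as a sketch (the 1.5-billion threshold is only an example), one could
run this periodically and alert on slots whose xmin horizon has grown
too old:

    SELECT slot_name
    FROM pg_replication_slots
    WHERE GREATEST(age(xmin), age(catalog_xmin)) > 1500000000;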

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#26Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Nathan Bossart (#25)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On 2024-Mar-26, Nathan Bossart wrote:

FWIW I'd really prefer to have something like max_slot_xid_age for this. A
time-based parameter would likely help with most cases, but transaction ID
usage will vary widely from server to server, so it'd be nice to have
something to protect against wraparound more directly.

Yeah, I tend to agree that an XID-based limit makes more sense than a
time-based one.

I don't object to a
time-based setting as well, but that won't always work as well for this
particular use-case, especially if we are relying on users to set a
slot-level parameter.

I think slot-level parameters are mostly useless, because it takes just
one slot where you forget to set it for disaster to strike.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/

#27Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Alvaro Herrera (#26)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

Hi,

On Tue, Mar 26, 2024 at 04:39:55PM +0100, Alvaro Herrera wrote:

On 2024-Mar-26, Nathan Bossart wrote:

I don't object to a
time-based setting as well, but that won't always work as well for this
particular use-case, especially if we are relying on users to set a
slot-level parameter.

I think slot-level parameters are mostly useless, because it takes just
one slot where you forget to set it for disaster to strike.

I think that's a fair point. So maybe we should focus on having a GUC first and
later rethink having (or not) a slot-based one (in addition to the GUC).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#28Amit Kapila
amit.kapila16@gmail.com
In reply to: Alvaro Herrera (#26)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Tue, Mar 26, 2024 at 9:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2024-Mar-26, Nathan Bossart wrote:

FWIW I'd really prefer to have something like max_slot_xid_age for this. A
time-based parameter would likely help with most cases, but transaction ID
usage will vary widely from server to server, so it'd be nice to have
something to protect against wraparound more directly.

Yeah, I tend to agree that an XID-based limit makes more sense than a
time-based one.

So, do we want the time-based parameter or just max_slot_xid_age,
considering both will be GUCs? Yes, the xid_age-based parameter sounds
directly helpful for transaction ID wraparound and dead row cleanup.
OTOH, having a lot of inactive slots (which is possible in use cases
where a tool creates a lot of slots and forgets to remove them, or the
tool exits without cleaning up slots, say due to a server shutdown)
also prevents removing dead space, which is not nice either.

The one example that comes to mind is pg_createsubscriber (committed
for PG17), which creates one slot per database to convert a standby to
a subscriber. Now, say it exits due to a power shutdown; then there
could be a lot of dangling slots on the primary server. Also, say
there is some bug in such a tool that doesn't allow proper cleanup of
slots; the same problem can happen. Yes, this would be a problem of
the tool, but I think there is no harm in giving a way to avoid
problems at the server end due to such slots.

--
With Regards,
Amit Kapila.

#29Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Amit Kapila (#28)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Wed, Mar 27, 2024 at 10:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Mar 26, 2024 at 9:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2024-Mar-26, Nathan Bossart wrote:

FWIW I'd really prefer to have something like max_slot_xid_age for this. A
time-based parameter would likely help with most cases, but transaction ID
usage will vary widely from server to server, so it'd be nice to have
something to protect against wraparound more directly.

Yeah, I tend to agree that an XID-based limit makes more sense than a
time-based one.

So, do we want the time-based parameter or just max_slot_xid_age,
considering both will be GUCs? Yes, the xid_age-based parameter sounds
directly helpful for transaction ID wraparound and dead row cleanup.
OTOH, having a lot of inactive slots (which is possible in use cases
where a tool creates a lot of slots and forgets to remove them, or the
tool exits without cleaning up slots, say due to a server shutdown)
also prevents removing dead space, which is not nice either.

I've personally seen the leftover-slots problem on production systems,
where a timeout-based invalidation mechanism could have helped save
time and reduce manual intervention. Usually, most if not all
migration/upgrade/other tools create slots, and if any errors occur or
the operation gets cancelled midway, there's a chance that the slots
are left over because such tools forget to clean them up, whether due
to a bug or for whatever other reason. These are unintended/ghost
slots for the database user, unnecessarily holding up resources such
as XIDs, dead rows, and WAL, which might lead to XID wraparound or a
server crash if unnoticed. Although an XID-based GUC helps a lot, why
do these unintended and unnoticed slots need to hold up the resources
even before the XID age of, say, 1.5 or 2 billion is reached?

With both GUCs max_slot_xid_age and replication_slot_inactive_timeout
in place, I can set max_slot_xid_age = 1.5 billion or so and also set
replication_slot_inactive_timeout = 2 days or so to make the database
foolproof.
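
As a sketch, and assuming the proposed names from this thread (neither
GUC exists yet), that configuration would look like:

    ALTER SYSTEM SET max_slot_xid_age = 1500000000;
    ALTER SYSTEM SET replication_slot_inactive_timeout = '2d';
    SELECT pg_reload_conf();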

The one example that comes to mind is pg_createsubscriber (committed
for PG17), which creates one slot per database to convert a standby to
a subscriber. Now, say it exits due to a power shutdown; then there
could be a lot of dangling slots on the primary server. Also, say
there is some bug in such a tool that doesn't allow proper cleanup of
slots; the same problem can happen. Yes, this would be a problem of
the tool, but I think there is no harm in giving a way to avoid
problems at the server end due to such slots.

Right. I can personally relate to this problem of leftover slots,
where manual intervention was needed to drop all such slots, which is
sometimes time-consuming and painful.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#30Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Bharath Rupireddy (#29)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

Hi,

On Wed, Mar 27, 2024 at 12:28:06PM +0530, Bharath Rupireddy wrote:

On Wed, Mar 27, 2024 at 10:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Mar 26, 2024 at 9:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2024-Mar-26, Nathan Bossart wrote:

FWIW I'd really prefer to have something like max_slot_xid_age for this. A
time-based parameter would likely help with most cases, but transaction ID
usage will vary widely from server to server, so it'd be nice to have
something to protect against wraparound more directly.

Yeah, I tend to agree that an XID-based limit makes more sense than a
time-based one.

So, do we want the time-based parameter or just max_slot_xid_age,
considering both will be GUCs? Yes, the xid_age-based parameter sounds
directly helpful for transaction ID wraparound and dead row cleanup.
OTOH, having a lot of inactive slots (which is possible in use cases
where a tool creates a lot of slots and forgets to remove them, or the
tool exits without cleaning up slots, say due to a server shutdown)
also prevents removing dead space, which is not nice either.

I've personally seen the leftover-slots problem on production systems,
where a timeout-based invalidation mechanism could have helped save
time and reduce manual intervention. Usually, most if not all
migration/upgrade/other tools create slots, and if any errors occur or
the operation gets cancelled midway, there's a chance that the slots
are left over because such tools forget to clean them up, whether due
to a bug or for whatever other reason. These are unintended/ghost
slots for the database user, unnecessarily holding up resources such
as XIDs, dead rows, and WAL, which might lead to XID wraparound or a
server crash if unnoticed. Although an XID-based GUC helps a lot, why
do these unintended and unnoticed slots need to hold up the resources
even before the XID age of, say, 1.5 or 2 billion is reached?

With both GUCs max_slot_xid_age and replication_slot_inactive_timeout
in place, I can set max_slot_xid_age = 1.5 billion or so and also set
replication_slot_inactive_timeout = 2 days or so to make the database
foolproof.

Yeah, I think that both make sense. The reason is that one depends on the
database activity and slot activity (the xid age one), while the other (the
timeout one) depends only on the slot activity.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#31Robert Haas
robertmhaas@gmail.com
In reply to: Bertrand Drouvot (#30)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Wed, Mar 27, 2024 at 3:13 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Yeah, I think that both makes senses. The reason is that one depends of the
database activity and slot activity (the xid age one) while the other (the
timeout one) depends only of the slot activity.

FWIW, I thought the time-based one sounded more useful. I think it
would be poor planning to say "well, if the slot reaches an XID age of
a billion, kill it so we don't wrap around," because while that likely
will prevent me from getting into wraparound trouble, my database is
likely to become horribly bloated long before the cutoff is reached. I
thought it would be easier to reason in terms of time: I don't expect
a slave to ever be down for more than X period of time, say an hour or
whatever, so if it is, forget about it. Or alternatively, I know that
if a slave does go down for more than X period of time, I start to get
bloat, so cut it off at that point and I'll rebuild it later. I feel
like these are things where people's intuition is going to be much
stronger when reckoning in units of wall-clock time, which everyone
deals with every day in one way or another, rather than in XID-based
units that are, at least in my view, just a lot less intuitive.

For a previous example of where an XID threshold turned out not to be
great, see vacuum_defer_cleanup_age, and in particular the commit
message from where it was removed in
1118cd37eb61e6a2428f457a8b2026a7bb3f801a. The case here might not turn
out to be quite comparable for one reason or another, but I do think
that case is a cautionary tale.

I'm sure the world won't end or anything if we end up with both
thresholds, and I may be missing some reason why the XID threshold
would be really great here. I just can't quite see why I'd ever
recommend it to anyone.

--
Robert Haas
EDB: http://www.enterprisedb.com

#32Nathan Bossart
nathandbossart@gmail.com
In reply to: Robert Haas (#31)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Wed, Mar 27, 2024 at 10:33:28AM -0400, Robert Haas wrote:

FWIW, I thought the time-based one sounded more useful. I think it
would be poor planning to say "well, if the slot reaches an XID age of
a billion, kill it so we don't wrap around," because while that likely
will prevent me from getting into wraparound trouble, my database is
likely to become horribly bloated long before the cutoff is reached. I
thought it would be easier to reason in terms of time: I don't expect
a slave to ever be down for more than X period of time, say an hour or
whatever, so if it is, forget about it. Or alternatively, I know that
if a slave does go down for more than X period of time, I start to get
bloat, so cut it off at that point and I'll rebuild it later. I feel
like these are things where people's intuition is going to be much
stronger when reckoning in units of wall-clock time, which everyone
deals with every day in one way or another, rather than in XID-based
units that are, at least in my view, just a lot less intuitive.

I don't disagree with this point in the context of a user who is managing a
single server or just a handful of servers. They are going to understand
their workload best and can reason about the right value for the timeout.
I think they'd still benefit from having an XID-based setting as a backstop
in case the timeout is still not sufficient to prevent wraparound, but it's
not nearly as important in that case.

IMHO the use-case where this doesn't work so well is when you have many,
many servers to administer (e.g., a cloud provider). In those cases,
picking a default timeout to try to prevent wraparound is going to be much
less accurate, as any reasonable value you pick is still going to be
insufficient in some cases. I think the XID-based parameter would be
better here; if the server is at imminent risk of an outage due to
wraparound, invalidating the slots is probably a reasonable course of
action.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#33Robert Haas
robertmhaas@gmail.com
In reply to: Nathan Bossart (#32)
Re: pgsql: Track last_inactive_time in pg_replication_slots.

On Wed, Mar 27, 2024 at 11:06 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:

IMHO the use-case where this doesn't work so well is when you have many,
many servers to administer (e.g., a cloud provider). In those cases,
picking a default timeout to try to prevent wraparound is going to be much
less accurate, as any reasonable value you pick is still going to be
insufficient in some cases. I think the XID-based parameter would be
better here; if the server is at imminent risk of an outage due to
wraparound, invalidating the slots is probably a reasonable course of
action.

Well, I'm certainly willing to believe that your experience with
administering servers in the cloud is superior to mine. I don't really
understand why it makes a difference, though. Whether you have one
server or many, I agree that it is reasonable to invalidate slots when
XID wraparound looms. But also, whether you have one server or many,
by the time wraparound looms, you will typically have crippling bloat
as well. If preventing that bloat isn't important or there are other
defenses against it, then I see the value of the XID-based cutoff as a
backstop. And I will admit that in an on-prem installation, I've
occasionally seen situations where lots of XIDs got burned without
really causing a lot of bloat -- say, because there are heavily
updated staging tables which are periodically truncated, and very
little modification to long-lived data.

I'm not really strongly against having an XID-based threshold if smart
people such as yourself want it. I just think for a lot of users it's
going to be fiddly to get right.

--
Robert Haas
EDB: http://www.enterprisedb.com