Track in pg_replication_slots the reason why slots conflict?

Started by Michael Paquier · about 2 years ago · 33 messages
#1Michael Paquier
michael@paquier.xyz

Hi all,
(Bertrand and Andres in CC.)

While listening at Bertrand's talk about logical decoding on standbys
last week at Prague, I got surprised by the fact that we do not
reflect in the catalogs the reason why a conflict happened for a slot.
There are three of them depending on ReplicationSlotInvalidationCause:
- WAL removed.
- Invalid horizon.
- Insufficient WAL level.

This idea has been hinted around here on the original thread that led
to be87200efd93:
/messages/by-id/d7547f2c-a0c3-6aad-b631-b7ed5efaf298@gmail.com
However v44 has picked up the idea of a boolean:
/messages/by-id/bdc49e0b-cd39-bcd3-e391-b0ad6e48b5cf@gmail.com

ReplicationSlotCtl holds this information, so couldn't it be useful
for monitoring purposes to know why a slot got invalidated and add a
column to pg_get_replication_slots()? This could just be an extra
text conflicting_reason, defaulting to NULL when there's nothing to
see.
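
For instance, something like this (a sketch only; the column name and the
cause strings here are illustrative, nothing is settled):

```sql
-- Hypothetical query against the proposed column: NULL means the slot
-- has nothing to report; otherwise the text names the invalidation cause.
SELECT slot_name, conflicting_reason
FROM pg_replication_slots;
-- conflicting_reason would be NULL for healthy slots, or something like
-- 'wal_removed', 'invalid_horizon' or 'wal_level_insufficient' otherwise.
```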

Thoughts?
--
Michael

#2Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#1)
Re: Track in pg_replication_slots the reason why slots conflict?

On Thu, Dec 21, 2023 at 5:51 AM Michael Paquier <michael@paquier.xyz> wrote:

While listening at Bertrand's talk about logical decoding on standbys
last week at Prague, I got surprised by the fact that we do not
reflect in the catalogs the reason why a conflict happened for a slot.
There are three of them depending on ReplicationSlotInvalidationCause:
- WAL removed.
- Invalid horizon.
- Insufficient WAL level.

The invalidation cause is also required by one of the features being
discussed "Synchronize slots from primary to standby" [1] and there is
already a thread to discuss the same [2]. As that thread started
yesterday only, you may not have noticed it. Currently, the proposal
is to expose it via a function, but we can extend it to also display
it via the view; feel free to share your opinion on that thread.

[1]: /messages/by-id/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
[2]: /messages/by-id/CAJpy0uBpr0ym12+0mXpjcRFA6N=anX+Yk9aGU4EJhHNu=fWykQ@mail.gmail.com

--
With Regards,
Amit Kapila.

#3Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#2)
Re: Track in pg_replication_slots the reason why slots conflict?

On Thu, Dec 21, 2023 at 08:20:16AM +0530, Amit Kapila wrote:

The invalidation cause is also required by one of the features being
discussed "Synchronize slots from primary to standby" [1] and there is
already a thread to discuss the same [2]. As that thread started
yesterday only, you may not have noticed it. Currently, the proposal
is to expose it via a function, but we can extend it to also display
it via the view; feel free to share your opinion on that thread.

[1] - /messages/by-id/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
[2] - /messages/by-id/CAJpy0uBpr0ym12+0mXpjcRFA6N=anX+Yk9aGU4EJhHNu=fWykQ@mail.gmail.com

Ah thanks, missed this one. This cannot use a separate function,
actually, and there is a good reason for that that has not been
mentioned. I'll jump there.
--
Michael

#4Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#1)
Re: Track in pg_replication_slots the reason why slots conflict?

Hi,

On 2023-12-21 09:21:04 +0900, Michael Paquier wrote:

While listening at Bertrand's talk about logical decoding on standbys
last week at Prague, I got surprised by the fact that we do not
reflect in the catalogs the reason why a conflict happened for a slot.
There are three of them depending on ReplicationSlotInvalidationCause:
- WAL removed.
- Invalid horizon.
- Insufficient WAL level.

It should be extremely rare to hit any of these other than "WAL removed", so
I'm not sure it's worth adding interface complexity to show them.

ReplicationSlotCtl holds this information, so couldn't it be useful
for monitoring purposes to know why a slot got invalidated and add a
column to pg_get_replication_slots()? This could just be an extra
text conflicting_reason, defaulting to NULL when there's nothing to
see.

Extra columns aren't free from a usability perspective. IFF we do something, I
think it should be a single column with a cause.

Greetings,

Andres Freund

#5shveta malik
shveta.malik@gmail.com
In reply to: Andres Freund (#4)
Re: Track in pg_replication_slots the reason why slots conflict?

On Thu, Dec 21, 2023 at 3:10 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2023-12-21 09:21:04 +0900, Michael Paquier wrote:

While listening at Bertrand's talk about logical decoding on standbys
last week at Prague, I got surprised by the fact that we do not
reflect in the catalogs the reason why a conflict happened for a slot.
There are three of them depending on ReplicationSlotInvalidationCause:
- WAL removed.
- Invalid horizon.
- Insufficient WAL level.

It should be extremely rare to hit any of these other than "WAL removed", so
I'm not sure it's worth adding interface complexity to show them.

ReplicationSlotCtl holds this information, so couldn't it be useful
for monitoring purposes to know why a slot got invalidated and add a
column to pg_get_replication_slots()? This could just be an extra
text conflicting_reason, defaulting to NULL when there's nothing to
see.

Extra columns aren't free from a usability perspective. IFF we do something, I
think it should be a single column with a cause.

Thanks for the feedback. But do you mean that we replace existing
'conflicting' column with 'cause' in both the function and view
(pg_get_replication_slots() and pg_replication_slots)? Or do you mean
that we expose 'cause' from pg_get_replication_slots() and use that to
display 'conflicting' in pg_replication_slots view?

And if we plan to return/display cause from either function or view,
then shall it be enum 'ReplicationSlotInvalidationCause' or
description/text corresponding to enum?

In the other feature being discussed "Synchronize slots from primary
to standby" [1], there is a requirement to replicate invalidation
cause of slot from the primary to standby and thus it is needed in
enum form there. And thus there was a suggestion earlier to have the
function return enum-value and let the view display it as
text/description to the user. So kindly let us know your thoughts.

[1]: /messages/by-id/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com

thanks
Shveta

#6Andres Freund
andres@anarazel.de
In reply to: shveta malik (#5)
Re: Track in pg_replication_slots the reason why slots conflict?

Hi,

On 2023-12-21 16:08:48 +0530, shveta malik wrote:

On Thu, Dec 21, 2023 at 3:10 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2023-12-21 09:21:04 +0900, Michael Paquier wrote:

While listening at Bertrand's talk about logical decoding on standbys
last week at Prague, I got surprised by the fact that we do not
reflect in the catalogs the reason why a conflict happened for a slot.
There are three of them depending on ReplicationSlotInvalidationCause:
- WAL removed.
- Invalid horizon.
- Insufficient WAL level.

It should be extremely rare to hit any of these other than "WAL removed", so
I'm not sure it's worth adding interface complexity to show them.

ReplicationSlotCtl holds this information, so couldn't it be useful
for monitoring purposes to know why a slot got invalidated and add a
column to pg_get_replication_slots()? This could just be an extra
text conflicting_reason, defaulting to NULL when there's nothing to
see.

Extra columns aren't free from a usability perspective. IFF we do something, I
think it should be a single column with a cause.

Thanks for the feedback. But do you mean that we replace existing
'conflicting' column with 'cause' in both the function and view
(pg_get_replication_slots() and pg_replication_slots)? Or do you mean
that we expose 'cause' from pg_get_replication_slots() and use that to
display 'conflicting' in pg_replication_slots view?

I'm not entirely sure I understand the difference - just whether we add one
new column or replace the existing 'conflicting' column? I can see arguments
for either. A conflicting column where NULL indicates no conflict, and other
values indicate the reason for the conflict, doesn't seem too bad.

And if we plan to return/display cause from either function or view,
then shall it be enum 'ReplicationSlotInvalidationCause' or
description/text corresponding to enum?

We clearly can't just expose the numerical value for a C enum. So it has to be
converted to something SQL representable.

In the other feature being discussed "Synchronize slots from primary
to standby" [1] , there is a requirement to replicate invalidation
cause of slot from the primary to standby and thus it is needed in
enum form there. And thus there was a suggestion earlier to have the
function return enum-value and let the view display it as
text/description to the user. So kindly let us know your thoughts.

[1] - /messages/by-id/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com

Can you point me to a more specific message for that requirement? It seems
pretty odd to me. Your link goes to the top of a 400 message thread, I don't
have time to find one specific design point in that...

Greetings,

Andres

#7shveta malik
shveta.malik@gmail.com
In reply to: Andres Freund (#6)
Re: Track in pg_replication_slots the reason why slots conflict?

On Thu, Dec 21, 2023 at 5:04 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2023-12-21 16:08:48 +0530, shveta malik wrote:

On Thu, Dec 21, 2023 at 3:10 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2023-12-21 09:21:04 +0900, Michael Paquier wrote:

While listening at Bertrand's talk about logical decoding on standbys
last week at Prague, I got surprised by the fact that we do not
reflect in the catalogs the reason why a conflict happened for a slot.
There are three of them depending on ReplicationSlotInvalidationCause:
- WAL removed.
- Invalid horizon.
- Insufficient WAL level.

It should be extremely rare to hit any of these other than "WAL removed", so
I'm not sure it's worth adding interface complexity to show them.

ReplicationSlotCtl holds this information, so couldn't it be useful
for monitoring purposes to know why a slot got invalidated and add a
column to pg_get_replication_slots()? This could just be an extra
text conflicting_reason, defaulting to NULL when there's nothing to
see.

Extra columns aren't free from a usability perspective. IFF we do something, I
think it should be a single column with a cause.

Thanks for the feedback. But do you mean that we replace existing
'conflicting' column with 'cause' in both the function and view
(pg_get_replication_slots() and pg_replication_slots)? Or do you mean
that we expose 'cause' from pg_get_replication_slots() and use that to
display 'conflicting' in pg_replication_slots view?

I'm not entirely sure I understand the difference - just whether we add one
new column or replace the existing 'conflicting' column? I can see arguments
for either. A conflicting column where NULL indicates no conflict, and other
values indicate the reason for the conflict, doesn't seem too bad.

And if we plan to return/display cause from either function or view,
then shall it be enum 'ReplicationSlotInvalidationCause' or
description/text corresponding to enum?

We clearly can't just expose the numerical value for a C enum. So it has to be
converted to something SQL representable.

In the other feature being discussed "Synchronize slots from primary
to standby" [1] , there is a requirement to replicate invalidation
cause of slot from the primary to standby and thus it is needed in
enum form there. And thus there was a suggestion earlier to have the
function return enum-value and let the view display it as
text/description to the user. So kindly let us know your thoughts.

[1] - /messages/by-id/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com

Can you point me to a more specific message for that requirement? It seems
pretty odd to me. Your link goes to the top of a 400 message thread, I don't
have time to find one specific design point in that...

It is currently implemented there as a new function
'pg_get_slot_invalidation_cause()' without changing existing view
pg_replication_slots. (See 2.1 in [1] where it was introduced.)

Then it was suggested in [2] to fork a new thread as it makes sense to
have it independent of this slot-synchronization feature.

The new thread forked is [3]. In that thread, the issues in having a
new function pg_get_slot_invalidation_cause() are discussed and also
we came to know about this very thread that started the next day.

[1]: /messages/by-id/CAJpy0uAuzbzvcjpnzFTiWuDBctnH-SDZC6AZabPX65x9GWBrjQ@mail.gmail.com
[2]: /messages/by-id/CAA4eK1K0KCDNtpDyUKucMRdyK-5KdrCRWakCpHEdHT9muAiEOw@mail.gmail.com
[3]: /messages/by-id/CAJpy0uBpr0ym12+0mXpjcRFA6N=anX+Yk9aGU4EJhHNu=fWykQ@mail.gmail.com

thanks
Shveta

#8Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#6)
Re: Track in pg_replication_slots the reason why slots conflict?

On Thu, Dec 21, 2023 at 5:05 PM Andres Freund <andres@anarazel.de> wrote:

On 2023-12-21 16:08:48 +0530, shveta malik wrote:

On Thu, Dec 21, 2023 at 3:10 PM Andres Freund <andres@anarazel.de> wrote:

Extra columns aren't free from a usability perspective. IFF we do something, I
think it should be a single column with a cause.

Thanks for the feedback. But do you mean that we replace existing
'conflicting' column with 'cause' in both the function and view
(pg_get_replication_slots() and pg_replication_slots)? Or do you mean
that we expose 'cause' from pg_get_replication_slots() and use that to
display 'conflicting' in pg_replication_slots view?

I'm not entirely sure I understand the difference - just whether we add one
new column or replace the existing 'conflicting' column? I can see arguments
for either.

Agreed. I think the argument against replacing the existing
'conflicting' column is that there is a chance that it is being used
by some monitoring script which I guess shouldn't be a big deal to
change. So, if we don't see that as a problem, I would prefer to have
a single column with conflict reason where one of its values indicates
there is no conflict.

A conflicting column where NULL indicates no conflict, and other

values indicate the reason for the conflict, doesn't seem too bad.

This is fine too.

And if we plan to return/display cause from either function or view,
then shall it be enum 'ReplicationSlotInvalidationCause' or
description/text corresponding to enum?

We clearly can't just expose the numerical value for a C enum. So it has to be
converted to something SQL representable.

We can return int2 value from the function pg_get_replication_slots()
and then use that to display a string in the view
pg_replication_slots.

--
With Regards,
Amit Kapila.

#9Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Amit Kapila (#8)
Re: Track in pg_replication_slots the reason why slots conflict?

Hi,

On Thu, Dec 21, 2023 at 07:55:51PM +0530, Amit Kapila wrote:

On Thu, Dec 21, 2023 at 5:05 PM Andres Freund <andres@anarazel.de> wrote:

I'm not entirely sure I understand the difference - just whether we add one
new column or replace the existing 'conflicting' column? I can see arguments
for either.

Agreed. I think the argument against replacing the existing
'conflicting' column is that there is a chance that it is being used
by some monitoring script which I guess shouldn't be a big deal to
change. So, if we don't see that as a problem, I would prefer to have
a single column with conflict reason where one of its values indicates
there is no conflict.

+1

A conflicting column where NULL indicates no conflict, and other

values indicate the reason for the conflict, doesn't seem too bad.

This is fine too.

+1

And if we plan to return/display cause from either function or view,
then shall it be enum 'ReplicationSlotInvalidationCause' or
description/text corresponding to enum?

We clearly can't just expose the numerical value for a C enum. So it has to be
converted to something SQL representable.

We can return int2 value from the function pg_get_replication_slots()
and then use that to display a string in the view
pg_replication_slots.

Yeah, and in the sync slot related work we could use pg_get_replication_slots()
then to get the enum.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#10Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#8)
Re: Track in pg_replication_slots the reason why slots conflict?

On 2023-12-21 19:55:51 +0530, Amit Kapila wrote:

On Thu, Dec 21, 2023 at 5:05 PM Andres Freund <andres@anarazel.de> wrote:

We clearly can't just expose the numerical value for a C enum. So it has to be
converted to something SQL representable.

We can return int2 value from the function pg_get_replication_slots()
and then use that to display a string in the view
pg_replication_slots.

I strongly dislike that pattern. It just leads to complicated views - and
doesn't provide a single benefit that I am aware of. It's much better to
simply populate the text version in pg_get_replication_slots().
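
To illustrate the two approaches (a sketch; the int2 numbering and column
names here are hypothetical):

```sql
-- The disliked pattern: pg_get_replication_slots() returns a bare number
-- and the view has to translate it into something readable.
CREATE VIEW pg_replication_slots AS
SELECT slot_name,
       CASE conflict_cause
           WHEN 1 THEN 'wal_removed'
           WHEN 2 THEN 'rows_removed'
           WHEN 3 THEN 'wal_level_insufficient'
       END AS conflicting_cause
FROM pg_get_replication_slots();

-- The simpler alternative: the function itself emits the text, and the
-- view just passes the column through unchanged.
CREATE VIEW pg_replication_slots AS
SELECT slot_name, conflicting_cause
FROM pg_get_replication_slots();
```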

#11Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#10)
Re: Track in pg_replication_slots the reason why slots conflict?

On Thu, Dec 21, 2023 at 07:26:56AM -0800, Andres Freund wrote:

On 2023-12-21 19:55:51 +0530, Amit Kapila wrote:

We can return int2 value from the function pg_get_replication_slots()
and then use that to display a string in the view
pg_replication_slots.

I strongly dislike that pattern. It just leads to complicated views - and
doesn't provide a single benefit that I am aware of. It's much better to
simply populate the text version in pg_get_replication_slots().

I agree that this is a better integration in the view, and that's what
I would do FWIW.

Amit, how much of a problem would it be to do a text->enum mapping
when synchronizing the slots from a primary to a standby? Sure you
could have a system function that does some of the mapping work, but I
am not sure what's the best integration when it comes to the other
patch.
--
Michael

#12Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#11)
Re: Track in pg_replication_slots the reason why slots conflict?

On Fri, Dec 22, 2023 at 5:00 AM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Dec 21, 2023 at 07:26:56AM -0800, Andres Freund wrote:

On 2023-12-21 19:55:51 +0530, Amit Kapila wrote:

We can return int2 value from the function pg_get_replication_slots()
and then use that to display a string in the view
pg_replication_slots.

I strongly dislike that pattern. It just leads to complicated views - and
doesn't provide a single benefit that I am aware of. It's much bettter to
simply populate the text version in pg_get_replication_slots().

I agree that this is a better integration in the view, and that's what
I would do FWIW.

Amit, how much of a problem would it be to do a text->enum mapping
when synchronizing the slots from a primary to a standby?

There is no problem as such in that. We were trying to see if there is
a more convenient way but let's move by having cause as text from both
the function and view as that seems to be a preferred way.

--
With Regards,
Amit Kapila.

#13Amit Kapila
amit.kapila16@gmail.com
In reply to: Bertrand Drouvot (#9)
Re: Track in pg_replication_slots the reason why slots conflict?

On Thu, Dec 21, 2023 at 8:21 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

On Thu, Dec 21, 2023 at 07:55:51PM +0530, Amit Kapila wrote:

On Thu, Dec 21, 2023 at 5:05 PM Andres Freund <andres@anarazel.de> wrote:

I'm not entirely sure I understand the difference - just whether we add one
new column or replace the existing 'conflicting' column? I can see arguments
for either.

Agreed. I think the argument against replacing the existing
'conflicting' column is that there is a chance that it is being used
by some monitoring script which I guess shouldn't be a big deal to
change. So, if we don't see that as a problem, I would prefer to have
a single column with conflict reason where one of its values indicates
there is no conflict.

+1

Does anyone else have a preference on whether to change the existing
column or add a new one?

--
With Regards,
Amit Kapila.

#14Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#13)
Re: Track in pg_replication_slots the reason why slots conflict?

On Tue, Dec 26, 2023 at 08:44:44AM +0530, Amit Kapila wrote:

Does anyone else have a preference on whether to change the existing
column or add a new one?

Just to be clear here, I'd vote for replacing the existing boolean
with a text.
--
Michael

#15Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Michael Paquier (#14)
Re: Track in pg_replication_slots the reason why slots conflict?

Hi,

On Tue, Dec 26, 2023 at 05:23:56PM +0900, Michael Paquier wrote:

On Tue, Dec 26, 2023 at 08:44:44AM +0530, Amit Kapila wrote:

Does anyone else have a preference on whether to change the existing
column or add a new one?

Just to be clear here, I'd vote for replacing the existing boolean
with a text.

Same here, I'd vote to avoid 2 columns having the same "meaning".

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#16Isaac Morland
isaac.morland@gmail.com
In reply to: Amit Kapila (#8)
Re: Track in pg_replication_slots the reason why slots conflict?

On Thu, 21 Dec 2023 at 09:26, Amit Kapila <amit.kapila16@gmail.com> wrote:

A conflicting column where NULL indicates no conflict, and other

values indicate the reason for the conflict, doesn't seem too bad.

This is fine too.

I prefer this option. There is precedent for doing it this way, for example
in pg_stat_activity.wait_event_type.

The most common test of this field is likely to be "is there a conflict"
and it's better to write this as "[fieldname] IS NOT NULL" than to
introduce a magic constant. Also, it makes clear to future maintainers that
this field has one purpose: saying what type of conflict there is, if any.
If we find ourselves wanting to record a new non-conflict status (no idea
what that could be: "almost conflict"? "probably conflict soon"?) there
would be less temptation to break existing tests for "is there a conflict".
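
Concretely, the common check would then read (a sketch; column name per the
proposal under discussion, not a committed interface):

```sql
-- "Is there a conflict?" with no magic constant: NULL means no conflict,
-- any non-NULL value is the reason for it.
SELECT slot_name, conflicting_cause
FROM pg_replication_slots
WHERE conflicting_cause IS NOT NULL;
```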

#17shveta malik
shveta.malik@gmail.com
In reply to: Isaac Morland (#16)
1 attachment(s)
Re: Track in pg_replication_slots the reason why slots conflict?

On Tue, Dec 26, 2023 at 7:35 PM Isaac Morland <isaac.morland@gmail.com> wrote:

On Thu, 21 Dec 2023 at 09:26, Amit Kapila <amit.kapila16@gmail.com> wrote:

A conflicting column where NULL indicates no conflict, and other

values indicate the reason for the conflict, doesn't seem too bad.

This is fine too.

I prefer this option. There is precedent for doing it this way, for example in pg_stat_activity.wait_event_type.

The most common test of this field is likely to be "is there a conflict" and it's better to write this as "[fieldname] IS NOT NULL" than to introduce a magic constant. Also, it makes clear to future maintainers that this field has one purpose: saying what type of conflict there is, if any. If we find ourselves wanting to record a new non-conflict status (no idea what that could be: "almost conflict"? "probably conflict soon"?) there would be less temptation to break existing tests for "is there a conflict".

+1 on using "[fieldname] IS NOT NULL" to find "is there a conflict"

PFA the patch which attempts to implement this.

This patch changes the existing 'conflicting' field to
'conflicting_cause' in pg_replication_slots. This new field is always
NULL for physical slots (like the previous field conflicting), as well
as for those logical slots which are not invalidated.

thanks
Shveta

Attachments:

v1-0001-Track-conflicting_cause-in-pg_replication_slots.patch (application/octet-stream)
From 43d8e1a7fa9e8ea0870fc77a1e4b4301187be987 Mon Sep 17 00:00:00 2001
From: Shveta Malik <shveta.malik@gmail.com>
Date: Wed, 27 Dec 2023 10:22:39 +0530
Subject: [PATCH v1] Track conflicting_cause in pg_replication_slots

This patch changes existing 'conflicting' field to 'conflicting_cause'
in pg_replication_slots. This new field indicates cause if this logical
slot conflicted with recovery (and so is now invalidated).
It is always NULL for physical slots, as well as for those logical slots
which are not invalidated. Possible values are:

    wal_removed = required WAL has been removed.
    rows_removed = required rows have been removed.
    wal_level_insufficient = wal_level insufficient on the primary server.
---
 doc/src/sgml/system-views.sgml                | 28 +++++++++--
 src/backend/catalog/system_views.sql          |  2 +-
 src/backend/replication/slotfuncs.c           | 22 ++++++--
 src/bin/pg_upgrade/info.c                     |  4 +-
 src/include/catalog/pg_proc.dat               |  4 +-
 .../t/035_standby_logical_decoding.pl         | 50 ++++++++++++++++---
 src/test/regress/expected/rules.out           |  4 +-
 7 files changed, 92 insertions(+), 22 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 0ef1745631..f02c23ba52 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,11 +2525,29 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>conflicting</structfield> <type>bool</type>
-      </para>
-      <para>
-       True if this logical slot conflicted with recovery (and so is now
-       invalidated). Always NULL for physical slots.
+       <structfield>conflicting_cause</structfield> <type>text</type>
+      </para>
+      <para>
+       Cause if this logical slot conflicted with recovery (and so is now
+       invalidated). It is always NULL for physical slots, as well as for
+       those logical slots which are not invalidated. Possible values are:
+        <itemizedlist spacing="compact">
+         <listitem>
+          <para>
+           <literal>wal_removed</literal> = required WAL has been removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>rows_removed</literal> = required rows have been removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>wal_level_insufficient</literal> = wal_level insufficient on the primary server.
+          </para>
+         </listitem>
+        </itemizedlist>
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 058fc47c91..ba9f497e78 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
             L.wal_status,
             L.safe_wal_size,
             L.two_phase,
-            L.conflicting
+            L.conflicting_cause
     FROM pg_get_replication_slots() AS L
             LEFT JOIN pg_database D ON (L.datoid = D.oid);
 
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 4b694a03d0..8e3d27adc8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -406,10 +406,24 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 			nulls[i++] = true;
 		else
 		{
-			if (slot_contents.data.invalidated != RS_INVAL_NONE)
-				values[i++] = BoolGetDatum(true);
-			else
-				values[i++] = BoolGetDatum(false);
+			switch (slot_contents.data.invalidated)
+			{
+				case RS_INVAL_NONE:
+					nulls[i++] = true;
+					break;
+
+				case RS_INVAL_WAL_REMOVED:
+					values[i++] = CStringGetTextDatum("wal_removed");
+					break;
+
+				case RS_INVAL_HORIZON:
+					values[i++] = CStringGetTextDatum("rows_removed");
+					break;
+
+				case RS_INVAL_WAL_LEVEL:
+					values[i++] = CStringGetTextDatum("wal_level_insufficient");
+					break;
+			}
 		}
 
 		Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..edff3107d3 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -662,13 +662,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
 	 * removed.
 	 */
 	res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, "
-							"%s as caught_up, conflicting as invalid "
+							"%s as caught_up, conflicting_cause IS NOT NULL as invalid "
 							"FROM pg_catalog.pg_replication_slots "
 							"WHERE slot_type = 'logical' AND "
 							"database = current_database() AND "
 							"temporary IS FALSE;",
 							live_check ? "FALSE" :
-							"(CASE WHEN conflicting THEN FALSE "
+							"(CASE WHEN conflicting_cause IS NOT NULL THEN FALSE "
 							"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
 							"END)");
 
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9052f5262a..de9a94a2bb 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11115,9 +11115,9 @@
   proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
   proretset => 't', provolatile => 's', prorettype => 'record',
   proargtypes => '',
-  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool}',
+  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text}',
   proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
-  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting}',
+  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting_cause}',
   prosrc => 'pg_get_replication_slots' },
 { oid => '3786', descr => 'set up a logical replication slot',
   proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 5d7c278d01..05ff94d121 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -177,7 +177,8 @@ sub check_slots_conflicting_status
 	{
 		$res = $node_standby->safe_psql(
 			'postgres', qq(
-				 select bool_and(conflicting) from pg_replication_slots;));
+				 select bool_and(conflicting) from
+				 (select conflicting_cause is not NULL as conflicting from pg_replication_slots);));
 
 		is($res, 't', "Logical slots are reported as conflicting");
 	}
@@ -185,12 +186,34 @@ sub check_slots_conflicting_status
 	{
 		$res = $node_standby->safe_psql(
 			'postgres', qq(
-				select bool_or(conflicting) from pg_replication_slots;));
+				select bool_or(conflicting) from
+				(select conflicting_cause is not NULL as conflicting from pg_replication_slots);));
 
 		is($res, 'f', "Logical slots are reported as non conflicting");
 	}
 }
 
+# Check conflicting_cause in pg_replication_slots.
+sub check_slots_conflicting_cause
+{
+	my ($slot_prefix, $cause) = @_;
+
+	my $active_slot = $slot_prefix . 'activeslot';
+	my $inactive_slot = $slot_prefix . 'inactiveslot';
+
+	$res = $node_standby->safe_psql(
+		'postgres', qq(
+			 select conflicting_cause from pg_replication_slots where slot_name = '$active_slot';));
+
+	is($res, "$cause", "$active_slot conflicting cause is $cause");
+
+	$res = $node_standby->safe_psql(
+		'postgres', qq(
+			 select conflicting_cause from pg_replication_slots where slot_name = '$inactive_slot';));
+
+	is($res, "$cause", "$inactive_slot conflicting cause is $cause");
+}
+
 # Drop the slots, re-create them, change hot_standby_feedback,
 # check xmin and catalog_xmin values, make slot active and reset stat.
 sub reactive_slots_change_hfs_and_wait_for_xmins
@@ -260,13 +283,13 @@ $node_primary->safe_psql('testdb',
 	qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
 );
 
-# Check conflicting is NULL for physical slot
+# Check conflicting_cause is NULL for physical slot
 $res = $node_primary->safe_psql(
 	'postgres', qq[
-		 SELECT conflicting is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+		 SELECT conflicting_cause is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
 );
 
-is($res, 't', "Physical slot reports conflicting as NULL");
+is($res, 't', "Physical slot reports conflicting_cause as NULL");
 
 my $backup_name = 'b1';
 $node_primary->backup($backup_name);
@@ -486,6 +509,9 @@ check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
 # Verify slots are reported as conflicting in pg_replication_slots
 check_slots_conflicting_status(1);
 
+# Verify conflicting_cause is 'rows_removed' in pg_replication_slots
+check_slots_conflicting_cause('vacuum_full_', 'rows_removed');
+
 $handle =
   make_slot_active($node_standby, 'vacuum_full_', 0, \$stdout, \$stderr);
 
@@ -511,7 +537,7 @@ check_slots_conflicting_status(1);
 
 # Get the restart_lsn from an invalidated slot
 my $restart_lsn = $node_standby->safe_psql('postgres',
-	"SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflicting is true;"
+	"SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflicting_cause is not null;"
 );
 
 chomp($restart_lsn);
@@ -568,6 +594,9 @@ check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
 # Verify slots are reported as conflicting in pg_replication_slots
 check_slots_conflicting_status(1);
 
+# Verify conflicting_cause is 'rows_removed' in pg_replication_slots
+check_slots_conflicting_cause('row_removal_', 'rows_removed');
+
 $handle =
   make_slot_active($node_standby, 'row_removal_', 0, \$stdout, \$stderr);
 
@@ -607,6 +636,9 @@ check_for_invalidation('shared_row_removal_', $logstart,
 # Verify slots are reported as conflicting in pg_replication_slots
 check_slots_conflicting_status(1);
 
+# Verify conflicting_cause is 'rows_removed' in pg_replication_slots
+check_slots_conflicting_cause('shared_row_removal_', 'rows_removed');
+
 $handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
 	\$stderr);
 
@@ -696,6 +728,9 @@ check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
 # Verify slots are reported as conflicting in pg_replication_slots
 check_slots_conflicting_status(1);
 
+# Verify conflicting_cause is 'rows_removed' in pg_replication_slots
+check_slots_conflicting_cause('pruning_', 'rows_removed');
+
 $handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
 
 # We are not able to read from the slot as it has been invalidated
@@ -740,6 +775,9 @@ check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
 # Verify slots are reported as conflicting in pg_replication_slots
 check_slots_conflicting_status(1);
 
+# Verify conflicting_cause is 'wal_level_insufficient' in pg_replication_slots
+check_slots_conflicting_cause('wal_level_', 'wal_level_insufficient');
+
 $handle =
   make_slot_active($node_standby, 'wal_level_', 0, \$stdout, \$stderr);
 # We are not able to read from the slot as it requires wal_level >= logical on the primary server
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f645e8486b..95701f1175 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,8 +1473,8 @@ pg_replication_slots| SELECT l.slot_name,
     l.wal_status,
     l.safe_wal_size,
     l.two_phase,
-    l.conflicting
-   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting)
+    l.conflicting_cause
+   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting_cause)
      LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
 pg_roles| SELECT pg_authid.rolname,
     pg_authid.rolsuper,
-- 
2.34.1

#18Amit Kapila
amit.kapila16@gmail.com
In reply to: shveta malik (#17)
Re: Track in pg_replication_slots the reason why slots conflict?

On Wed, Dec 27, 2023 at 3:08 PM shveta malik <shveta.malik@gmail.com> wrote:

PFA the patch which attempts to implement this.

This patch changes the existing 'conflicting' field to
'conflicting_cause' in pg_replication_slots.

This name sounds a bit odd to me; would it be better to name it
conflict_cause?

A few other minor comments:
=========================
*
+       <structfield>conflicting_cause</structfield> <type>text</type>
+      </para>
+      <para>
+       Cause if this logical slot conflicted with recovery (and so is now
+       invalidated). It is always NULL for physical slots, as well as for
+       those logical slots which are not invalidated. Possible values are:

Would it be better to use a description as follows: "Cause of logical
slot's conflict with recovery. It is always NULL for physical slots,
as well as for logical slots which are not invalidated. The non-NULL
values indicate that the slot is marked as invalidated. Possible
values are:
.."

*
  $res = $node_standby->safe_psql(
  'postgres', qq(
- select bool_and(conflicting) from pg_replication_slots;));
+ select bool_and(conflicting) from
+ (select conflicting_cause is not NULL as conflicting from
pg_replication_slots);));

Won't the query "select conflicting_cause is not NULL as conflicting
from pg_replication_slots" return false even for physical slots,
and thereby impact the result of the main query, whereas the original
query would simply ignore physical slots? If this
observation is correct, then you might want to add a 'slot_type'
condition to the new query.
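For reference, a sketch of what that could look like (not part of the
patch; the column name assumes the v1 naming):

```sql
-- Hypothetical adjustment: count only logical slots, so physical slots
-- (whose cause is always NULL) cannot drag bool_and() down to false.
select bool_and(conflicting) from
  (select conflicting_cause is not NULL as conflicting
     from pg_replication_slots
    where slot_type = 'logical') s;
```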

* After introducing check_slots_conflicting_cause(), do we need to
have check_slots_conflicting_status()? Aren't both checking the same
thing?

--
With Regards,
Amit Kapila.

#19shveta malik
shveta.malik@gmail.com
In reply to: Amit Kapila (#18)
Re: Track in pg_replication_slots the reason why slots conflict?

On Wed, Dec 27, 2023 at 4:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Dec 27, 2023 at 3:08 PM shveta malik <shveta.malik@gmail.com> wrote:

PFA the patch which attempts to implement this.

This patch changes the existing 'conflicting' field to
'conflicting_cause' in pg_replication_slots.

This name sounds a bit odd to me, would it be better to name it as
conflict_cause?

A few other minor comments:
=========================
*
+       <structfield>conflicting_cause</structfield> <type>text</type>
+      </para>
+      <para>
+       Cause if this logical slot conflicted with recovery (and so is now
+       invalidated). It is always NULL for physical slots, as well as for
+       those logical slots which are not invalidated. Possible values are:

Would it be better to use a description as follows: "Cause of logical
slot's conflict with recovery. It is always NULL for physical slots,
as well as for logical slots which are not invalidated. The non-NULL
values indicate that the slot is marked as invalidated. Possible
values are:
.."

*
$res = $node_standby->safe_psql(
'postgres', qq(
- select bool_and(conflicting) from pg_replication_slots;));
+ select bool_and(conflicting) from
+ (select conflicting_cause is not NULL as conflicting from
pg_replication_slots);));

Won't the query "select conflicting_cause is not NULL as conflicting
from pg_replication_slots" return false even for physical slots,
and thereby impact the result of the main query, whereas the original
query would simply ignore physical slots? If this
observation is correct, then you might want to add a 'slot_type'
condition to the new query.

* After introducing check_slots_conflicting_cause(), do we need to
have check_slots_conflicting_status()? Aren't both checking the same
thing?

I think it is needed for the case where we want to check that there is
no conflict.

# Verify slots are reported as non conflicting in pg_replication_slots
check_slots_conflicting_status(0);

For the cases where there is conflict, I think
check_slots_conflicting_cause() can replace
check_slots_conflicting_status().

thanks
Shveta

#20shveta malik
shveta.malik@gmail.com
In reply to: shveta malik (#19)
1 attachment(s)
Re: Track in pg_replication_slots the reason why slots conflict?

On Thu, Dec 28, 2023 at 10:16 AM shveta malik <shveta.malik@gmail.com> wrote:

On Wed, Dec 27, 2023 at 4:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Dec 27, 2023 at 3:08 PM shveta malik <shveta.malik@gmail.com> wrote:

PFA the patch which attempts to implement this.

This patch changes the existing 'conflicting' field to
'conflicting_cause' in pg_replication_slots.

This name sounds a bit odd to me, would it be better to name it as
conflict_cause?

A few other minor comments:
=========================

Thanks for the feedback Amit.

*
+       <structfield>conflicting_cause</structfield> <type>text</type>
+      </para>
+      <para>
+       Cause if this logical slot conflicted with recovery (and so is now
+       invalidated). It is always NULL for physical slots, as well as for
+       those logical slots which are not invalidated. Possible values are:

Would it be better to use a description as follows: "Cause of logical
slot's conflict with recovery. It is always NULL for physical slots,
as well as for logical slots which are not invalidated. The non-NULL
values indicate that the slot is marked as invalidated. Possible
values are:
.."

*
$res = $node_standby->safe_psql(
'postgres', qq(
- select bool_and(conflicting) from pg_replication_slots;));
+ select bool_and(conflicting) from
+ (select conflicting_cause is not NULL as conflicting from
pg_replication_slots);));

Won't the query "select conflicting_cause is not NULL as conflicting
from pg_replication_slots" return false even for physical slots,
and thereby impact the result of the main query, whereas the original
query would simply ignore physical slots? If this
observation is correct, then you might want to add a 'slot_type'
condition to the new query.

* After introducing check_slots_conflicting_cause(), do we need to
have check_slots_conflicting_status()? Aren't both checking the same
thing?

I think it is needed for the case where we want to check that there is
no conflict.

# Verify slots are reported as non conflicting in pg_replication_slots
check_slots_conflicting_status(0);

For the cases where there is conflict, I think
check_slots_conflicting_cause() can replace
check_slots_conflicting_status().

I have removed check_slots_conflicting_status(); where it was needed to
check for non-conflicting slots, I have added a simple query instead.

PFA the v2-patch with all your comments addressed.

thanks
Shveta

Attachments:

v2-0001-Track-conflicting_cause-in-pg_replication_slots.patch (application/octet-stream)
From be75ad4924834d4018c222125eeeae4a9d6d9a26 Mon Sep 17 00:00:00 2001
From: Shveta Malik <shveta.malik@gmail.com>
Date: Wed, 27 Dec 2023 10:22:39 +0530
Subject: [PATCH v2] Track conflicting_cause in pg_replication_slots

This patch changes the existing 'conflicting' field to 'conflict_cause'
in pg_replication_slots. This new field indicates the cause of logical
slot's conflict with recovery. It is always NULL for physical slots,
as well as for logical slots which are not invalidated. The non-NULL
values indicate that the slot is marked as invalidated. Possible values
are:
    wal_removed = required WAL has been removed.
    rows_removed = required rows have been removed.
    wal_level_insufficient = wal_level insufficient on the primary server.
---
 doc/src/sgml/system-views.sgml                | 29 ++++++--
 src/backend/catalog/system_views.sql          |  2 +-
 src/backend/replication/slotfuncs.c           | 22 ++++--
 src/bin/pg_upgrade/info.c                     |  4 +-
 src/include/catalog/pg_proc.dat               |  4 +-
 .../t/035_standby_logical_decoding.pl         | 72 ++++++++++---------
 src/test/regress/expected/rules.out           |  4 +-
 7 files changed, 87 insertions(+), 50 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 0ef1745631..ca0cc0d739 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,11 +2525,30 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>conflicting</structfield> <type>bool</type>
-      </para>
-      <para>
-       True if this logical slot conflicted with recovery (and so is now
-       invalidated). Always NULL for physical slots.
+       <structfield>conflict_cause</structfield> <type>text</type>
+      </para>
+      <para>
+       Cause of logical slot's conflict with recovery. It is always NULL
+       for physical slots, as well as for logical slots which are not
+       invalidated. The non-NULL values indicate that the slot is marked
+       as invalidated. Possible values are:
+        <itemizedlist spacing="compact">
+         <listitem>
+          <para>
+           <literal>wal_removed</literal> = required WAL has been removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>rows_removed</literal> = required rows have been removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>wal_level_insufficient</literal> = wal_level insufficient on the primary server.
+          </para>
+         </listitem>
+        </itemizedlist>
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 058fc47c91..9eb10efe09 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
             L.wal_status,
             L.safe_wal_size,
             L.two_phase,
-            L.conflicting
+            L.conflict_cause
     FROM pg_get_replication_slots() AS L
             LEFT JOIN pg_database D ON (L.datoid = D.oid);
 
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 4b694a03d0..8e3d27adc8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -406,10 +406,24 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 			nulls[i++] = true;
 		else
 		{
-			if (slot_contents.data.invalidated != RS_INVAL_NONE)
-				values[i++] = BoolGetDatum(true);
-			else
-				values[i++] = BoolGetDatum(false);
+			switch (slot_contents.data.invalidated)
+			{
+				case RS_INVAL_NONE:
+					nulls[i++] = true;
+					break;
+
+				case RS_INVAL_WAL_REMOVED:
+					values[i++] = CStringGetTextDatum("wal_removed");
+					break;
+
+				case RS_INVAL_HORIZON:
+					values[i++] = CStringGetTextDatum("rows_removed");
+					break;
+
+				case RS_INVAL_WAL_LEVEL:
+					values[i++] = CStringGetTextDatum("wal_level_insufficient");
+					break;
+			}
 		}
 
 		Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..0e0828c890 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -662,13 +662,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
 	 * removed.
 	 */
 	res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, "
-							"%s as caught_up, conflicting as invalid "
+							"%s as caught_up, conflict_cause IS NOT NULL as invalid "
 							"FROM pg_catalog.pg_replication_slots "
 							"WHERE slot_type = 'logical' AND "
 							"database = current_database() AND "
 							"temporary IS FALSE;",
 							live_check ? "FALSE" :
-							"(CASE WHEN conflicting THEN FALSE "
+							"(CASE WHEN conflict_cause IS NOT NULL THEN FALSE "
 							"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
 							"END)");
 
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9052f5262a..e88ab3e0ff 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11115,9 +11115,9 @@
   proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
   proretset => 't', provolatile => 's', prorettype => 'record',
   proargtypes => '',
-  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool}',
+  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text}',
   proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
-  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting}',
+  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_cause}',
   prosrc => 'pg_get_replication_slots' },
 { oid => '3786', descr => 'set up a logical replication slot',
   proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 5d7c278d01..3bedac32ac 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,27 +168,25 @@ sub change_hot_standby_feedback_and_wait_for_xmins
 	}
 }
 
-# Check conflicting status in pg_replication_slots.
-sub check_slots_conflicting_status
+# Check conflict_cause in pg_replication_slots.
+sub check_slots_conflict_cause
 {
-	my ($conflicting) = @_;
+	my ($slot_prefix, $cause) = @_;
 
-	if ($conflicting)
-	{
-		$res = $node_standby->safe_psql(
-			'postgres', qq(
-				 select bool_and(conflicting) from pg_replication_slots;));
+	my $active_slot = $slot_prefix . 'activeslot';
+	my $inactive_slot = $slot_prefix . 'inactiveslot';
 
-		is($res, 't', "Logical slots are reported as conflicting");
-	}
-	else
-	{
-		$res = $node_standby->safe_psql(
-			'postgres', qq(
-				select bool_or(conflicting) from pg_replication_slots;));
+	$res = $node_standby->safe_psql(
+		'postgres', qq(
+			 select conflict_cause from pg_replication_slots where slot_name = '$active_slot';));
 
-		is($res, 'f', "Logical slots are reported as non conflicting");
-	}
+	is($res, "$cause", "$active_slot conflicting cause is $cause");
+
+	$res = $node_standby->safe_psql(
+		'postgres', qq(
+			 select conflict_cause from pg_replication_slots where slot_name = '$inactive_slot';));
+
+	is($res, "$cause", "$inactive_slot conflicting cause is $cause");
 }
 
 # Drop the slots, re-create them, change hot_standby_feedback,
@@ -260,13 +258,13 @@ $node_primary->safe_psql('testdb',
 	qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
 );
 
-# Check conflicting is NULL for physical slot
+# Check conflict_cause is NULL for physical slot
 $res = $node_primary->safe_psql(
 	'postgres', qq[
-		 SELECT conflicting is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+		 SELECT conflict_cause is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
 );
 
-is($res, 't', "Physical slot reports conflicting as NULL");
+is($res, 't', "Physical slot reports conflict_cause as NULL");
 
 my $backup_name = 'b1';
 $node_primary->backup($backup_name);
@@ -483,8 +481,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_cause is 'rows_removed' in pg_replication_slots
+check_slots_conflict_cause('vacuum_full_', 'rows_removed');
 
 $handle =
   make_slot_active($node_standby, 'vacuum_full_', 0, \$stdout, \$stderr);
@@ -502,8 +500,8 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
 ##################################################
 $node_standby->restart;
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_cause is retained across a restart.
+check_slots_conflict_cause('vacuum_full_', 'rows_removed');
 
 ##################################################
 # Verify that invalidated logical slots do not lead to retaining WAL.
@@ -511,7 +509,7 @@ check_slots_conflicting_status(1);
 
 # Get the restart_lsn from an invalidated slot
 my $restart_lsn = $node_standby->safe_psql('postgres',
-	"SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflicting is true;"
+	"SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_cause is not null;"
 );
 
 chomp($restart_lsn);
@@ -565,8 +563,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_cause is 'rows_removed' in pg_replication_slots
+check_slots_conflict_cause('row_removal_', 'rows_removed');
 
 $handle =
   make_slot_active($node_standby, 'row_removal_', 0, \$stdout, \$stderr);
@@ -604,8 +602,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 check_for_invalidation('shared_row_removal_', $logstart,
 	'with vacuum on pg_authid');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_cause is 'rows_removed' in pg_replication_slots
+check_slots_conflict_cause('shared_row_removal_', 'rows_removed');
 
 $handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
 	\$stderr);
@@ -657,7 +655,13 @@ ok( $node_standby->poll_query_until(
 ) or die "Timed out waiting confl_active_logicalslot to be updated";
 
 # Verify slots are reported as non conflicting in pg_replication_slots
-check_slots_conflicting_status(0);
+is( $node_standby->safe_psql(
+		'postgres',
+		q[select bool_or(conflicting) from
+		  (select conflict_cause is not NULL as conflicting
+		   from pg_replication_slots WHERE slot_type = 'logical')]),
+	'f',
+	'Logical slots are reported as non conflicting');
 
 # Turn hot_standby_feedback back on
 change_hot_standby_feedback_and_wait_for_xmins(1, 0);
@@ -693,8 +697,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_cause is 'rows_removed' in pg_replication_slots
+check_slots_conflict_cause('pruning_', 'rows_removed');
 
 $handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
 
@@ -737,8 +741,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_cause is 'wal_level_insufficient' in pg_replication_slots
+check_slots_conflict_cause('wal_level_', 'wal_level_insufficient');
 
 $handle =
   make_slot_active($node_standby, 'wal_level_', 0, \$stdout, \$stderr);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f645e8486b..692cdbd781 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,8 +1473,8 @@ pg_replication_slots| SELECT l.slot_name,
     l.wal_status,
     l.safe_wal_size,
     l.two_phase,
-    l.conflicting
-   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting)
+    l.conflict_cause
+   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_cause)
      LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
 pg_roles| SELECT pg_authid.rolname,
     pg_authid.rolsuper,
-- 
2.34.1

#21Amit Kapila
amit.kapila16@gmail.com
In reply to: shveta malik (#20)
Re: Track in pg_replication_slots the reason why slots conflict?

On Thu, Dec 28, 2023 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote:

PFA the v2-patch with all your comments addressed.

Does anyone have a preference for a column name? The options on the
table are conflict_cause, conflicting_cause, conflict_reason. Any
others? I was checking docs for similar usage and found
"pg_event_trigger_table_rewrite_reason" function, so based on that we
can even go with conflict_reason.

--
With Regards,
Amit Kapila.

#22Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#21)
Re: Track in pg_replication_slots the reason why slots conflict?

On Fri, Dec 29, 2023 at 09:20:52AM +0530, Amit Kapila wrote:

Does anyone have a preference for a column name? The options on the
table are conflict_cause, conflicting_cause, conflict_reason. Any
others? I was checking docs for similar usage and found
"pg_event_trigger_table_rewrite_reason" function, so based on that we
can even go with conflict_reason.

"conflict_reason" sounds like the natural choice here.
--
Michael

#23shveta malik
shveta.malik@gmail.com
In reply to: Michael Paquier (#22)
Re: Track in pg_replication_slots the reason why slots conflict?

On Fri, Dec 29, 2023 at 3:35 PM Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Dec 29, 2023 at 09:20:52AM +0530, Amit Kapila wrote:

Does anyone have a preference for a column name? The options on the
table are conflict_cause, conflicting_cause, conflict_reason. Any
others? I was checking docs for similar usage and found
"pg_event_trigger_table_rewrite_reason" function, so based on that we
can even go with conflict_reason.

"conflict_reason" sounds like the natural choice here.

Do we have more comments on the patch apart from column_name?

thanks
Shveta

#24shveta malik
shveta.malik@gmail.com
In reply to: shveta malik (#23)
1 attachment(s)
Re: Track in pg_replication_slots the reason why slots conflict?

On Mon, Jan 1, 2024 at 9:14 AM shveta malik <shveta.malik@gmail.com> wrote:

On Fri, Dec 29, 2023 at 3:35 PM Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Dec 29, 2023 at 09:20:52AM +0530, Amit Kapila wrote:

Does anyone have a preference for a column name? The options on the
table are conflict_cause, conflicting_cause, conflict_reason. Any
others? I was checking docs for similar usage and found
"pg_event_trigger_table_rewrite_reason" function, so based on that we
can even go with conflict_reason.

"conflict_reason" sounds like the natural choice here.

Do we have more comments on the patch apart from column_name?

thanks
Shveta

PFA v3 after changing column name to 'conflict_reason'

thanks
Shveta

Attachments:

v3-0001-Track-conflicting_reason-in-pg_replication_slots.patch (application/octet-stream)
From caa4277afa0a4f463372061ecaec08ec614048ad Mon Sep 17 00:00:00 2001
From: Shveta Malik <shveta.malik@gmail.com>
Date: Wed, 27 Dec 2023 10:22:39 +0530
Subject: [PATCH v3] Track conflicting_reason in pg_replication_slots

This patch changes the existing 'conflicting' field to 'conflict_reason'
in pg_replication_slots. This new field indicates the reason of logical
slot's conflict with recovery. It is always NULL for physical slots,
as well as for logical slots which are not invalidated. The non-NULL
values indicate that the slot is marked as invalidated. Possible values
are:
    wal_removed = required WAL has been removed.
    rows_removed = required rows have been removed.
    wal_level_insufficient = wal_level insufficient on the primary server.
---
 doc/src/sgml/system-views.sgml                | 29 ++++++--
 src/backend/catalog/system_views.sql          |  2 +-
 src/backend/replication/slotfuncs.c           | 22 ++++--
 src/bin/pg_upgrade/info.c                     |  4 +-
 src/include/catalog/pg_proc.dat               |  4 +-
 .../t/035_standby_logical_decoding.pl         | 72 ++++++++++---------
 src/test/regress/expected/rules.out           |  4 +-
 7 files changed, 87 insertions(+), 50 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 0ef1745631..6191e2b2ec 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,11 +2525,30 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>conflicting</structfield> <type>bool</type>
-      </para>
-      <para>
-       True if this logical slot conflicted with recovery (and so is now
-       invalidated). Always NULL for physical slots.
+       <structfield>conflict_reason</structfield> <type>text</type>
+      </para>
+      <para>
+       The reason of logical slot's conflict with recovery. It is always NULL
+       for physical slots, as well as for logical slots which are not
+       invalidated. The non-NULL values indicate that the slot is marked
+       as invalidated. Possible values are:
+        <itemizedlist spacing="compact">
+         <listitem>
+          <para>
+           <literal>wal_removed</literal> = required WAL has been removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>rows_removed</literal> = required rows have been removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>wal_level_insufficient</literal> = wal_level insufficient on the primary server.
+          </para>
+         </listitem>
+        </itemizedlist>
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 058fc47c91..dd9ed0958b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
             L.wal_status,
             L.safe_wal_size,
             L.two_phase,
-            L.conflicting
+            L.conflict_reason
     FROM pg_get_replication_slots() AS L
             LEFT JOIN pg_database D ON (L.datoid = D.oid);
 
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 4b694a03d0..8e3d27adc8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -406,10 +406,24 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 			nulls[i++] = true;
 		else
 		{
-			if (slot_contents.data.invalidated != RS_INVAL_NONE)
-				values[i++] = BoolGetDatum(true);
-			else
-				values[i++] = BoolGetDatum(false);
+			switch (slot_contents.data.invalidated)
+			{
+				case RS_INVAL_NONE:
+					nulls[i++] = true;
+					break;
+
+				case RS_INVAL_WAL_REMOVED:
+					values[i++] = CStringGetTextDatum("wal_removed");
+					break;
+
+				case RS_INVAL_HORIZON:
+					values[i++] = CStringGetTextDatum("rows_removed");
+					break;
+
+				case RS_INVAL_WAL_LEVEL:
+					values[i++] = CStringGetTextDatum("wal_level_insufficient");
+					break;
+			}
 		}
 
 		Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..537af558be 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -662,13 +662,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
 	 * removed.
 	 */
 	res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, "
-							"%s as caught_up, conflicting as invalid "
+							"%s as caught_up, conflict_reason IS NOT NULL as invalid "
 							"FROM pg_catalog.pg_replication_slots "
 							"WHERE slot_type = 'logical' AND "
 							"database = current_database() AND "
 							"temporary IS FALSE;",
 							live_check ? "FALSE" :
-							"(CASE WHEN conflicting THEN FALSE "
+							"(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
 							"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
 							"END)");
 
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9052f5262a..8fe8234a1b 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11115,9 +11115,9 @@
   proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
   proretset => 't', provolatile => 's', prorettype => 'record',
   proargtypes => '',
-  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool}',
+  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text}',
   proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
-  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting}',
+  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason}',
   prosrc => 'pg_get_replication_slots' },
 { oid => '3786', descr => 'set up a logical replication slot',
   proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 2e7893ec4e..e52c26a960 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,27 +168,25 @@ sub change_hot_standby_feedback_and_wait_for_xmins
 	}
 }
 
-# Check conflicting status in pg_replication_slots.
-sub check_slots_conflicting_status
+# Check conflict_reason in pg_replication_slots.
+sub check_slots_conflict_reason
 {
-	my ($conflicting) = @_;
+	my ($slot_prefix, $reason) = @_;
 
-	if ($conflicting)
-	{
-		$res = $node_standby->safe_psql(
-			'postgres', qq(
-				 select bool_and(conflicting) from pg_replication_slots;));
+	my $active_slot = $slot_prefix . 'activeslot';
+	my $inactive_slot = $slot_prefix . 'inactiveslot';
 
-		is($res, 't', "Logical slots are reported as conflicting");
-	}
-	else
-	{
-		$res = $node_standby->safe_psql(
-			'postgres', qq(
-				select bool_or(conflicting) from pg_replication_slots;));
+	$res = $node_standby->safe_psql(
+		'postgres', qq(
+			 select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
 
-		is($res, 'f', "Logical slots are reported as non conflicting");
-	}
+	is($res, "$reason", "$active_slot conflict_reason is $reason");
+
+	$res = $node_standby->safe_psql(
+		'postgres', qq(
+			 select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+
+	is($res, "$reason", "$inactive_slot conflict_reason is $reason");
 }
 
 # Drop the slots, re-create them, change hot_standby_feedback,
@@ -260,13 +258,13 @@ $node_primary->safe_psql('testdb',
 	qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
 );
 
-# Check conflicting is NULL for physical slot
+# Check conflict_reason is NULL for physical slot
 $res = $node_primary->safe_psql(
 	'postgres', qq[
-		 SELECT conflicting is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+		 SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
 );
 
-is($res, 't', "Physical slot reports conflicting as NULL");
+is($res, 't', "Physical slot reports conflict_reason as NULL");
 
 my $backup_name = 'b1';
 $node_primary->backup($backup_name);
@@ -483,8 +481,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('vacuum_full_', 'rows_removed');
 
 $handle =
   make_slot_active($node_standby, 'vacuum_full_', 0, \$stdout, \$stderr);
@@ -502,8 +500,8 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
 ##################################################
 $node_standby->restart;
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is retained across a restart.
+check_slots_conflict_reason('vacuum_full_', 'rows_removed');
 
 ##################################################
 # Verify that invalidated logical slots do not lead to retaining WAL.
@@ -511,7 +509,7 @@ check_slots_conflicting_status(1);
 
 # Get the restart_lsn from an invalidated slot
 my $restart_lsn = $node_standby->safe_psql('postgres',
-	"SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflicting is true;"
+	"SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
 );
 
 chomp($restart_lsn);
@@ -565,8 +563,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('row_removal_', 'rows_removed');
 
 $handle =
   make_slot_active($node_standby, 'row_removal_', 0, \$stdout, \$stderr);
@@ -604,8 +602,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 check_for_invalidation('shared_row_removal_', $logstart,
 	'with vacuum on pg_authid');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
 
 $handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
 	\$stderr);
@@ -657,7 +655,13 @@ ok( $node_standby->poll_query_until(
 ) or die "Timed out waiting confl_active_logicalslot to be updated";
 
 # Verify slots are reported as non conflicting in pg_replication_slots
-check_slots_conflicting_status(0);
+is( $node_standby->safe_psql(
+		'postgres',
+		q[select bool_or(conflicting) from
+		  (select conflict_reason is not NULL as conflicting
+		   from pg_replication_slots WHERE slot_type = 'logical')]),
+	'f',
+	'Logical slots are reported as non conflicting');
 
 # Turn hot_standby_feedback back on
 change_hot_standby_feedback_and_wait_for_xmins(1, 0);
@@ -693,8 +697,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('pruning_', 'rows_removed');
 
 $handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
 
@@ -737,8 +741,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
 
 $handle =
   make_slot_active($node_standby, 'wal_level_', 0, \$stdout, \$stderr);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f645e8486b..d878a971df 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,8 +1473,8 @@ pg_replication_slots| SELECT l.slot_name,
     l.wal_status,
     l.safe_wal_size,
     l.two_phase,
-    l.conflicting
-   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting)
+    l.conflict_reason
+   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason)
      LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
 pg_roles| SELECT pg_authid.rolname,
     pg_authid.rolsuper,
-- 
2.34.1

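For reference, once this patch is applied the renamed column can be queried directly. A minimal monitoring query (illustrative only; it assumes the patched view with the `conflict_reason` column) could look like:

```sql
-- Illustrative only: list invalidated logical slots and why they conflict,
-- using the conflict_reason column introduced by this patch.  NULL means
-- the slot is not invalidated, so such slots are filtered out here.
SELECT slot_name, conflict_reason
FROM pg_replication_slots
WHERE slot_type = 'logical'
  AND conflict_reason IS NOT NULL;
```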
#25Amit Kapila
amit.kapila16@gmail.com
In reply to: shveta malik (#24)
Re: Track in pg_replication_slots the reason why slots conflict?

On Mon, Jan 1, 2024 at 12:32 PM shveta malik <shveta.malik@gmail.com> wrote:

PFA v3 after changing column name to 'conflict_reason'

Few minor comments:
===================
1.
+          <para>
+           <literal>wal_removed</literal> = required WAL has been removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>rows_removed</literal> = required rows have been removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>wal_level_insufficient</literal> = wal_level
insufficient on the primary server.
+          </para>

Should we use the same style to write the description as we are using
for the wal_status column? For example, <literal>wal_removed</literal>
means that the required WAL has been removed.

2.
+      <para>
+       The reason of logical slot's conflict with recovery.

My grammar tool says it should be: "The reason for the logical slot's
conflict with recovery."

Other than these minor comments, the patch looks good to me.

--
With Regards,
Amit Kapila.

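As an aside, the value mapping the patch implements in pg_get_replication_slots() can be sketched outside of C for clarity (a hypothetical Python sketch, not PostgreSQL code; the dictionary mirrors the switch over ReplicationSlotInvalidationCause, with None standing in for SQL NULL):

```python
# Sketch of the enum -> text mapping from the patch's switch statement in
# pg_get_replication_slots().  RS_INVAL_NONE maps to None (SQL NULL),
# meaning the slot is not invalidated; every other cause yields a
# human-readable reason string.
CONFLICT_REASON = {
    "RS_INVAL_NONE": None,
    "RS_INVAL_WAL_REMOVED": "wal_removed",
    "RS_INVAL_HORIZON": "rows_removed",
    "RS_INVAL_WAL_LEVEL": "wal_level_insufficient",
}

def conflict_reason(cause: str):
    """Return the conflict_reason text for an invalidation cause name."""
    return CONFLICT_REASON[cause]
```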
#26shveta malik
shveta.malik@gmail.com
In reply to: Amit Kapila (#25)
1 attachment(s)
Re: Track in pg_replication_slots the reason why slots conflict?

On Mon, Jan 1, 2024 at 4:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Jan 1, 2024 at 12:32 PM shveta malik <shveta.malik@gmail.com> wrote:

PFA v3 after changing column name to 'conflict_reason'

Few minor comments:
===================
1.
+          <para>
+           <literal>wal_removed</literal> = required WAL has been removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>rows_removed</literal> = required rows have been removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>wal_level_insufficient</literal> = wal_level
insufficient on the primary server.
+          </para>

Should we use the same style to write the description as we are using
for the wal_status column? For example, <literal>wal_removed</literal>
means that the required WAL has been removed.

2.
+      <para>
+       The reason of logical slot's conflict with recovery.

My grammar tool says it should be: "The reason for the logical slot's
conflict with recovery."

Other than these minor comments, the patch looks good to me.

PFA v4 which addresses the doc comments.

thanks
Shveta

Attachments:

v4-0001-Track-conflicting_reason-in-pg_replication_slots.patch (application/octet-stream)
From f65147e913b88c295a35485b35c2800c449056ee Mon Sep 17 00:00:00 2001
From: Shveta Malik <shveta.malik@gmail.com>
Date: Wed, 27 Dec 2023 10:22:39 +0530
Subject: [PATCH v4] Track conflicting_reason in pg_replication_slots

This patch changes the existing 'conflicting' field to 'conflict_reason'
in pg_replication_slots. This new field indicates the reason for the logical
slot's conflict with recovery. It is always NULL for physical slots,
as well as for logical slots which are not invalidated. The non-NULL
values indicate that the slot is marked as invalidated. Possible values
are:
    wal_removed = required WAL has been removed.
    rows_removed = required rows have been removed.
    wal_level_insufficient = wal_level insufficient on the primary server.
---
 doc/src/sgml/system-views.sgml                | 33 +++++++--
 src/backend/catalog/system_views.sql          |  2 +-
 src/backend/replication/slotfuncs.c           | 22 ++++--
 src/bin/pg_upgrade/info.c                     |  4 +-
 src/include/catalog/pg_proc.dat               |  4 +-
 .../t/035_standby_logical_decoding.pl         | 72 ++++++++++---------
 src/test/regress/expected/rules.out           |  4 +-
 7 files changed, 91 insertions(+), 50 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 0ef1745631..05cb785409 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,11 +2525,34 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>conflicting</structfield> <type>bool</type>
-      </para>
-      <para>
-       True if this logical slot conflicted with recovery (and so is now
-       invalidated). Always NULL for physical slots.
+       <structfield>conflict_reason</structfield> <type>text</type>
+      </para>
+      <para>
+       The reason for the logical slot's conflict with recovery. It is always
+       NULL for physical slots, as well as for logical slots which are not
+       invalidated. The non-NULL values indicate that the slot is marked
+       as invalidated. Possible values are:
+        <itemizedlist spacing="compact">
+         <listitem>
+          <para>
+           <literal>wal_removed</literal> means that the required WAL has been
+           removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>rows_removed</literal> means that the required rows have
+           been removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>wal_level_insufficient</literal> means that the
+           <xref linkend="guc-wal-level"/> is insufficient on the primary
+           server.
+          </para>
+         </listitem>
+        </itemizedlist>
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 058fc47c91..dd9ed0958b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
             L.wal_status,
             L.safe_wal_size,
             L.two_phase,
-            L.conflicting
+            L.conflict_reason
     FROM pg_get_replication_slots() AS L
             LEFT JOIN pg_database D ON (L.datoid = D.oid);
 
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 4b694a03d0..8e3d27adc8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -406,10 +406,24 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 			nulls[i++] = true;
 		else
 		{
-			if (slot_contents.data.invalidated != RS_INVAL_NONE)
-				values[i++] = BoolGetDatum(true);
-			else
-				values[i++] = BoolGetDatum(false);
+			switch (slot_contents.data.invalidated)
+			{
+				case RS_INVAL_NONE:
+					nulls[i++] = true;
+					break;
+
+				case RS_INVAL_WAL_REMOVED:
+					values[i++] = CStringGetTextDatum("wal_removed");
+					break;
+
+				case RS_INVAL_HORIZON:
+					values[i++] = CStringGetTextDatum("rows_removed");
+					break;
+
+				case RS_INVAL_WAL_LEVEL:
+					values[i++] = CStringGetTextDatum("wal_level_insufficient");
+					break;
+			}
 		}
 
 		Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..537af558be 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -662,13 +662,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
 	 * removed.
 	 */
 	res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, "
-							"%s as caught_up, conflicting as invalid "
+							"%s as caught_up, conflict_reason IS NOT NULL as invalid "
 							"FROM pg_catalog.pg_replication_slots "
 							"WHERE slot_type = 'logical' AND "
 							"database = current_database() AND "
 							"temporary IS FALSE;",
 							live_check ? "FALSE" :
-							"(CASE WHEN conflicting THEN FALSE "
+							"(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
 							"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
 							"END)");
 
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9052f5262a..8fe8234a1b 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11115,9 +11115,9 @@
   proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
   proretset => 't', provolatile => 's', prorettype => 'record',
   proargtypes => '',
-  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool}',
+  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text}',
   proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
-  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting}',
+  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason}',
   prosrc => 'pg_get_replication_slots' },
 { oid => '3786', descr => 'set up a logical replication slot',
   proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 2e7893ec4e..e52c26a960 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,27 +168,25 @@ sub change_hot_standby_feedback_and_wait_for_xmins
 	}
 }
 
-# Check conflicting status in pg_replication_slots.
-sub check_slots_conflicting_status
+# Check conflict_reason in pg_replication_slots.
+sub check_slots_conflict_reason
 {
-	my ($conflicting) = @_;
+	my ($slot_prefix, $reason) = @_;
 
-	if ($conflicting)
-	{
-		$res = $node_standby->safe_psql(
-			'postgres', qq(
-				 select bool_and(conflicting) from pg_replication_slots;));
+	my $active_slot = $slot_prefix . 'activeslot';
+	my $inactive_slot = $slot_prefix . 'inactiveslot';
 
-		is($res, 't', "Logical slots are reported as conflicting");
-	}
-	else
-	{
-		$res = $node_standby->safe_psql(
-			'postgres', qq(
-				select bool_or(conflicting) from pg_replication_slots;));
+	$res = $node_standby->safe_psql(
+		'postgres', qq(
+			 select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
 
-		is($res, 'f', "Logical slots are reported as non conflicting");
-	}
+	is($res, "$reason", "$active_slot conflict_reason is $reason");
+
+	$res = $node_standby->safe_psql(
+		'postgres', qq(
+			 select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+
+	is($res, "$reason", "$inactive_slot conflict_reason is $reason");
 }
 
 # Drop the slots, re-create them, change hot_standby_feedback,
@@ -260,13 +258,13 @@ $node_primary->safe_psql('testdb',
 	qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
 );
 
-# Check conflicting is NULL for physical slot
+# Check conflict_reason is NULL for physical slot
 $res = $node_primary->safe_psql(
 	'postgres', qq[
-		 SELECT conflicting is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+		 SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
 );
 
-is($res, 't', "Physical slot reports conflicting as NULL");
+is($res, 't', "Physical slot reports conflict_reason as NULL");
 
 my $backup_name = 'b1';
 $node_primary->backup($backup_name);
@@ -483,8 +481,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('vacuum_full_', 'rows_removed');
 
 $handle =
   make_slot_active($node_standby, 'vacuum_full_', 0, \$stdout, \$stderr);
@@ -502,8 +500,8 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
 ##################################################
 $node_standby->restart;
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is retained across a restart.
+check_slots_conflict_reason('vacuum_full_', 'rows_removed');
 
 ##################################################
 # Verify that invalidated logical slots do not lead to retaining WAL.
@@ -511,7 +509,7 @@ check_slots_conflicting_status(1);
 
 # Get the restart_lsn from an invalidated slot
 my $restart_lsn = $node_standby->safe_psql('postgres',
-	"SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflicting is true;"
+	"SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
 );
 
 chomp($restart_lsn);
@@ -565,8 +563,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('row_removal_', 'rows_removed');
 
 $handle =
   make_slot_active($node_standby, 'row_removal_', 0, \$stdout, \$stderr);
@@ -604,8 +602,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 check_for_invalidation('shared_row_removal_', $logstart,
 	'with vacuum on pg_authid');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
 
 $handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
 	\$stderr);
@@ -657,7 +655,13 @@ ok( $node_standby->poll_query_until(
 ) or die "Timed out waiting confl_active_logicalslot to be updated";
 
 # Verify slots are reported as non conflicting in pg_replication_slots
-check_slots_conflicting_status(0);
+is( $node_standby->safe_psql(
+		'postgres',
+		q[select bool_or(conflicting) from
+		  (select conflict_reason is not NULL as conflicting
+		   from pg_replication_slots WHERE slot_type = 'logical')]),
+	'f',
+	'Logical slots are reported as non conflicting');
 
 # Turn hot_standby_feedback back on
 change_hot_standby_feedback_and_wait_for_xmins(1, 0);
@@ -693,8 +697,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('pruning_', 'rows_removed');
 
 $handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
 
@@ -737,8 +741,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
 
 $handle =
   make_slot_active($node_standby, 'wal_level_', 0, \$stdout, \$stderr);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f645e8486b..d878a971df 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,8 +1473,8 @@ pg_replication_slots| SELECT l.slot_name,
     l.wal_status,
     l.safe_wal_size,
     l.two_phase,
-    l.conflicting
-   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting)
+    l.conflict_reason
+   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason)
      LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
 pg_roles| SELECT pg_authid.rolname,
     pg_authid.rolsuper,
-- 
2.34.1

#27shveta malik
shveta.malik@gmail.com
In reply to: shveta malik (#26)
1 attachment(s)
Re: Track in pg_replication_slots the reason why slots conflict?

On Mon, Jan 1, 2024 at 5:17 PM shveta malik <shveta.malik@gmail.com> wrote:

On Mon, Jan 1, 2024 at 4:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Jan 1, 2024 at 12:32 PM shveta malik <shveta.malik@gmail.com> wrote:

PFA v3 after changing column name to 'conflict_reason'

Few minor comments:
===================
1.
+          <para>
+           <literal>wal_removed</literal> = required WAL has been removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>rows_removed</literal> = required rows have been removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>wal_level_insufficient</literal> = wal_level
insufficient on the primary server.
+          </para>

Should we use the same style to write the description as we are using
for the wal_status column? For example, <literal>wal_removed</literal>
means that the required WAL has been removed.

2.
+      <para>
+       The reason of logical slot's conflict with recovery.

My grammar tool says it should be: "The reason for the logical slot's
conflict with recovery."

Other than these minor comments, the patch looks good to me.

PFA v4 which addresses the doc comments.

Please ignore the previous patch and PFA new v4 (v4_2). The only
change from the earlier v4 is the subject correction in commit msg.

thanks
Shveta

Attachments:

v4_2-0001-Track-conflict_reason-in-pg_replication_slots.patch (application/octet-stream)
From 3932c00bfc3158cc3419a2a0abc17b6de4a55118 Mon Sep 17 00:00:00 2001
From: Shveta Malik <shveta.malik@gmail.com>
Date: Wed, 27 Dec 2023 10:22:39 +0530
Subject: [PATCH v4] Track conflict_reason in pg_replication_slots

This patch changes the existing 'conflicting' field to 'conflict_reason'
in pg_replication_slots. This new field indicates the reason for the logical
slot's conflict with recovery. It is always NULL for physical slots,
as well as for logical slots which are not invalidated. The non-NULL
values indicate that the slot is marked as invalidated. Possible values
are:
    wal_removed = required WAL has been removed.
    rows_removed = required rows have been removed.
    wal_level_insufficient = wal_level insufficient on the primary server.
---
 doc/src/sgml/system-views.sgml                | 33 +++++++--
 src/backend/catalog/system_views.sql          |  2 +-
 src/backend/replication/slotfuncs.c           | 22 ++++--
 src/bin/pg_upgrade/info.c                     |  4 +-
 src/include/catalog/pg_proc.dat               |  4 +-
 .../t/035_standby_logical_decoding.pl         | 72 ++++++++++---------
 src/test/regress/expected/rules.out           |  4 +-
 7 files changed, 91 insertions(+), 50 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 0ef1745631..05cb785409 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,11 +2525,34 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>conflicting</structfield> <type>bool</type>
-      </para>
-      <para>
-       True if this logical slot conflicted with recovery (and so is now
-       invalidated). Always NULL for physical slots.
+       <structfield>conflict_reason</structfield> <type>text</type>
+      </para>
+      <para>
+       The reason for the logical slot's conflict with recovery. It is always
+       NULL for physical slots, as well as for logical slots which are not
+       invalidated. The non-NULL values indicate that the slot is marked
+       as invalidated. Possible values are:
+        <itemizedlist spacing="compact">
+         <listitem>
+          <para>
+           <literal>wal_removed</literal> means that the required WAL has been
+           removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>rows_removed</literal> means that the required rows have
+           been removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>wal_level_insufficient</literal> means that the
+           <xref linkend="guc-wal-level"/> is insufficient on the primary
+           server.
+          </para>
+         </listitem>
+        </itemizedlist>
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 058fc47c91..dd9ed0958b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
             L.wal_status,
             L.safe_wal_size,
             L.two_phase,
-            L.conflicting
+            L.conflict_reason
     FROM pg_get_replication_slots() AS L
             LEFT JOIN pg_database D ON (L.datoid = D.oid);
 
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 4b694a03d0..8e3d27adc8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -406,10 +406,24 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 			nulls[i++] = true;
 		else
 		{
-			if (slot_contents.data.invalidated != RS_INVAL_NONE)
-				values[i++] = BoolGetDatum(true);
-			else
-				values[i++] = BoolGetDatum(false);
+			switch (slot_contents.data.invalidated)
+			{
+				case RS_INVAL_NONE:
+					nulls[i++] = true;
+					break;
+
+				case RS_INVAL_WAL_REMOVED:
+					values[i++] = CStringGetTextDatum("wal_removed");
+					break;
+
+				case RS_INVAL_HORIZON:
+					values[i++] = CStringGetTextDatum("rows_removed");
+					break;
+
+				case RS_INVAL_WAL_LEVEL:
+					values[i++] = CStringGetTextDatum("wal_level_insufficient");
+					break;
+			}
 		}
 
 		Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 4878aa22bf..537af558be 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -662,13 +662,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
 	 * removed.
 	 */
 	res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, "
-							"%s as caught_up, conflicting as invalid "
+							"%s as caught_up, conflict_reason IS NOT NULL as invalid "
 							"FROM pg_catalog.pg_replication_slots "
 							"WHERE slot_type = 'logical' AND "
 							"database = current_database() AND "
 							"temporary IS FALSE;",
 							live_check ? "FALSE" :
-							"(CASE WHEN conflicting THEN FALSE "
+							"(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
 							"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
 							"END)");
 
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9052f5262a..8fe8234a1b 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11115,9 +11115,9 @@
   proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
   proretset => 't', provolatile => 's', prorettype => 'record',
   proargtypes => '',
-  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool}',
+  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text}',
   proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
-  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting}',
+  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason}',
   prosrc => 'pg_get_replication_slots' },
 { oid => '3786', descr => 'set up a logical replication slot',
   proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 2e7893ec4e..e52c26a960 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,27 +168,25 @@ sub change_hot_standby_feedback_and_wait_for_xmins
 	}
 }
 
-# Check conflicting status in pg_replication_slots.
-sub check_slots_conflicting_status
+# Check conflict_reason in pg_replication_slots.
+sub check_slots_conflict_reason
 {
-	my ($conflicting) = @_;
+	my ($slot_prefix, $reason) = @_;
 
-	if ($conflicting)
-	{
-		$res = $node_standby->safe_psql(
-			'postgres', qq(
-				 select bool_and(conflicting) from pg_replication_slots;));
+	my $active_slot = $slot_prefix . 'activeslot';
+	my $inactive_slot = $slot_prefix . 'inactiveslot';
 
-		is($res, 't', "Logical slots are reported as conflicting");
-	}
-	else
-	{
-		$res = $node_standby->safe_psql(
-			'postgres', qq(
-				select bool_or(conflicting) from pg_replication_slots;));
+	$res = $node_standby->safe_psql(
+		'postgres', qq(
+			 select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
 
-		is($res, 'f', "Logical slots are reported as non conflicting");
-	}
+	is($res, "$reason", "$active_slot conflict_reason is $reason");
+
+	$res = $node_standby->safe_psql(
+		'postgres', qq(
+			 select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+
+	is($res, "$reason", "$inactive_slot conflict_reason is $reason");
 }
 
 # Drop the slots, re-create them, change hot_standby_feedback,
@@ -260,13 +258,13 @@ $node_primary->safe_psql('testdb',
 	qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
 );
 
-# Check conflicting is NULL for physical slot
+# Check conflict_reason is NULL for physical slot
 $res = $node_primary->safe_psql(
 	'postgres', qq[
-		 SELECT conflicting is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+		 SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
 );
 
-is($res, 't', "Physical slot reports conflicting as NULL");
+is($res, 't', "Physical slot reports conflict_reason as NULL");
 
 my $backup_name = 'b1';
 $node_primary->backup($backup_name);
@@ -483,8 +481,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('vacuum_full_', 'rows_removed');
 
 $handle =
   make_slot_active($node_standby, 'vacuum_full_', 0, \$stdout, \$stderr);
@@ -502,8 +500,8 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
 ##################################################
 $node_standby->restart;
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is retained across a restart.
+check_slots_conflict_reason('vacuum_full_', 'rows_removed');
 
 ##################################################
 # Verify that invalidated logical slots do not lead to retaining WAL.
@@ -511,7 +509,7 @@ check_slots_conflicting_status(1);
 
 # Get the restart_lsn from an invalidated slot
 my $restart_lsn = $node_standby->safe_psql('postgres',
-	"SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflicting is true;"
+	"SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
 );
 
 chomp($restart_lsn);
@@ -565,8 +563,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('row_removal_', 'rows_removed');
 
 $handle =
   make_slot_active($node_standby, 'row_removal_', 0, \$stdout, \$stderr);
@@ -604,8 +602,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 check_for_invalidation('shared_row_removal_', $logstart,
 	'with vacuum on pg_authid');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
 
 $handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
 	\$stderr);
@@ -657,7 +655,13 @@ ok( $node_standby->poll_query_until(
 ) or die "Timed out waiting confl_active_logicalslot to be updated";
 
 # Verify slots are reported as non conflicting in pg_replication_slots
-check_slots_conflicting_status(0);
+is( $node_standby->safe_psql(
+		'postgres',
+		q[select bool_or(conflicting) from
+		  (select conflict_reason is not NULL as conflicting
+		   from pg_replication_slots WHERE slot_type = 'logical')]),
+	'f',
+	'Logical slots are reported as non conflicting');
 
 # Turn hot_standby_feedback back on
 change_hot_standby_feedback_and_wait_for_xmins(1, 0);
@@ -693,8 +697,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('pruning_', 'rows_removed');
 
 $handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
 
@@ -737,8 +741,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
 
 $handle =
   make_slot_active($node_standby, 'wal_level_', 0, \$stdout, \$stderr);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f645e8486b..d878a971df 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,8 +1473,8 @@ pg_replication_slots| SELECT l.slot_name,
     l.wal_status,
     l.safe_wal_size,
     l.two_phase,
-    l.conflicting
-   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting)
+    l.conflict_reason
+   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason)
      LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
 pg_roles| SELECT pg_authid.rolname,
     pg_authid.rolsuper,
-- 
2.34.1

#28Amit Kapila
amit.kapila16@gmail.com
In reply to: shveta malik (#27)
1 attachment(s)
Re: Track in pg_replication_slots the reason why slots conflict?

On Mon, Jan 1, 2024 at 5:24 PM shveta malik <shveta.malik@gmail.com> wrote:

Please ignore the previous patch and PFA new v4 (v4_2). The only
change from the earlier v4 is the subject correction in commit msg.

The patch looks good to me. I have slightly changed one of the
descriptions in the docs and also modified the commit message a bit. I
will push this after two days unless there are any more
comments/suggestions.

--
With Regards,
Amit Kapila.

Attachments:

v5-0001-Track-conflict_reason-in-pg_replication_slots.patch (application/octet-stream)
From 12d83c5ae3760cba1b50ead8011abcaaf671417c Mon Sep 17 00:00:00 2001
From: Shveta Malik <shveta.malik@gmail.com>
Date: Wed, 27 Dec 2023 10:22:39 +0530
Subject: [PATCH v5] Track conflict_reason in pg_replication_slots.

This patch changes the existing 'conflicting' field to 'conflict_reason'
in pg_replication_slots. This new field indicates the reason for the
logical slot's conflict with recovery. It is always NULL for physical
slots, as well as for logical slots which are not invalidated. The
non-NULL values indicate that the slot is marked as invalidated. Possible
values are:

wal_removed = required WAL has been removed.
rows_removed = required rows have been removed.
wal_level_insufficient = the primary doesn't have a wal_level sufficient
to perform logical decoding.

The existing users of 'conflicting' column can get the same answer by
using 'conflict_reason' IS NOT NULL.

Author: Shveta Malik
Reviewed-by: Amit Kapila, Bertrand Drouvot
Discussion: https://postgr.es/m/ZYOE8IguqTbp-seF@paquier.xyz
---
 doc/src/sgml/system-views.sgml                | 33 +++++++--
 src/backend/catalog/system_views.sql          |  2 +-
 src/backend/replication/slotfuncs.c           | 22 ++++--
 src/bin/pg_upgrade/info.c                     |  4 +-
 src/include/catalog/pg_proc.dat               |  4 +-
 .../t/035_standby_logical_decoding.pl         | 72 ++++++++++---------
 src/test/regress/expected/rules.out           |  4 +-
 7 files changed, 91 insertions(+), 50 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 0ef1745631..ce00293e5c 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,11 +2525,34 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>conflicting</structfield> <type>bool</type>
-      </para>
-      <para>
-       True if this logical slot conflicted with recovery (and so is now
-       invalidated). Always NULL for physical slots.
+       <structfield>conflict_reason</structfield> <type>text</type>
+      </para>
+      <para>
+       The reason for the logical slot's conflict with recovery. It is always
+       NULL for physical slots, as well as for logical slots which are not
+       invalidated. The non-NULL values indicate that the slot is marked
+       as invalidated. Possible values are:
+        <itemizedlist spacing="compact">
+         <listitem>
+          <para>
+           <literal>wal_removed</literal> means that the required WAL has been
+           removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>rows_removed</literal> means that the required rows have
+           been removed.
+          </para>
+         </listitem>
+         <listitem>
+          <para>
+           <literal>wal_level_insufficient</literal> means that the
+           primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
+           perform logical decoding.
+          </para>
+         </listitem>
+        </itemizedlist>
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 058fc47c91..dd9ed0958b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
             L.wal_status,
             L.safe_wal_size,
             L.two_phase,
-            L.conflicting
+            L.conflict_reason
     FROM pg_get_replication_slots() AS L
             LEFT JOIN pg_database D ON (L.datoid = D.oid);
 
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 4b694a03d0..8e3d27adc8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -406,10 +406,24 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 			nulls[i++] = true;
 		else
 		{
-			if (slot_contents.data.invalidated != RS_INVAL_NONE)
-				values[i++] = BoolGetDatum(true);
-			else
-				values[i++] = BoolGetDatum(false);
+			switch (slot_contents.data.invalidated)
+			{
+				case RS_INVAL_NONE:
+					nulls[i++] = true;
+					break;
+
+				case RS_INVAL_WAL_REMOVED:
+					values[i++] = CStringGetTextDatum("wal_removed");
+					break;
+
+				case RS_INVAL_HORIZON:
+					values[i++] = CStringGetTextDatum("rows_removed");
+					break;
+
+				case RS_INVAL_WAL_LEVEL:
+					values[i++] = CStringGetTextDatum("wal_level_insufficient");
+					break;
+			}
 		}
 
 		Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index f70742851c..4361d98103 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -667,13 +667,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
 	 * removed.
 	 */
 	res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, "
-							"%s as caught_up, conflicting as invalid "
+							"%s as caught_up, conflict_reason IS NOT NULL as invalid "
 							"FROM pg_catalog.pg_replication_slots "
 							"WHERE slot_type = 'logical' AND "
 							"database = current_database() AND "
 							"temporary IS FALSE;",
 							live_check ? "FALSE" :
-							"(CASE WHEN conflicting THEN FALSE "
+							"(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
 							"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
 							"END)");
 
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 5b67784731..04f55ec936 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11115,9 +11115,9 @@
   proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
   proretset => 't', provolatile => 's', prorettype => 'record',
   proargtypes => '',
-  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool}',
+  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text}',
   proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
-  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting}',
+  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason}',
   prosrc => 'pg_get_replication_slots' },
 { oid => '3786', descr => 'set up a logical replication slot',
   proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 2e7893ec4e..e52c26a960 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,27 +168,25 @@ sub change_hot_standby_feedback_and_wait_for_xmins
 	}
 }
 
-# Check conflicting status in pg_replication_slots.
-sub check_slots_conflicting_status
+# Check conflict_reason in pg_replication_slots.
+sub check_slots_conflict_reason
 {
-	my ($conflicting) = @_;
+	my ($slot_prefix, $reason) = @_;
 
-	if ($conflicting)
-	{
-		$res = $node_standby->safe_psql(
-			'postgres', qq(
-				 select bool_and(conflicting) from pg_replication_slots;));
+	my $active_slot = $slot_prefix . 'activeslot';
+	my $inactive_slot = $slot_prefix . 'inactiveslot';
 
-		is($res, 't', "Logical slots are reported as conflicting");
-	}
-	else
-	{
-		$res = $node_standby->safe_psql(
-			'postgres', qq(
-				select bool_or(conflicting) from pg_replication_slots;));
+	$res = $node_standby->safe_psql(
+		'postgres', qq(
+			 select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
 
-		is($res, 'f', "Logical slots are reported as non conflicting");
-	}
+	is($res, "$reason", "$active_slot conflict_reason is $reason");
+
+	$res = $node_standby->safe_psql(
+		'postgres', qq(
+			 select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+
+	is($res, "$reason", "$inactive_slot conflict_reason is $reason");
 }
 
 # Drop the slots, re-create them, change hot_standby_feedback,
@@ -260,13 +258,13 @@ $node_primary->safe_psql('testdb',
 	qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
 );
 
-# Check conflicting is NULL for physical slot
+# Check conflict_reason is NULL for physical slot
 $res = $node_primary->safe_psql(
 	'postgres', qq[
-		 SELECT conflicting is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+		 SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
 );
 
-is($res, 't', "Physical slot reports conflicting as NULL");
+is($res, 't', "Physical slot reports conflict_reason as NULL");
 
 my $backup_name = 'b1';
 $node_primary->backup($backup_name);
@@ -483,8 +481,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('vacuum_full_', 'rows_removed');
 
 $handle =
   make_slot_active($node_standby, 'vacuum_full_', 0, \$stdout, \$stderr);
@@ -502,8 +500,8 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
 ##################################################
 $node_standby->restart;
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is retained across a restart.
+check_slots_conflict_reason('vacuum_full_', 'rows_removed');
 
 ##################################################
 # Verify that invalidated logical slots do not lead to retaining WAL.
@@ -511,7 +509,7 @@ check_slots_conflicting_status(1);
 
 # Get the restart_lsn from an invalidated slot
 my $restart_lsn = $node_standby->safe_psql('postgres',
-	"SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflicting is true;"
+	"SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
 );
 
 chomp($restart_lsn);
@@ -565,8 +563,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('row_removal_', 'rows_removed');
 
 $handle =
   make_slot_active($node_standby, 'row_removal_', 0, \$stdout, \$stderr);
@@ -604,8 +602,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 check_for_invalidation('shared_row_removal_', $logstart,
 	'with vacuum on pg_authid');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
 
 $handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
 	\$stderr);
@@ -657,7 +655,13 @@ ok( $node_standby->poll_query_until(
 ) or die "Timed out waiting confl_active_logicalslot to be updated";
 
 # Verify slots are reported as non conflicting in pg_replication_slots
-check_slots_conflicting_status(0);
+is( $node_standby->safe_psql(
+		'postgres',
+		q[select bool_or(conflicting) from
+		  (select conflict_reason is not NULL as conflicting
+		   from pg_replication_slots WHERE slot_type = 'logical')]),
+	'f',
+	'Logical slots are reported as non conflicting');
 
 # Turn hot_standby_feedback back on
 change_hot_standby_feedback_and_wait_for_xmins(1, 0);
@@ -693,8 +697,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('pruning_', 'rows_removed');
 
 $handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
 
@@ -737,8 +741,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
 
 $handle =
   make_slot_active($node_standby, 'wal_level_', 0, \$stdout, \$stderr);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f645e8486b..d878a971df 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,8 +1473,8 @@ pg_replication_slots| SELECT l.slot_name,
     l.wal_status,
     l.safe_wal_size,
     l.two_phase,
-    l.conflicting
-   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting)
+    l.conflict_reason
+   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason)
      LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
 pg_roles| SELECT pg_authid.rolname,
     pg_authid.rolsuper,
-- 
2.39.1

#29Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Amit Kapila (#28)
Re: Track in pg_replication_slots the reason why slots conflict?

Hi,

On Tue, Jan 02, 2024 at 10:35:59AM +0530, Amit Kapila wrote:

On Mon, Jan 1, 2024 at 5:24 PM shveta malik <shveta.malik@gmail.com> wrote:

Please ignore the previous patch and PFA new v4 (v4_2). The only
change from the earlier v4 is the subject correction in commit msg.

Thanks for the patch!

The patch looks good to me. I have slightly changed one of the
descriptions in the docs and also modified the commit message a bit. I
will push this after two days unless there are any more
comments/suggestions.

The patch LGTM; I just have one nit comment:

+           <literal>wal_level_insufficient</literal> means that the
+           <xref linkend="guc-wal-level"/> is insufficient on the primary
+           server.

I'd prefer "primary_wal_level" instead of "wal_level_insufficient". I think it's
better to directly mention it is linked to the primary (without the need to refer
to the documentation) and that the fact that it is "insufficient" is more or less
implicit.

Basically I think that with "primary_wal_level" one would need to refer to the doc
less frequently than with "wal_level_insufficient".

But again, that's a Nit so feel free to ignore.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#30Michael Paquier
michael@paquier.xyz
In reply to: Bertrand Drouvot (#29)
Re: Track in pg_replication_slots the reason why slots conflict?

On Tue, Jan 02, 2024 at 02:07:58PM +0000, Bertrand Drouvot wrote:

+           <literal>wal_level_insufficient</literal> means that the
+           <xref linkend="guc-wal-level"/> is insufficient on the primary
+           server.

I'd prefer "primary_wal_level" instead of "wal_level_insufficient". I think it's
better to directly mention it is linked to the primary (without the need to refer
to the documentation) and that the fact that it is "insufficient" is more or less
implicit.

Basically I think that with "primary_wal_level" one would need to refer to the doc
less frequently than with "wal_level_insufficient".

I can see your point, but wal_level_insufficient speaks a bit more to
me because of its relationship with the GUC setting. Something like
wal_level_insufficient_on_primary may speak better, but that's also
quite long. I'm OK with what the patch does.

+       as invalidated. Possible values are:
+        <itemizedlist spacing="compact">
Higher-level nit: indentation seems to be one space off here.
--
Michael
#31Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#30)
1 attachment(s)
Re: Track in pg_replication_slots the reason why slots conflict?

On Wed, Jan 3, 2024 at 7:10 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Jan 02, 2024 at 02:07:58PM +0000, Bertrand Drouvot wrote:

+           <literal>wal_level_insufficient</literal> means that the
+           <xref linkend="guc-wal-level"/> is insufficient on the primary
+           server.

I'd prefer "primary_wal_level" instead of "wal_level_insufficient". I think it's
better to directly mention it is linked to the primary (without the need to refer
to the documentation) and that the fact that it is "insufficient" is more or less
implicit.

Basically I think that with "primary_wal_level" one would need to refer to the doc
less frequently than with "wal_level_insufficient".

I can see your point, but wal_level_insufficient speaks a bit more to
me because of its relationship with the GUC setting. Something like
wal_level_insufficient_on_primary may speak better, but that's also
quite long. I'm OK with what the patch does.

Thanks, I also prefer "wal_level_insufficient". To me
"primary_wal_level" sounds more along the lines of a GUC name than a
conflict_reason value. The other names that come to mind are
"wal_level_lower_than_required", "wal_level_lower",
"wal_level_lesser_than_required", "wal_level_lesser", but I feel
"wal_level_insufficient" sounds better than these. Having said that, I
am open to any of these or better options for this conflict_reason.

+       as invalidated. Possible values are:
+        <itemizedlist spacing="compact">
Higher-level nit: indentation seems to be one space off here.

Thanks, fixed in the attached patch.

--
With Regards,
Amit Kapila.

Attachments:

v6-0001-Track-conflict_reason-in-pg_replication_slots.patch (application/octet-stream)
From 6628122f3fb2f8ef2f8d539089160ad1170684d8 Mon Sep 17 00:00:00 2001
From: Shveta Malik <shveta.malik@gmail.com>
Date: Wed, 27 Dec 2023 10:22:39 +0530
Subject: [PATCH v6] Track conflict_reason in pg_replication_slots.

This patch changes the existing 'conflicting' field to 'conflict_reason'
in pg_replication_slots. This new field indicates the reason for the
logical slot's conflict with recovery. It is always NULL for physical
slots, as well as for logical slots which are not invalidated. The
non-NULL values indicate that the slot is marked as invalidated. Possible
values are:

wal_removed = required WAL has been removed.
rows_removed = required rows have been removed.
wal_level_insufficient = the primary doesn't have a wal_level sufficient
to perform logical decoding.

The existing users of 'conflicting' column can get the same answer by
using 'conflict_reason' IS NOT NULL.

Author: Shveta Malik
Reviewed-by: Amit Kapila, Bertrand Drouvot, Michael Paquier
Discussion: https://postgr.es/m/ZYOE8IguqTbp-seF@paquier.xyz
---
 doc/src/sgml/system-views.sgml                | 29 +++++++-
 src/backend/catalog/system_views.sql          |  2 +-
 src/backend/replication/slotfuncs.c           | 22 ++++--
 src/bin/pg_upgrade/info.c                     |  4 +-
 src/include/catalog/pg_proc.dat               |  4 +-
 .../t/035_standby_logical_decoding.pl         | 72 ++++++++++---------
 src/test/regress/expected/rules.out           |  4 +-
 7 files changed, 89 insertions(+), 48 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 0ef1745631..72d01fc624 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,11 +2525,34 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>conflicting</structfield> <type>bool</type>
+       <structfield>conflict_reason</structfield> <type>text</type>
       </para>
       <para>
-       True if this logical slot conflicted with recovery (and so is now
-       invalidated). Always NULL for physical slots.
+       The reason for the logical slot's conflict with recovery. It is always
+       NULL for physical slots, as well as for logical slots which are not
+       invalidated. The non-NULL values indicate that the slot is marked
+       as invalidated. Possible values are:
+       <itemizedlist spacing="compact">
+        <listitem>
+         <para>
+          <literal>wal_removed</literal> means that the required WAL has been
+          removed.
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <literal>rows_removed</literal> means that the required rows have
+          been removed.
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <literal>wal_level_insufficient</literal> means that the
+          primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
+          perform logical decoding.
+         </para>
+        </listitem>
+       </itemizedlist>
       </para></entry>
      </row>
     </tbody>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 058fc47c91..dd9ed0958b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1023,7 +1023,7 @@ CREATE VIEW pg_replication_slots AS
             L.wal_status,
             L.safe_wal_size,
             L.two_phase,
-            L.conflicting
+            L.conflict_reason
     FROM pg_get_replication_slots() AS L
             LEFT JOIN pg_database D ON (L.datoid = D.oid);
 
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 4b694a03d0..8e3d27adc8 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -406,10 +406,24 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 			nulls[i++] = true;
 		else
 		{
-			if (slot_contents.data.invalidated != RS_INVAL_NONE)
-				values[i++] = BoolGetDatum(true);
-			else
-				values[i++] = BoolGetDatum(false);
+			switch (slot_contents.data.invalidated)
+			{
+				case RS_INVAL_NONE:
+					nulls[i++] = true;
+					break;
+
+				case RS_INVAL_WAL_REMOVED:
+					values[i++] = CStringGetTextDatum("wal_removed");
+					break;
+
+				case RS_INVAL_HORIZON:
+					values[i++] = CStringGetTextDatum("rows_removed");
+					break;
+
+				case RS_INVAL_WAL_LEVEL:
+					values[i++] = CStringGetTextDatum("wal_level_insufficient");
+					break;
+			}
 		}
 
 		Assert(i == PG_GET_REPLICATION_SLOTS_COLS);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index f70742851c..4361d98103 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -667,13 +667,13 @@ get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check)
 	 * removed.
 	 */
 	res = executeQueryOrDie(conn, "SELECT slot_name, plugin, two_phase, "
-							"%s as caught_up, conflicting as invalid "
+							"%s as caught_up, conflict_reason IS NOT NULL as invalid "
 							"FROM pg_catalog.pg_replication_slots "
 							"WHERE slot_type = 'logical' AND "
 							"database = current_database() AND "
 							"temporary IS FALSE;",
 							live_check ? "FALSE" :
-							"(CASE WHEN conflicting THEN FALSE "
+							"(CASE WHEN conflict_reason IS NOT NULL THEN FALSE "
 							"ELSE (SELECT pg_catalog.binary_upgrade_logical_slot_has_caught_up(slot_name)) "
 							"END)");
 
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 5b67784731..04f55ec936 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -11115,9 +11115,9 @@
   proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
   proretset => 't', provolatile => 's', prorettype => 'record',
   proargtypes => '',
-  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,bool}',
+  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool,text}',
   proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
-  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflicting}',
+  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,two_phase,conflict_reason}',
   prosrc => 'pg_get_replication_slots' },
 { oid => '3786', descr => 'set up a logical replication slot',
   proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 2e7893ec4e..e52c26a960 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -168,27 +168,25 @@ sub change_hot_standby_feedback_and_wait_for_xmins
 	}
 }
 
-# Check conflicting status in pg_replication_slots.
-sub check_slots_conflicting_status
+# Check conflict_reason in pg_replication_slots.
+sub check_slots_conflict_reason
 {
-	my ($conflicting) = @_;
+	my ($slot_prefix, $reason) = @_;
 
-	if ($conflicting)
-	{
-		$res = $node_standby->safe_psql(
-			'postgres', qq(
-				 select bool_and(conflicting) from pg_replication_slots;));
+	my $active_slot = $slot_prefix . 'activeslot';
+	my $inactive_slot = $slot_prefix . 'inactiveslot';
 
-		is($res, 't', "Logical slots are reported as conflicting");
-	}
-	else
-	{
-		$res = $node_standby->safe_psql(
-			'postgres', qq(
-				select bool_or(conflicting) from pg_replication_slots;));
+	$res = $node_standby->safe_psql(
+		'postgres', qq(
+			 select conflict_reason from pg_replication_slots where slot_name = '$active_slot';));
 
-		is($res, 'f', "Logical slots are reported as non conflicting");
-	}
+	is($res, "$reason", "$active_slot conflict_reason is $reason");
+
+	$res = $node_standby->safe_psql(
+		'postgres', qq(
+			 select conflict_reason from pg_replication_slots where slot_name = '$inactive_slot';));
+
+	is($res, "$reason", "$inactive_slot conflict_reason is $reason");
 }
 
 # Drop the slots, re-create them, change hot_standby_feedback,
@@ -260,13 +258,13 @@ $node_primary->safe_psql('testdb',
 	qq[SELECT * FROM pg_create_physical_replication_slot('$primary_slotname');]
 );
 
-# Check conflicting is NULL for physical slot
+# Check conflict_reason is NULL for physical slot
 $res = $node_primary->safe_psql(
 	'postgres', qq[
-		 SELECT conflicting is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
+		 SELECT conflict_reason is null FROM pg_replication_slots where slot_name = '$primary_slotname';]
 );
 
-is($res, 't', "Physical slot reports conflicting as NULL");
+is($res, 't', "Physical slot reports conflict_reason as NULL");
 
 my $backup_name = 'b1';
 $node_primary->backup($backup_name);
@@ -483,8 +481,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('vacuum_full_', 'rows_removed');
 
 $handle =
   make_slot_active($node_standby, 'vacuum_full_', 0, \$stdout, \$stderr);
@@ -502,8 +500,8 @@ change_hot_standby_feedback_and_wait_for_xmins(1, 1);
 ##################################################
 $node_standby->restart;
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is retained across a restart.
+check_slots_conflict_reason('vacuum_full_', 'rows_removed');
 
 ##################################################
 # Verify that invalidated logical slots do not lead to retaining WAL.
@@ -511,7 +509,7 @@ check_slots_conflicting_status(1);
 
 # Get the restart_lsn from an invalidated slot
 my $restart_lsn = $node_standby->safe_psql('postgres',
-	"SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflicting is true;"
+	"SELECT restart_lsn from pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot' and conflict_reason is not null;"
 );
 
 chomp($restart_lsn);
@@ -565,8 +563,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('row_removal_', 'rows_removed');
 
 $handle =
   make_slot_active($node_standby, 'row_removal_', 0, \$stdout, \$stderr);
@@ -604,8 +602,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 check_for_invalidation('shared_row_removal_', $logstart,
 	'with vacuum on pg_authid');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
 
 $handle = make_slot_active($node_standby, 'shared_row_removal_', 0, \$stdout,
 	\$stderr);
@@ -657,7 +655,13 @@ ok( $node_standby->poll_query_until(
 ) or die "Timed out waiting confl_active_logicalslot to be updated";
 
 # Verify slots are reported as non conflicting in pg_replication_slots
-check_slots_conflicting_status(0);
+is( $node_standby->safe_psql(
+		'postgres',
+		q[select bool_or(conflicting) from
+		  (select conflict_reason is not NULL as conflicting
+		   from pg_replication_slots WHERE slot_type = 'logical')]),
+	'f',
+	'Logical slots are reported as non conflicting');
 
 # Turn hot_standby_feedback back on
 change_hot_standby_feedback_and_wait_for_xmins(1, 0);
@@ -693,8 +697,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'rows_removed' in pg_replication_slots
+check_slots_conflict_reason('pruning_', 'rows_removed');
 
 $handle = make_slot_active($node_standby, 'pruning_', 0, \$stdout, \$stderr);
 
@@ -737,8 +741,8 @@ $node_primary->wait_for_replay_catchup($node_standby);
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
 
-# Verify slots are reported as conflicting in pg_replication_slots
-check_slots_conflicting_status(1);
+# Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
+check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
 
 $handle =
   make_slot_active($node_standby, 'wal_level_', 0, \$stdout, \$stderr);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f645e8486b..d878a971df 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1473,8 +1473,8 @@ pg_replication_slots| SELECT l.slot_name,
     l.wal_status,
     l.safe_wal_size,
     l.two_phase,
-    l.conflicting
-   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflicting)
+    l.conflict_reason
+   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, safe_wal_size, two_phase, conflict_reason)
      LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
 pg_roles| SELECT pg_authid.rolname,
     pg_authid.rolsuper,
-- 
2.39.1

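For anyone trying the patch, a quick way to see the new column on a standby is a query along these lines (a sketch based on the behavior described in the commit message; slot names and any output are illustrative, not taken from an actual run):

```sql
-- On the standby, after recovery conflicts have invalidated logical slots:
SELECT slot_name, slot_type, conflict_reason
FROM pg_replication_slots;

-- Physical slots and non-invalidated logical slots report NULL;
-- invalidated logical slots report one of:
--   wal_removed, rows_removed, wal_level_insufficient.

-- The old boolean 'conflicting' semantics are recovered with:
SELECT slot_name, conflict_reason IS NOT NULL AS conflicting
FROM pg_replication_slots
WHERE slot_type = 'logical';
```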
#32Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Amit Kapila (#31)
Re: Track in pg_replication_slots the reason why slots conflict?

Hi,

On Wed, Jan 03, 2024 at 08:53:44AM +0530, Amit Kapila wrote:

On Wed, Jan 3, 2024 at 7:10 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Jan 02, 2024 at 02:07:58PM +0000, Bertrand Drouvot wrote:

+           <literal>wal_level_insufficient</literal> means that the
+           <xref linkend="guc-wal-level"/> is insufficient on the primary
+           server.

I'd prefer "primary_wal_level" instead of "wal_level_insufficient". I think it's
better to directly mention it is linked to the primary (without the need to refer
to the documentation) and that the fact that it is "insufficient" is more or less
implicit.

Basically I think that with "primary_wal_level" one would need to refer to the doc
less frequently than with "wal_level_insufficient".

I can see your point, but wal_level_insufficient speaks a bit more to
me because of its relationship with the GUC setting. Something like
wal_level_insufficient_on_primary may speak better, but that's also
quite long. I'm OK with what the patch does.

Thanks, I also prefer "wal_level_insufficient". To me
"primary_wal_level" sounds more along the lines of a GUC name than the
conflict_reason. The other names that come to mind are
"wal_level_lower_than_required", "wal_level_lower",
"wal_level_lesser_than_required", "wal_level_lesser" but I feel
"wal_level_insufficient" sounds better than these. Having said that, I
am open to any of these or better options for this conflict_reason.

Thank you both for sharing your thoughts on it. I take your points, and I'm OK with
"wal_level_insufficient".

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#33vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#31)
Re: Track in pg_replication_slots the reason why slots conflict?

On Wed, 3 Jan 2024 at 08:54, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jan 3, 2024 at 7:10 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Jan 02, 2024 at 02:07:58PM +0000, Bertrand Drouvot wrote:

+           <literal>wal_level_insufficient</literal> means that the
+           <xref linkend="guc-wal-level"/> is insufficient on the primary
+           server.

I'd prefer "primary_wal_level" instead of "wal_level_insufficient". I think it's
better to directly mention it is linked to the primary (without the need to refer
to the documentation) and that the fact that it is "insufficient" is more or less
implicit.

Basically I think that with "primary_wal_level" one would need to refer to the doc
less frequently than with "wal_level_insufficient".

I can see your point, but wal_level_insufficient speaks a bit more to
me because of its relationship with the GUC setting. Something like
wal_level_insufficient_on_primary may speak better, but that's also
quite long. I'm OK with what the patch does.

Thanks, I also prefer "wal_level_insufficient". To me
"primary_wal_level" sounds more along the lines of a GUC name than the
conflict_reason. The other names that come to mind are
"wal_level_lower_than_required", "wal_level_lower",
"wal_level_lesser_than_required", "wal_level_lesser" but I feel
"wal_level_insufficient" sounds better than these. Having said that, I
am open to any of these or better options for this conflict_reason.

+       as invalidated. Possible values are:
+        <itemizedlist spacing="compact">
Higher-level nit: indentation seems to be one space off here.

Thanks, fixed in the attached patch.

I have marked the commitfest entry as committed, since the patch has
been committed.

Regards,
Vignesh