pg_replication_slot_advance xmin handling when active slot becomes inactive

Started by Dimitri Fontaine over 4 years ago, 2 messages, bugs
#1 Dimitri Fontaine
Dimitri.Fontaine@microsoft.com

Hi folks,

I believe we have found another bug in Postgres when using pg_auto_failover. The details can be seen at https://github.com/citusdata/pg_auto_failover/issues/814 ; and the Postgres warning message to consider is the following:

WARNING: oldest xmin is far in the past

When a replication slot switches from active to inactive, whatever xmin value is registered on the replication slot is then kept.

It seems to me that we should either document that a replication slot that has been active (used in streaming replication) can not be maintained through calls to pg_replication_slot_advance later; or better yet that this should be made to work, somehow.
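
To make the reported behaviour concrete, here is a minimal sketch in Python. It is a toy model, not PostgreSQL code: `ToySlot`, `feedback`, `disconnect` and `advance` are hypothetical names invented for illustration, and the assumption that advancing a physical slot moves only the LSN and leaves xmin untouched is taken from the report above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToySlot:
    """Hypothetical stand-in for a physical replication slot."""
    restart_lsn: int
    xmin: Optional[int] = None  # recorded from standby feedback while streaming
    active: bool = False


def feedback(slot: ToySlot, standby_xmin: int) -> None:
    """A connected standby reports its oldest xmin; the slot records it."""
    slot.active = True
    slot.xmin = standby_xmin


def disconnect(slot: ToySlot) -> None:
    """The standby goes away; the slot turns inactive but keeps its xmin."""
    slot.active = False


def advance(slot: ToySlot, moveto: int) -> None:
    """Assumed behaviour: the advance moves the LSN forward and ignores xmin."""
    slot.restart_lsn = max(slot.restart_lsn, moveto)


slot = ToySlot(restart_lsn=100)
feedback(slot, standby_xmin=5000)
disconnect(slot)
advance(slot, moveto=900)
print(slot)  # restart_lsn has moved, xmin is still the value from streaming
```

In this model the slot keeps holding back the xmin horizon at the last value the standby reported, no matter how far the LSN is advanced afterwards, which is the situation that leads to the warning quoted above.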

Regards,
--
Dimitri Fontaine
PostgreSQL Major Contributor, Citus Data, Microsoft
Author of “The Art of PostgreSQL” <https://theartofpostgresql.com/>

#2 Andres Freund
andres@anarazel.de
In reply to: Dimitri Fontaine (#1)
Re: pg_replication_slot_advance xmin handling when active slot becomes inactive

Hi,

On 2021-10-06 08:22:08 +0000, Dimitri Fontaine wrote:

> I believe we have found another bug in Postgres when using pg_auto_failover. The details can be seen at https://github.com/citusdata/pg_auto_failover/issues/814 ; and the Postgres warning message to consider is the following:
>
> WARNING: oldest xmin is far in the past
>
> When a replication slot switches from active to inactive, whatever xmin
> value is registered on the replication slot is then kept.

That's required - otherwise the slot would, e.g., stop retaining the xmin
from hot_standby_feedback whenever the replication connection breaks.

> It seems to me that we should either document that a replication slot that
> has been active (used in streaming replication) can not be maintained
> through calls to pg_replication_slot_advance later; or better yet that this
> should be made to work, somehow.

You encountered this on a physical slot, by the sound of this? For a logical
slot we cannot just safely change xmin, but
pg_logical_replication_slot_advance() should update it.

I wonder if we optionally should do something similar in
pg_physical_replication_slot_advance(). I.e. read the WAL between the current
position and the "moveto" LSN, and see what xmin should be updated to. If we
see WAL records that would cause conflicts for an older xmin, we can update
xmin to that.
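
A rough sketch of that idea, as a toy Python model rather than anything resembling the actual implementation. `ToyWalRecord`, its `conflicts_below` field and `xmin_after_advance` are hypothetical names; xids are treated as plain integers, so wraparound is ignored, and real WAL records do not look like this.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ToyWalRecord:
    lsn: int
    # Hypothetical field: snapshots with an xmin below this value would
    # conflict with the record. None means the record causes no conflict.
    conflicts_below: Optional[int] = None


def xmin_after_advance(
    wal: Iterable[ToyWalRecord],
    current_lsn: int,
    moveto: int,
    xmin: Optional[int],
) -> Optional[int]:
    """Scan records in (current_lsn, moveto] and move xmin past any conflicts."""
    if xmin is None:
        return None  # the slot holds no xmin, so there is nothing to update
    new_xmin = xmin
    for rec in wal:
        in_range = current_lsn < rec.lsn <= moveto
        if in_range and rec.conflicts_below is not None:
            # An older xmin would already conflict with this record, so
            # holding it back any further protects nothing.
            new_xmin = max(new_xmin, rec.conflicts_below)
    return new_xmin


wal = [
    ToyWalRecord(lsn=200),
    ToyWalRecord(lsn=300, conflicts_below=5200),
    ToyWalRecord(lsn=400, conflicts_below=5100),
    ToyWalRecord(lsn=950, conflicts_below=9000),  # beyond moveto, ignored
]
result = xmin_after_advance(wal, current_lsn=100, moveto=900, xmin=5000)
print(result)
```

The sketch only ever moves xmin forward, and only as far as the records inside the advanced range justify; records past the "moveto" LSN do not influence the result.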

Greetings,

Andres Freund