BUG #14228: replication slot catalog_xmin not cleared on slot reuse
The following bug has been logged on the website:
Bug reference: 14228
Logged by: Andrew Gierth
Email address: andrew@tao11.riddles.org.uk
PostgreSQL version: 9.6beta2
Operating system: any
Description:
[My analysis of a bug reported on IRC; please CC original reporter: md at
chewy dot com]
Reported on pg 9.4.6 but also reproduced on 9.6beta2
When creating a physical replication slot, the catalog_xmin field of the new
slot is not initialized. If the slot storage had previously been used for a
logical slot, the old catalog_xmin will remain in place and interfere with
vacuum.
Trivially reproducible:
select pg_create_logical_replication_slot('test','test_decoding');
select pg_drop_replication_slot('test');
select pg_create_physical_replication_slot('test');
select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | active | active_pid |
xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
-----------+--------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
test | | physical | | | f | |
| 546 | | 0/1525438
(1 row)
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Wed, Jul 6, 2016 at 2:35 AM, <andrew@tao11.riddles.org.uk> wrote:
When creating a physical replication slot, the catalog_xmin field of the new
slot is not initialized. If the slot storage had previously been used for a
logical slot, the old catalog_xmin will remain in place and interfere with
vacuum.
Good catch! The same applies to confirmed_flush_lsn, which is used
only by logical decoding and should remain as NULL for physical slots.
So I propose the patch attached to address both problems.
--
Michael
Attachments:
fix-repslot-init.patchtext/x-diff; charset=US-ASCII; name=fix-repslot-init.patchDownload+2-0
"Michael" == Michael Paquier <michael.paquier@gmail.com> writes:
When creating a physical replication slot, the catalog_xmin field of
the new slot is not initialized. If the slot storage had previously
been used for a logical slot, the old catalog_xmin will remain in
place and interfere with vacuum.
Michael> Good catch! The same applies to confirmed_flush_lsn, which is
Michael> used only by logical decoding and should remain as NULL for
Michael> physical slots. So I propose the patch attached to address
Michael> both problems.
What about slot->effective_catalog_xmin ?
--
Andrew (irc:RhodiumToad)
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
_
On Wed, Jul 6, 2016 at 12:56 PM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
"Michael" == Michael Paquier <michael.paquier@gmail.com> writes:
When creating a physical replication slot, the catalog_xmin field of
the new slot is not initialized. If the slot storage had previously
been used for a logical slot, the old catalog_xmin will remain in
place and interfere with vacuum.Michael> Good catch! The same applies to confirmed_flush_lsn, which is
Michael> used only by logical decoding and should remain as NULL for
Michael> physical slots. So I propose the patch attached to address
Michael> both problems.What about slot->effective_catalog_xmin ?
Yes. I guess so, as well as the other candidate_* fields in the slot
to begin from a clean state.
--
Michael
Attachments:
fix-repslot-init-v2.patchapplication/x-patch; name=fix-repslot-init-v2.patchDownload+7-0
* Michael Paquier (michael.paquier@gmail.com) wrote:
On Wed, Jul 6, 2016 at 12:56 PM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:"Michael" == Michael Paquier <michael.paquier@gmail.com> writes:
When creating a physical replication slot, the catalog_xmin field of
the new slot is not initialized. If the slot storage had previously
been used for a logical slot, the old catalog_xmin will remain in
place and interfere with vacuum.Michael> Good catch! The same applies to confirmed_flush_lsn, which is
Michael> used only by logical decoding and should remain as NULL for
Michael> physical slots. So I propose the patch attached to address
Michael> both problems.What about slot->effective_catalog_xmin ?
Yes. I guess so, as well as the other candidate_* fields in the slot
to begin from a clean state.
Seems like we should try to get this in before the next round of point
releases...?
Thanks!
Stephen
On Tue, Jul 26, 2016 at 12:25 AM, Stephen Frost <sfrost@snowman.net> wrote:
* Michael Paquier (michael.paquier@gmail.com) wrote:
On Wed, Jul 6, 2016 at 12:56 PM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:"Michael" == Michael Paquier <michael.paquier@gmail.com> writes:
When creating a physical replication slot, the catalog_xmin field of
the new slot is not initialized. If the slot storage had previously
been used for a logical slot, the old catalog_xmin will remain in
place and interfere with vacuum.Michael> Good catch! The same applies to confirmed_flush_lsn, which is
Michael> used only by logical decoding and should remain as NULL for
Michael> physical slots. So I propose the patch attached to address
Michael> both problems.What about slot->effective_catalog_xmin ?
Yes. I guess so, as well as the other candidate_* fields in the slot
to begin from a clean state.Seems like we should try to get this in before the next round of point
releases...?
That would be nice, I would guess that Andres or Robert (added in CC)
are the best fits to commit this patch, even if this is just a
variable initialization issue.
--
Michael
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On 2016-07-06 13:07:36 +0900, Michael Paquier wrote:
_
On Wed, Jul 6, 2016 at 12:56 PM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:"Michael" == Michael Paquier <michael.paquier@gmail.com> writes:
When creating a physical replication slot, the catalog_xmin field of
the new slot is not initialized. If the slot storage had previously
been used for a logical slot, the old catalog_xmin will remain in
place and interfere with vacuum.Michael> Good catch! The same applies to confirmed_flush_lsn, which is
Michael> used only by logical decoding and should remain as NULL for
Michael> physical slots. So I propose the patch attached to address
Michael> both problems.What about slot->effective_catalog_xmin ?
Yes. I guess so, as well as the other candidate_* fields in the slot
to begin from a clean state.
I think it'd be better if we explicitly zeroed .data - that way the
likelihood of future bugs of the same ilk is smaller.
Greetings,
Andres Freund
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Thu, Jul 28, 2016 at 9:24 AM, Andres Freund <andres@anarazel.de> wrote:
I think it'd be better if we explicitly zeroed .data - that way the
likelihood of future bugs of the same ilk is smaller.
Okay, I have spent some time looking at all the fields here, and their
significance before reaching this code path in ReplicationSlotCreate,
but did not find any hole if slot->data is zeroed. So here is an
updated patch. You could get rid of all the field initializations I
have done for slot->data, but I think that's cheap to keep them.
--
Michael
Attachments:
fix-repslot-init-v3.patchapplication/x-patch; name=fix-repslot-init-v3.patchDownload+8-0
"Stephen" == Stephen Frost <sfrost@snowman.net> writes:
Stephen> Seems like we should try to get this in before the next round
Stephen> of point releases...?
I notice that this didn't happen....
--
Andrew (irc:RhodiumToad)
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Tue, Aug 9, 2016 at 10:30 PM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
"Stephen" == Stephen Frost <sfrost@snowman.net> writes:
Stephen> Seems like we should try to get this in before the next round
Stephen> of point releases...?I notice that this didn't happen....
At this point I have added it to the next CF as a bug fix:
https://commitfest.postgresql.org/10/705/
--
Michael
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On 2016-08-09 14:30:58 +0100, Andrew Gierth wrote:
"Stephen" == Stephen Frost <sfrost@snowman.net> writes:
Stephen> Seems like we should try to get this in before the next round
Stephen> of point releases...?I notice that this didn't happen....
I'll piuck it up when I'm back from holidays (Tuesday/Wednesday). Sorry
for the delay.
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Mon, Aug 15, 2016 at 10:03 AM, Andres Freund <andres@anarazel.de> wrote:
On 2016-08-09 14:30:58 +0100, Andrew Gierth wrote:
"Stephen" == Stephen Frost <sfrost@snowman.net> writes:
Stephen> Seems like we should try to get this in before the next round
Stephen> of point releases...?I notice that this didn't happen....
I'll piuck it up when I'm back from holidays (Tuesday/Wednesday). Sorry
for the delay.
Thanks!
--
Michael
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On 2016-07-28 14:22:29 +0900, Michael Paquier wrote:
On Thu, Jul 28, 2016 at 9:24 AM, Andres Freund <andres@anarazel.de> wrote:
I think it'd be better if we explicitly zeroed .data - that way the
likelihood of future bugs of the same ilk is smaller.Okay, I have spent some time looking at all the fields here, and their
significance before reaching this code path in ReplicationSlotCreate,
but did not find any hole if slot->data is zeroed. So here is an
updated patch. You could get rid of all the field initializations I
have done for slot->data, but I think that's cheap to keep them.
Pushed without the additional initializations - they're imo more
confusing than helpful - and with some more reordering to match the
struct order.
Regards,
Andres
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs