spurious(?) warnings in archive recovery

Started by Andrew Gierthover 7 years ago4 messageshackers

andrew@tao11.riddles.org.uk

over 7 years ago

So while investigating a case of this warning (in
UpdateMinRecoveryPoint):

"xlog min recovery request %X/%X is past current point %X/%X"

I noticed that it is issued even in cases where we know that
minRecoveryPoint is not yet valid, for example because we're waiting to
see XLOG_BACKUP_END before declaring consistency.

But, you'd think, you shouldn't get this error because any page we
modify during recovery should have been restored from an FPI with a
suitably early LSN? For data pages that is correct, but not for VM or
(iff wal_log_hints or checksums are enabled) FSM pages.

When we replay an operation that, for example, clears a bit in the VM,
the redo code will read in that VM page from disk, and because we're not
yet consistent and because _clearing_ a VM bit is not in itself
wal-logged and doesn't result in any FPI being generated for the VM
page, it could well read a VM page that has a far-future LSN from the
point of view of replay, and dirty it, causing a later eviction to try
and do UpdateMinRecoveryPoint with that future LSN.

(I haven't investigated this aspect, but there also appears to be no
protection against torn pages in the VM when checksums are enabled? am I
missing something somewhere?)

I'm less clear on the exact mechanisms, but when wal_log_hints (or
checksums) is on, FSM pages also get LSNs, sometimes, thanks to
MarkBufferDirtyHint, and at least some code paths can also do
MarkBufferDirty on FSM pages during recovery, which would cause their
eviction with possible future LSNs as with VM pages.

This means that if you simply do an old-style base backup using
pg_start_backup/rsync/pg_stop_backup (on a sufficiently active system
and taking long enough) and then recover from it, you're likely to get a
log spammed with these errors for no very good reason.

So it seems to me that issuing this error is a bug if the conditions
described are actually harmless, while if they're not harmless, then
obviously that is a bug. So _something_ needs fixing here, but I'm not
yet sufficiently confident of my analysis to say what.

Opinions?

(as a further point, it seems to me that backupEndRequired is a rather
misleadingly named variable, since what _actually_ determines whether an
XLOG_BACKUP_END record is expected is whether backupStartPoint is set.
backupEndRequired seems to change one error message and, questionably,
one decision about whether to do crash recovery before entering archive
recovery, but nothing else.)

--
Andrew (irc:RhodiumToad)

Vik Fearing

vik@postgresfriends.org

over 7 years ago

In reply to: Andrew Gierth (#1)

Re: spurious(?) warnings in archive recovery

On 13/11/2018 16:34, Andrew Gierth wrote:

So while investigating a case of this warning (in
UpdateMinRecoveryPoint):

"xlog min recovery request %X/%X is past current point %X/%X"

I noticed that it is issued even in cases where we know that
minRecoveryPoint is not yet valid, for example because we're waiting to
see XLOG_BACKUP_END before declaring consistency.

But, you'd think, you shouldn't get this error because any page we
modify during recovery should have been restored from an FPI with a
suitably early LSN? For data pages that is correct, but not for VM or
(iff wal_log_hints or checksums are enabled) FSM pages.

When we replay an operation that, for example, clears a bit in the VM,
the redo code will read in that VM page from disk, and because we're not
yet consistent and because _clearing_ a VM bit is not in itself
wal-logged and doesn't result in any FPI being generated for the VM
page, it could well read a VM page that has a far-future LSN from the
point of view of replay, and dirty it, causing a later eviction to try
and do UpdateMinRecoveryPoint with that future LSN.

(I haven't investigated this aspect, but there also appears to be no
protection against torn pages in the VM when checksums are enabled? am I
missing something somewhere?)

I'm less clear on the exact mechanisms, but when wal_log_hints (or
checksums) is on, FSM pages also get LSNs, sometimes, thanks to
MarkBufferDirtyHint, and at least some code paths can also do
MarkBufferDirty on FSM pages during recovery, which would cause their
eviction with possible future LSNs as with VM pages.

This means that if you simply do an old-style base backup using
pg_start_backup/rsync/pg_stop_backup (on a sufficiently active system
and taking long enough) and then recover from it, you're likely to get a
log spammed with these errors for no very good reason.

So it seems to me that issuing this error is a bug if the conditions
described are actually harmless, while if they're not harmless, then
obviously that is a bug. So _something_ needs fixing here, but I'm not
yet sufficiently confident of my analysis to say what.

Opinions?

(as a further point, it seems to me that backupEndRequired is a rather
misleadingly named variable, since what _actually_ determines whether an
XLOG_BACKUP_END record is expected is whether backupStartPoint is set.
backupEndRequired seems to change one error message and, questionably,
one decision about whether to do crash recovery before entering archive
recovery, but nothing else.)

Bump.

I was the originator of this report. I work with Postgres every single
day and I was spooked by these warnings. People with much less
involvement would probably be terrified.
--
Vik Fearing +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Andrew Gierth

andrew@tao11.riddles.org.uk

over 7 years ago

In reply to: Vik Fearing (#2)

Re: spurious(?) warnings in archive recovery

Here is a script that reproduces the issue conveniently.

This is targetted for pg11, might also work on pg10 (uses declarative
partitioning as a convenient way to spread traffic over many tables).

What it does is:

- spin up a new server in ./data1
- populate some data
- start a base backup into ./data2 (but no copying files yet)
- generate a bunch of activity
- actually copy the files for the backup
- finish the backup
- spin up the recovery server in ./data2

(there's some dead code in the script for testing torn-page cases, not
used at present and can be ignored)

Looking at the ./postgres2.log file created for the server in ./data2,
what I see is a ton of this:

LOG: redo starts at 0/15000028
WARNING: xlog min recovery request 0/41420F58 is past current point 0/151D1248
CONTEXT: writing block 2 of relation base/12719/16591_fsm
WAL redo at 0/151D01F0 for Heap/HOT_UPDATE: off 76 xmax 772 ; new off 79 xmax 0
WARNING: xlog min recovery request 0/458157D8 is past current point 0/15392030
CONTEXT: writing block 0 of relation base/12719/16387_vm
WAL redo at 0/15391FD0 for Heap/INSERT: off 14
WARNING: xlog min recovery request 0/42B23FA0 is past current point 0/154A8408
CONTEXT: writing block 2 of relation base/12719/16459_fsm
WAL redo at 0/154A7350 for Heap/HOT_UPDATE: off 43 xmax 776 ; new off 79 xmax 0
WARNING: xlog min recovery request 0/41406A60 is past current point 0/155CD2D0
CONTEXT: writing block 2 of relation base/12719/16393_fsm
WAL redo at 0/155CC230 for Heap/HOT_UPDATE: off 27 xmax 778 ; new off 79 xmax 0
[repeat some dozens of times]

The noise is worst with wal_log_hints=on (as given in the script);
turning that off doesn't solve the issue, there are still warnings given
for _vm pages.

--
Andrew (irc:RhodiumToad)

Andrew Gierth

andrew@tao11.riddles.org.uk

over 7 years ago

In reply to: Andrew Gierth (#3)

Re: spurious(?) warnings in archive recovery

"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

Andrew> Here is a script that reproduces the issue conveniently.
Andrew> This is targetted for pg11, might also work on pg10

Nope, forgot I'd used non-atomic DO, so pg11 or later only.
It could be rewritten easily enough to avoid that.

--
Andrew (irc:RhodiumToad)

spurious(?) warnings in archive recovery

Attachments: