BUG #16894: PANIC: WAL contains references to invalid pages

Started by PG Bug reporting form · about 5 years ago · 11 messages · bugs
#1 PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 16894
Logged by: Anton
Email address: antkurochkin@gmail.com
PostgreSQL version: 9.6.20
Operating system: Oracle Linux Server 7.6
Description:

Good day,

We got an error when doing a PITR restore of individual databases from a
backup using pgbackrest.

We opened an issue in the pgbackrest repository:
https://github.com/pgbackrest/pgbackrest/issues/796
It would be great if you could look at this problem. All the information is
in the issue.

The pgBackRest developers suggested bringing this information to the
Postgres folks.

It turned out that the problem is replaying WAL on databases that are
restored as zeroed files.
The database files from the backup are restored successfully, but an error
occurs when PG starts and replays WAL:

```
< 2021-02-21 10:34:57.950 UTC > LOG: starting archive recovery
< 2021-02-21 10:34:58.137 UTC > LOG: restored log file
"000000010000007E0000007D" from archive
< 2021-02-21 10:34:58.151 UTC > LOG: redo starts at 7E/7D0090A8
< 2021-02-21 10:34:58.521 UTC > LOG: restored log file
"000000010000007E0000007E" from archive
< 2021-02-21 10:34:58.711 UTC > LOG: consistent recovery state reached at
7E/7EF5E478
< 2021-02-21 10:34:58.915 UTC > LOG: restored log file
"000000010000007E0000007F" from archive
< 2021-02-21 10:34:59.415 UTC > LOG: restored log file
"000000010000007E00000080" from archive
< 2021-02-21 10:35:01.521 UTC > LOG: restored log file
"000000010000007E00000081" from archive
< 2021-02-21 10:35:02.110 UTC > LOG: restored log file
"000000010000007E00000082" from archive
< 2021-02-21 10:35:02.577 UTC > LOG: restored log file
"000000010000007E00000083" from archive
< 2021-02-21 10:35:03.045 UTC > LOG: restored log file
"000000010000007E00000084" from archive
< 2021-02-21 10:35:03.544 UTC > LOG: restored log file
"000000010000007E00000085" from archive
< 2021-02-21 10:35:03.985 UTC > LOG: restored log file
"000000010000007E00000086" from archive
< 2021-02-21 10:35:05.394 UTC > LOG: restored log file
"000000010000007E00000087" from archive
< 2021-02-21 10:35:05.945 UTC > LOG: restored log file
"000000010000007E00000088" from archive
< 2021-02-21 10:35:06.623 UTC > LOG: restored log file
"000000010000007E00000089" from archive
< 2021-02-21 10:35:07.264 UTC > LOG: restored log file
"000000010000007E0000008A" from archive
< 2021-02-21 10:35:07.831 UTC > LOG: restored log file
"000000010000007E0000008B" from archive
< 2021-02-21 10:35:08.318 UTC > LOG: restored log file
"000000010000007E0000008C" from archive
< 2021-02-21 10:35:08.892 UTC > LOG: restored log file
"000000010000007E0000008D" from archive
< 2021-02-21 10:35:09.050 UTC > WARNING: page 22659 of relation
base/17221/17557 is uninitialized
< 2021-02-21 10:35:09.050 UTC > CONTEXT: xlog redo at 7E/8D9C04C0 for
Heap2/VISIBLE: cutoff xid 810424291 flags 1
< 2021-02-21 10:35:09.050 UTC > PANIC: WAL contains references to invalid
pages
< 2021-02-21 10:35:09.050 UTC > CONTEXT: xlog redo at 7E/8D9C04C0 for
Heap2/VISIBLE: cutoff xid 810424291 flags 1
< 2021-02-21 10:35:09.675 UTC > LOG: startup process (PID 9) was terminated
by signal 6: Aborted
```

In the WAL there is only one record for `page 22659`:

```
rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn:
7E/8D9C04C0, prev 7E/8D9BE508, desc: VISIBLE cutoff xid 810424291 flags 1,
blkref #0: rel 1663/17221/17557 fork vm blk 0, blkref #1: rel
1663/17221/17557 blk 22659
```

We tried to find any other information in the WALs for `xid 810424291` and
`blk 22659` and found nothing.
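
(For anyone retracing this search, something like the following works; only a
sketch, using the segment names from the log above, and the pg_xlog path is an
assumption for a default 9.6 layout. On 9.6 the tool is pg_xlogdump; from
PostgreSQL 10 on it is pg_waldump.)

```
# Dump every record in the restored segment range and look for any other
# reference to the block or the xid in question.
cd /var/lib/pgsql/9.6/data/pg_xlog
pg_xlogdump 000000010000007E0000007D 000000010000007E0000008D | grep 'blk 22659'
pg_xlogdump 000000010000007E0000007D 000000010000007E0000008D | grep '810424291'
```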

A few records before 7E/8D9C04C0, from the last checkpoint:

```
rmgr: XLOG len (rec/tot): 106/ 106, tx: 0, lsn:
7E/8D99C4B0, prev 7E/8D99C478, desc: CHECKPOINT_ONLINE redo 7E/8D90D0D8; tli
1; prev tli 1; fpw true; xid 0:810982162; oid 476718; multi 1; offset 0;
oldest xid 628489828 in DB 16386; oldest multi 1 in DB 17221; oldest/newest
commit timestamp xid: 0/0; oldest running xid 810982162; online
rmgr: Heap2 len (rec/tot): 57/ 7965, tx: 0, lsn:
7E/8D99C520, prev 7E/8D99C4B0, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 3, blkref #0: rel 1663/17221/17557 blk 22557 FPW
rmgr: Heap2 len (rec/tot): 57/ 8213, tx: 0, lsn:
7E/8D99E458, prev 7E/8D99C520, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 1, blkref #0: rel 1663/17221/17557 blk 22631 FPW
rmgr: Heap2 len (rec/tot): 57/ 8229, tx: 0, lsn:
7E/8D9A0488, prev 7E/8D99E458, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 3, blkref #0: rel 1663/17221/17557 blk 22632 FPW
rmgr: Heap2 len (rec/tot): 57/ 8117, tx: 0, lsn:
7E/8D9A24C8, prev 7E/8D9A0488, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 3, blkref #0: rel 1663/17221/17557 blk 22633 FPW
rmgr: Heap2 len (rec/tot): 57/ 8093, tx: 0, lsn:
7E/8D9A4498, prev 7E/8D9A24C8, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 1, blkref #0: rel 1663/17221/17557 blk 22634 FPW
rmgr: Heap2 len (rec/tot): 64/ 8256, tx: 0, lsn:
7E/8D9A6450, prev 7E/8D9A4498, desc: VISIBLE cutoff xid 0 flags 2, blkref
#0: rel 1663/17221/17557 fork vm blk 0 FPW, blkref #1: rel
1663/17221/17557 blk 22634
rmgr: Heap2 len (rec/tot): 57/ 8101, tx: 0, lsn:
7E/8D9A84A8, prev 7E/8D9A6450, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 4, blkref #0: rel 1663/17221/17557 blk 22639 FPW
rmgr: Heap2 len (rec/tot): 57/ 8149, tx: 0, lsn:
7E/8D9AA468, prev 7E/8D9A84A8, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 1, blkref #0: rel 1663/17221/17557 blk 22640 FPW
rmgr: Heap2 len (rec/tot): 57/ 8213, tx: 0, lsn:
7E/8D9AC458, prev 7E/8D9AA468, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 1, blkref #0: rel 1663/17221/17557 blk 22641 FPW
rmgr: Heap2 len (rec/tot): 57/ 8093, tx: 0, lsn:
7E/8D9AE488, prev 7E/8D9AC458, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 4, blkref #0: rel 1663/17221/17557 blk 22642 FPW
rmgr: Heap2 len (rec/tot): 57/ 8141, tx: 0, lsn:
7E/8D9B0440, prev 7E/8D9AE488, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 2, blkref #0: rel 1663/17221/17557 blk 22643 FPW
rmgr: Heap2 len (rec/tot): 57/ 8125, tx: 0, lsn:
7E/8D9B2428, prev 7E/8D9B0440, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 1, blkref #0: rel 1663/17221/17557 blk 22646 FPW
rmgr: Heap2 len (rec/tot): 57/ 8245, tx: 0, lsn:
7E/8D9B4400, prev 7E/8D9B2428, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 3, blkref #0: rel 1663/17221/17557 blk 22648 FPW
rmgr: Heap2 len (rec/tot): 57/ 8237, tx: 0, lsn:
7E/8D9B6450, prev 7E/8D9B4400, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 1, blkref #0: rel 1663/17221/17557 blk 22649 FPW
rmgr: Heap2 len (rec/tot): 57/ 8221, tx: 0, lsn:
7E/8D9B8498, prev 7E/8D9B6450, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 3, blkref #0: rel 1663/17221/17557 blk 22652 FPW
rmgr: Heap2 len (rec/tot): 57/ 8229, tx: 0, lsn:
7E/8D9BA4D0, prev 7E/8D9B8498, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 3, blkref #0: rel 1663/17221/17557 blk 22653 FPW
rmgr: Heap2 len (rec/tot): 57/ 8157, tx: 0, lsn:
7E/8D9BC510, prev 7E/8D9BA4D0, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 3, blkref #0: rel 1663/17221/17557 blk 22655 FPW
rmgr: Heap2 len (rec/tot): 57/ 8093, tx: 0, lsn:
7E/8D9BE508, prev 7E/8D9BC510, desc: FREEZE_PAGE cutoff xid 760982701
ntuples 1, blkref #0: rel 1663/17221/17557 blk 22657 FPW
rmgr: Heap2 len (rec/tot): 59/ 59, tx: 0, lsn:
7E/8D9C04C0, prev 7E/8D9BE508, desc: VISIBLE cutoff xid 810424291 flags 1,
blkref #0: rel 1663/17221/17557 fork vm blk 0, blkref #1: rel
1663/17221/17557 blk 22659
```

File `base/17221/17557` is a file for the DB which was restored as zero
files.
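
(A quick way to confirm that such a file really is all zeros is to count its
non-zero bytes; a sketch, run from the data directory, with the path taken
from above:)

```
# Prints the number of non-zero bytes in the restored relation file;
# an output of 0 means the file is entirely zeros.
tr -d '\0' < base/17221/17557 | wc -c
```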

recovery.conf:

```
restore_command = 'pgbackrest --stanza=foo-db archive-get %f "%p"'
recovery_target_time = '2021-02-20 07:00:00'
recovery_target_action = 'promote'
```

#2 Антон Курочкин
antkurochkin@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #16894: PANIC: WAL contains references to invalid pages

Good day,

There are inaccuracies in the description; the correct data
(https://github.com/pgbackrest/pgbackrest/issues/796):

* The restore was done without PITR, just from a backup.

* Correct recovery.conf:

restore_command = 'pgbackrest --stanza=foo-db archive-get %f "%p"'

On Wed, 24 Feb 2021 at 12:43, PG Bug reporting form <noreply@postgresql.org> wrote:

<snip>

--

Best regards,
Anton

#3 David Steele
david@pgmasters.net
In reply to: Антон Курочкин (#2)
Re: BUG #16894: PANIC: WAL contains references to invalid pages

On 2/24/21 7:22 AM, Антон Курочкин wrote:

We got an error when doing a PITR restore of individual databases from
a backup using pgbackrest.

We opened an issue in the pgbackrest repository:
https://github.com/pgbackrest/pgbackrest/issues/796
It would be great if you could look at this problem. All the
information is in the issue.

The pgBackRest developers suggested bringing this information to the
Postgres folks.

It turned out that the problem is replaying WAL on databases that are
restored as zeroed files.
The database files from the backup are restored successfully, but an
error occurs when PG starts and replays WAL:

<snip>

< 2021-02-21 10:35:09.050 UTC > WARNING: page 22659 of relation
base/17221/17557 is uninitialized
< 2021-02-21 10:35:09.050 UTC > CONTEXT: xlog redo at 7E/8D9C04C0 for
Heap2/VISIBLE: cutoff xid 810424291 flags 1
< 2021-02-21 10:35:09.050 UTC > PANIC: WAL contains references to
invalid pages

< 2021-02-21 10:35:09.050 UTC > CONTEXT: xlog redo at 7E/8D9C04C0 for
Heap2/VISIBLE: cutoff xid 810424291 flags 1

For context, when a user performs selective restore all the relations
for databases *not* selected are restored as sparse zero files with the
correct size. WAL replay will update those relations but they are not
expected to be consistent, and in fact we make it impossible to log on to
those databases, so the only option is to drop them.
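
(For illustration, such a sparse all-zero stand-in can be created with plain
coreutils; a sketch, where the file name and size are hypothetical:)

```
# Creates base/17221/17557 as a sparse file with an apparent size of
# 22660 blocks x 8 kB, without allocating data blocks on disk.
truncate -s 185630720 base/17221/17557
```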

Still, this is a fairly efficient way to get at a single database in a
cluster that contains hundreds or thousands of databases.

Clearly the methodology is a bit unorthodox, but even so we don't expect
to see an error here. This is the first time we've gotten a detailed
analysis of the issue, so I haven't had time to really look into it, but
perhaps there is a full page write missing? In the case of corruption
the administrator might zero a page and we expect WAL replay to succeed,
I think.

Regards,
--
-David
david@pgmasters.net

#4 Michael Paquier
michael@paquier.xyz
In reply to: David Steele (#3)
Re: BUG #16894: PANIC: WAL contains references to invalid pages

On Thu, Feb 25, 2021 at 12:53:44PM -0500, David Steele wrote:

For context, when a user performs selective restore all the relations for
databases *not* selected are restored as sparse zero files with the correct
size. WAL replay will update those relations but they are not expected to
be consistent, and in fact we make it impossible to log on to those databases,
so the only option is to drop them.

What is a selective restore? Is that a pgbackrest restore mode from a
full physical backup where only a portion of the database folders are
physically put back into the restored data folder before doing WAL
replay up to the consistent point of the full physical backup? Then
you emulate the absence of the files from the other database folders
using files full of zeros, but with the same size as in the original
backup, to trick WAL replay?
--
Michael

#5 David Steele
david@pgmasters.net
In reply to: Michael Paquier (#4)
Re: BUG #16894: PANIC: WAL contains references to invalid pages

Hi Michael,

On 2/28/21 11:31 PM, Michael Paquier wrote:

On Thu, Feb 25, 2021 at 12:53:44PM -0500, David Steele wrote:

For context, when a user performs selective restore all the relations for
databases *not* selected are restored as sparse zero files with the correct
size. WAL replay will update those relations but they are not expected to
be consistent, and in fact we make it impossible to log on to those databases,
so the only option is to drop them.

What is a selective restore? Is that a pgbackrest restore mode from a
full physical backup where only a portion of the database folders are
physically put back into the restored data folder before doing WAL
replay up to the consistent point of the full physical backup? Then
you emulate the absence of the files from the other database folders
using files full of zeros, but with the same size as in the original
backup, to trick WAL replay?

That summarizes the process very well.

My concern here is that we appear to have a partial page write as the
first write to a page after the backup start checkpoint. At least,
that's what this error seems to be indicating from my reading of the code.

The question is -- are there valid circumstances under which this might
be happening or is it a bug?

I have not submitted a bug before this because I am not able to
reproduce it in a test environment, but perhaps somebody can weigh in on
whether this behavior seems normal.

Regards,
--
-David
david@pgmasters.net

#6 Michael Paquier
michael@paquier.xyz
In reply to: David Steele (#5)
Re: BUG #16894: PANIC: WAL contains references to invalid pages

On Tue, Mar 02, 2021 at 07:46:21AM -0500, David Steele wrote:

That summarizes the process very well.

Okay.

My concern here is that we appear to have a partial page write as the first
write to a page after the backup start checkpoint. At least, that's what
this error seems to be indicating from my reading of the code.

The question is -- are there valid circumstances under which this might be
happening or is it a bug?

Err, what you are doing here is qualified as unorthodox, but isn't
"unsupported" a better term? That's basically tricking WAL replay
into doing something it would never have to do for physical backups.
--
Michael

#7 David Steele
david@pgmasters.net
In reply to: Michael Paquier (#6)
Re: BUG #16894: PANIC: WAL contains references to invalid pages

On 3/3/21 2:03 AM, Michael Paquier wrote:

On Tue, Mar 02, 2021 at 07:46:21AM -0500, David Steele wrote:

My concern here is that we appear to have a partial page write as the first
write to a page after the backup start checkpoint. At least, that's what
this error seems to be indicating from my reading of the code.

The question is -- are there valid circumstances under which this might be
happening or is it a bug?

Err, what you are doing here is qualified as unorthodox, but isn't
"unsupported" a better term? That's basically tricking WAL replay
into doing something it would never have to do for physical backups.

OK, but shouldn't we have a full page write for this page after the
backup starts, rather than the partial we seem to be seeing here? Is
there any condition where the full page write would be skipped
legitimately, or does it point to a problem?

If Postgres is running correctly there is certainly no expectation for
support of this unusual use case, but I do think that this possibly
points to an issue in Postgres, which under normal circumstances would
be very hard to detect.

Put another way, what if this page was torn while being copied during
the backup? Would this WAL record still be doing the right thing in that
case?

Regards,
--
-David
david@pgmasters.net

#8 Michael Paquier
michael@paquier.xyz
In reply to: David Steele (#7)
Re: BUG #16894: PANIC: WAL contains references to invalid pages

On Wed, Mar 03, 2021 at 06:49:02AM -0500, David Steele wrote:

OK, but shouldn't we have a full page write for this page after the backup
starts, rather than the partial we seem to be seeing here? Is there any
condition where the full page write would be skipped legitimately, or does
it point to a problem?

That's how things work. If they don't work this way for physical
backups, we may have a problem. At replay, the full page will be
replayed if BKPIMAGE_APPLY is correctly set, as per
XLogReadBufferForRedoExtended(). And XLogRecordAssemble()
makes the decision when building the record (just grep for
needs_backup).
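
(For anyone following along, the places Michael names can be located with a
grep over a PostgreSQL source checkout; a sketch, with paths as in recent
sources:)

```
# XLogRecordAssemble() decides whether a block gets a full-page image
# (see the needs_backup logic); XLogReadBufferForRedoExtended() applies
# the image at redo when the apply flag is set.
grep -n 'needs_backup' src/backend/access/transam/xloginsert.c
grep -n 'XLogRecBlockImageApply' src/backend/access/transam/xlogutils.c
```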

If Postgres is running correctly there is certainly no expectation for
support of this unusual use case, but I do think that this possibly points
to an issue in Postgres, which under normal circumstances would be very hard
to detect.

Well, the report says that this is an issue that happens on those
fake files full of zeros, but are you sure that you have the sizing
right? I still don't see any evidence of anything broken based on the
information gathered for full backups, FWIW.
--
Michael

#9 David Steele
david@pgmasters.net
In reply to: Michael Paquier (#8)
Re: BUG #16894: PANIC: WAL contains references to invalid pages

On 3/4/21 11:32 PM, Michael Paquier wrote:

On Wed, Mar 03, 2021 at 06:49:02AM -0500, David Steele wrote:

OK, but shouldn't we have a full page write for this page after the backup
starts, rather than the partial we seem to be seeing here? Is there any
condition where the full page write would be skipped legitimately, or does
it point to a problem?

That's how things work. If they don't work this way for physical
backups, we may have a problem. At replay, the full page will be
replayed if BKPIMAGE_APPLY is correctly set, as per
XLogReadBufferForRedoExtended(). And XLogRecordAssemble()
makes the decision when building the record (just grep for
needs_backup).

That's exactly the problem. This WAL record appears to be the only
reference to the page the submitter could find in their WAL. Anton,
could you confirm that?

If we can confirm that there is no FPI for this page after the initial
backup checkpoint, wouldn't that point to an issue?

If Postgres is running correctly there is certainly no expectation for
support of this unusual use case, but I do think that this possibly points
to an issue in Postgres, which under normal circumstances would be very hard
to detect.

Well, the report says that this is an issue that happens on those
fake files full of zeros, but are you sure that you have the sizing
right? I still don't see any evidence of anything broken based on the
information gathered for full backups, FWIW.

This is the size we validate for normal restores, so I'm pretty sure it is
accurate. Actually, Postgres doesn't seem to care whether the file is large
enough as long as it is present. The reason we precreate the files at the
correct size is that when Postgres extends the file it is not sparse and
uses more space.

Regards,
--
-David
david@pgmasters.net

#10 Владимир Владимир
In reply to: PG Bug reporting form (#1)
Re[2]: BUG #16894: PANIC: WAL contains references to invalid pages

 

On 05.03.2021 15:55, David Steele wrote:

On 3/4/21 11:32 PM, Michael Paquier wrote:

On Wed, Mar 03, 2021 at 06:49:02AM -0500, David Steele wrote:

OK, but shouldn't we have a full page write for this page after the
backup
starts, rather than the partial we seem to be seeing here? Is there any
condition where the full page write would be skipped legitimately,
or does
it point to a problem?

That's how things work.  If they don't work this way for physical
backups, we may have a problem.  At replay, the full page will be
replayed if BKPIMAGE_APPLY is correctly set, as per
XLogReadBufferForRedoExtended().  And XLogRecordAssemble()
makes the decision when building the record (just grep for
needs_backup).

That's exactly the problem. This WAL record appears to be the only
reference to the page the submitter could find in their WAL. Anton,
could you confirm that?

If we can confirm that there is no FPI for this page after the initial
backup checkpoint, wouldn't that point to an issue?

If Postgres is running correctly there is certainly no expectation for
support of this unusual use case, but I do think that this possibly
points
to an issue in Postgres, which under normal circumstances would be
very hard
to detect.

Well, the report says that this is an issue that happens on those
fake files full of zeros, but are you sure that you have the sizing
right?  I still don't see any evidence of anything broken based on the
information gathered for full backups, FWIW.

This is the size we validate for normal restores so pretty sure it is
accurate. Actually, postgres doesn't seem to care if the file is large
enough as long as it is present. The reason we precreate the files at
the correct size is because when postgres extends the file it is not
sparse and uses more space.

Regards,

 
I have reproduced the situation with the WAL files and zeroed pgbackrest files.
The same problem starts at the first LSN in the WAL.
I attach the recovery process logs:
 
    2021-03-06 00:36:13.784 MSK [95] LOG:  database system was interrupted; last known up at 2021-03-01 01:07:51 MSK
    2021-03-06 00:36:13.871 MSK [95] LOG:  starting point-in-time recovery to 2021-03-01 12:00:00+03
    2021-03-06 00:36:13.915 MSK [95] LOG:  restored log file "0000000B.history" from archive
    2021-03-06 00:36:14.158 MSK [95] LOG:  restored log file "0000000B000002710000008A" from archive
    2021-03-06 00:36:14.172 MSK [95] LOG:  redo starts at 271/8A000028
    2021-03-06 00:36:14.325 MSK [95] WARNING:  page 27 of relation base/17946/28008 is uninitialized
    2021-03-06 00:36:14.325 MSK [95] WARNING:  page 15 of relation base/17946/2840 is uninitialized
    2021-03-06 00:36:14.325 MSK [95] PANIC:  WAL contains references to invalid pages
    2021-03-06 00:36:14.681 MSK [92] LOG:  startup process (PID 95) was terminated by signal 6: Aborted
    2021-03-06 00:36:14.681 MSK [92] LOG:  terminating any other active server processes
    2021-03-06 00:36:14.708 MSK [92] LOG:  database system is shut down
    2021-03-06 00:36:16.175 MSK [9] LOG:  database system was interrupted while in recovery at log time 2021-03-01 01:00:11 MSK
    2021-03-06 00:36:16.175 MSK [9] HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
    2021-03-06 00:36:16.263 MSK [9] LOG:  starting point-in-time recovery to 2021-03-01 12:00:00+03
    2021-03-06 00:36:16.306 MSK [9] LOG:  restored log file "0000000B.history" from archive
 
And if I look inside 0000000B000002710000008A, I see:
 
    bash-4.2$  /usr/pgsql-11/bin/pg_waldump /var/lib/pgsql/11/data/pg_wal/0000000B000002710000008A -e 271/8A009E10
 
    rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 271/8A000028, prev 271/89A60D10, desc: RUNNING_XACTS nextXid 361855004 latestCompletedXid 361855003 oldestRunningXid 361855004
    rmgr: XLOG        len (rec/tot):    106/   106, tx:          0, lsn: 271/8A000060, prev 271/8A000028, desc: CHECKPOINT_ONLINE redo 271/8A000028; tli 11; prev tli 11; fpw true; xid 0:361855004; oid 29809888; multi 240501; offset 500196; oldest xid 162916086 in DB 17950; oldest multi 1 in DB 17948; oldest/newest commit timestamp xid: 0/0; oldest running xid 361855004; online
    rmgr: Heap        len (rec/tot):     65/  2685, tx:  361855004, lsn: 271/8A0000D0, prev 271/8A000060, desc: HOT_UPDATE off 4 xmax 361855004 ; new off 10 xmax 0, blkref #0: rel 1663/17948/24604 blk 4 FPW
    rmgr: Transaction len (rec/tot):     34/    34, tx:  361855004, lsn: 271/8A000B50, prev 271/8A0000D0, desc: COMMIT 2021-03-01 01:00:12.752834 MSK
    rmgr: Heap        len (rec/tot):     65/  1941, tx:  361855005, lsn: 271/8A000B78, prev 271/8A000B50, desc: HOT_UPDATE off 3 xmax 361855005 ; new off 5 xmax 0, blkref #0: rel 1663/17948/24604 blk 21 FPW
    rmgr: Transaction len (rec/tot):     34/    34, tx:  361855005, lsn: 271/8A001310, prev 271/8A000B78, desc: COMMIT 2021-03-01 01:00:13.772687 MSK
    rmgr: Heap        len (rec/tot):     65/  3745, tx:  361855006, lsn: 271/8A001338, prev 271/8A001310, desc: HOT_UPDATE off 2 xmax 361855006 ; new off 9 xmax 0, blkref #0: rel 1663/17948/24604 blk 1 FPW
    rmgr: Transaction len (rec/tot):     34/    34, tx:  361855006, lsn: 271/8A0021F8, prev 271/8A001338, desc: COMMIT 2021-03-01 01:00:14.865075 MSK
    rmgr: Heap        len (rec/tot):     73/    73, tx:  361855007, lsn: 271/8A002220, prev 271/8A0021F8, desc: HOT_UPDATE off 5 xmax 361855007 ; new off 11 xmax 0, blkref #0: rel 1663/17948/24604 blk 4
    rmgr: Transaction len (rec/tot):     34/    34, tx:  361855007, lsn: 271/8A002270, prev 271/8A002220, desc: COMMIT 2021-03-01 01:00:15.887661 MSK
    rmgr: Heap        len (rec/tot):     59/  1855, tx:  361855008, lsn: 271/8A002298, prev 271/8A002270, desc: LOCK off 27: xid 361855008: flags 0 LOCK_ONLY EXCL_LOCK KEYS_UPDATED , blkref #0: rel 1663/17946/27953 blk 0 FPW
    rmgr: Heap        len (rec/tot):     69/    69, tx:  361855008, lsn: 271/8A0029D8, prev 271/8A002298, desc: HOT_UPDATE off 27 xmax 361855008 KEYS_UPDATED ; new off 28 xmax 361855008, blkref #0: rel 1663/17946/27953 blk 0
    rmgr: Heap        len (rec/tot):     54/  8134, tx:  361855008, lsn: 271/8A002A20, prev 271/8A0029D8, desc: INSERT off 115, blkref #0: rel 1663/17946/27924 blk 2602 FPW
    rmgr: Btree       len (rec/tot):     53/  3353, tx:  361855008, lsn: 271/8A004A00, prev 271/8A002A20, desc: INSERT_LEAF off 163, blkref #0: rel 1663/17946/27930 blk 11930 FPW
    rmgr: Heap        len (rec/tot):     54/  5906, tx:  361855008, lsn: 271/8A005720, prev 271/8A004A00, desc: INSERT off 102, blkref #0: rel 1663/17946/28008 blk 8 FPW
    rmgr: Btree       len (rec/tot):     53/  4193, tx:  361855008, lsn: 271/8A006E50, prev 271/8A005720, desc: INSERT_LEAF off 205, blkref #0: rel 1663/17946/28011 blk 335 FPW
    rmgr: Btree       len (rec/tot):     53/  6873, tx:  361855008, lsn: 271/8A007EB8, prev 271/8A006E50, desc: INSERT_LEAF off 4, blkref #0: rel 1663/17946/28013 blk 2 FPW
    rmgr: Heap        len (rec/tot):     59/  1115, tx:  361855008, lsn: 271/8A0099B0, prev 271/8A007EB8, desc: LOCK off 9: xid 361855008: flags 0 LOCK_ONLY EXCL_LOCK KEYS_UPDATED , blkref #0: rel 1663/17946/27484 blk 2 FPW
For a complete picture, I attach information about the cluster:
 
    bash-4.2$ /usr/pgsql-11/bin/pg_controldata /var/lib/pgsql/11/data/
 
    pg_control version number:            1100
    Catalog version number:               201809051
    Database system identifier:           6725332219754936319
    Database cluster state:               in archive recovery
    pg_control last modified:             Sat Mar  6 00:36:16 2021
    Latest checkpoint location:           271/8A000060
    Latest checkpoint's REDO location:    271/8A000028
    Latest checkpoint's REDO WAL file:    0000000B000002710000008A
    Latest checkpoint's TimeLineID:       11
    Latest checkpoint's PrevTimeLineID:   11
    Latest checkpoint's full_page_writes: on
    Latest checkpoint's NextXID:          0:361855004
    Latest checkpoint's NextOID:          29809888
    Latest checkpoint's NextMultiXactId:  240501
    Latest checkpoint's NextMultiOffset:  500196
    Latest checkpoint's oldestXID:        162916086
    Latest checkpoint's oldestXID's DB:   17950
    Latest checkpoint's oldestActiveXID:  361855004
    Latest checkpoint's oldestMultiXid:   1
    Latest checkpoint's oldestMulti's DB: 17948
    Latest checkpoint's oldestCommitTsXid:0
    Latest checkpoint's newestCommitTsXid:0
    Time of latest checkpoint:            Mon Mar  1 01:00:11 2021
    Fake LSN counter for unlogged rels:   0/1
    Minimum recovery ending location:     271/8AFB4028
    Min recovery ending loc's timeline:   11
    Backup start location:                0/0
    Backup end location:                  0/0
    End-of-backup record required:        no
    wal_level setting:                    replica
    wal_log_hints setting:                off
    max_connections setting:              1000
    max_worker_processes setting:         10
    max_prepared_xacts setting:           0
    max_locks_per_xact setting:           64
    track_commit_timestamp setting:       off
    Maximum data alignment:               8
    Database block size:                  8192
    Blocks per segment of large relation: 131072
    WAL block size:                       8192
    Bytes per WAL segment:                16777216
    Maximum length of identifiers:        64
    Maximum columns in an index:          32
    Maximum size of a TOAST chunk:        1996
    Size of a large-object chunk:         2048
    Date/time type storage:               64-bit integers
    Float4 argument passing:              by value
    Float8 argument passing:              by value
    Data page checksum version:           0
    Mock authentication nonce:            66bd29a825f4f3e8aea908b50f798ba53215e9f074129724e5a0936a51442e4f
 
--
Vladimir skilyazhnev
 

#11 David Steele
david@pgmasters.net
In reply to: Владимир Владимир (#10)
Re: BUG #16894: PANIC: WAL contains references to invalid pages

On 3/6/21 2:32 AM, Владимир Владимир wrote:

I have reproduced the situation with the WAL files and zeroed pgbackrest files.
The same problem starts at the first LSN in the WAL.

I spent some time looking through the code this morning. As I read it,
WAL replay allows uninitialized (and missing) pages up to the point
where consistency is achieved and after that flags uninitialized (or
missing) pages as errors. It's pretty hard to argue with this logic and
it addresses my concern about torn pages in the backups, i.e.
torn/zero/missing pages will be repaired without complaint up to
consistency.

So I believe that means Postgres is working as expected and there is no bug.

For pgBackRest selective restore there are two solutions:

1) It appears recovery_target = 'immediate' works fine. This was
verified in Anton's case. Vladimir, perhaps you could try that? (See the
first sketch after this list.)

2) PG13 introduced the ignore_invalid_pages GUC [1], which should allow
replay to work past consistency (second sketch below). Note that the onus
would be on the user to ensure that there were no invalid pages in the
database(s) they are restoring.
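
For option 1, the recovery.conf from earlier in the thread would change only
in its target; a sketch:

```
restore_command = 'pgbackrest --stanza=foo-db archive-get %f "%p"'
recovery_target = 'immediate'
recovery_target_action = 'promote'
```

For option 2, on PG13 and later (where recovery settings live in
postgresql.conf), the developer option would be set there; again only a
sketch, since it disables a safety check:

```
# postgresql.conf -- developer option, PG13+; requires a server restart
ignore_invalid_pages = on
```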

I think we should continue the discussion in [2] since there doesn't
seem to be anything else to discuss in this thread.

Regards,
--
-David
david@pgmasters.net

[1]: /messages/by-id/E1iu6D8-0000tK-Cm@gemulon.postgresql.org
[2]: https://github.com/pgbackrest/pgbackrest/issues/796