BUG #16172: failure of vacuum file truncation can cause permanent data corruption
The following bug has been logged on the website:
Bug reference: 16172
Logged by: TAKATSUKA Haruka
Email address: harukat@sraoss.co.jp
PostgreSQL version: 12.1
Operating system: Windows/Linux
Description:
Hello, pgsql hackers,
I found that failure of vacuum file truncation can cause permanent data
corruption.
I am reporting the reproduce steps below.
In Windows installation, the truncation sometime fails by permission
denied error because of anti-virus software. It has caused just ERROR
and people have offen dismissed it.
Truncation failure can also make the standby panic with the following
messages when replaying Heap2/VISIBLE or Heap2/CLEAN, because truncation
wal is emitted even if it doesn't complete actually in the primary.
WARNING: page .. of relation base/..../.... does not exist
CONTEXT: WAL redo at ..... for ....: cutoff xid ... flags ...
PANIC: WAL contains references to invalid pages
I think truncation failure is to be handled as more severe level.
Any thoughts?
with best regards,
Haruka Takatsuka / SRA OSS, Inc. Japan
reproduce steps (PG12)
======================
$ psql -U postgres -d db1
Pager usage is off.
psql (12.1)
Type "help" for help.
db1=#
$ gdb -p {its backend process}
(gdb) b FileTruncate
Breakpoint 1 at 0x73d320: file fd.c, line 2057.
(gdb) c
Continuing.
db1=# SHOW autovacuum;
autovacuum
------------
off
(1 row)
db1=# CREATE TABLE t1 (id int primary key, v text);
CREATE
db1=# INSERT INTO t1 SELECT g, md5(g::text) FROM generate_series(1, 10000)
as g;
INSERT 0 10000
db1=# CHECKPOINT;
Program received signal SIGUSR1, User defined signal 1.
0x00000036caae91a3 in __epoll_wait_nocancel () from /lib64/libc.so.6
(gdb) c
Continuing.
CHECKPOINT
db1=# DELETE FROM t1 WHERE id > 50;
DELETE 9950
db1=# VACUUM t1;
Breakpoint 1, FileTruncate (file=59, offset=8192,
wait_event_info=167772175)
at fd.c:2057
2057 {
(gdb) n
2065 returnCode = FileAccess(file);
(gdb) n
2066 if (returnCode < 0)
(gdb) p returnCode = -100
$6 = -100
(gdb) c
Continuing.
ERROR: could not truncate file "base/16384/16645" to 1 blocks: Success
db1=# SELECT count(*) FROM t1;
count
-------
9930
(1 row)
db1=# SELECT * FROM t1 WHERE id > 50 LIMIT 10;
id | v
-----+----------------------------------
121 | 4c56ff4ce4aaf9573aa5dff913df997a
122 | a0a080f42e6f13b3a2df133f073095dd
123 | 202cb962ac59075b964b07152d234b70
124 | c8ffe9a587b126f152ed3d89a146b445
125 | 3def184ad8f4755ff269862ea77393dd
126 | 069059b7ef840f0c74a814ec9237b6ec
127 | ec5decca5ed3d6b8079e2e7e7bacc9f2
128 | 76dc611d6ebaafc66cc0879c71b5db5c
129 | d1f491a404d6854880943e5c3cd9ca25
130 | 9b8619251a19057cff70779273e95aa6
(10 rows)
db1=# VACUUM FULL FREEZE t1;
VACUUM
db1=# REINDEX TABLE t1;
REINDEX
db1=# SELECT count(*) FROM t1;
count
-------
9930
(1 row)
db1=# SELECT * FROM t1 WHERE id > 50 LIMIT 10;
id | v
-----+----------------------------------
121 | 4c56ff4ce4aaf9573aa5dff913df997a
122 | a0a080f42e6f13b3a2df133f073095dd
123 | 202cb962ac59075b964b07152d234b70
124 | c8ffe9a587b126f152ed3d89a146b445
125 | 3def184ad8f4755ff269862ea77393dd
126 | 069059b7ef840f0c74a814ec9237b6ec
127 | ec5decca5ed3d6b8079e2e7e7bacc9f2
128 | 76dc611d6ebaafc66cc0879c71b5db5c
129 | d1f491a404d6854880943e5c3cd9ca25
130 | 9b8619251a19057cff70779273e95aa6
(10 rows)
real incident log sample (from Windows PG10.x)
==============================================
primary server:
---------------
2019-12-17 19:10:19 JST [14321] ERROR: 42501: could not truncate file
"base/444807/447791" to 7 blocks: Permission denied
2019-12-17 19:10:19 JST [14321] CONTEXT: automatic vacuum of table
"ddd.ssss.tttt"
hot standby server:
-------------------
2019-12-17 19:11:20 JST [12345] WARNING: 01000: page 7 of relation
base/444807/447791 does not exist
2019-12-17 19:11:20 JST [12345] CONTEXT: WAL redo at 4E/FC5AD060 for
Heap2/VISIBLE: cutoff xid 0 flags 3
2019-12-17 19:11:20 JST [12345] PANIC: XX000: WAL contains references to
invalid pages
2019-12-17 19:11:20 JST [12345] CONTEXT: WAL redo at 4E/FC5AD060 for
Heap2/VISIBLE: cutoff xid 0 flags 3
I also tested PostgreSQL with the attached patch avoided this data
corruption. The patch just removes DropRelFileNodeBuffers() from
smgrtruncate().
On Thu, 19 Dec 2019 07:14:42 +0000
PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:
Bug reference: 16172
Logged by: TAKATSUKA Haruka
Email address: harukat@sraoss.co.jp
PostgreSQL version: 12.1
Operating system: Windows/Linux
Description:Hello, pgsql hackers,
I found that failure of vacuum file truncation can cause permanent data
corruption.
I am reporting the reproduce steps below.In Windows installation, the truncation sometime fails by permission
denied error because of anti-virus software. It has caused just ERROR
and people have offen dismissed it.Truncation failure can also make the standby panic with the following
messages when replaying Heap2/VISIBLE or Heap2/CLEAN, because truncation
wal is emitted even if it doesn't complete actually in the primary.WARNING: page .. of relation base/..../.... does not exist
CONTEXT: WAL redo at ..... for ....: cutoff xid ... flags ...
PANIC: WAL contains references to invalid pagesI think truncation failure is to be handled as more severe level.
Any thoughts?with best regards,
Haruka Takatsuka / SRA OSS, Inc. Japanreproduce steps (PG12)
======================$ psql -U postgres -d db1
Pager usage is off.
psql (12.1)
Type "help" for help.db1=#
$ gdb -p {its backend process}
(gdb) b FileTruncate
Breakpoint 1 at 0x73d320: file fd.c, line 2057.
(gdb) c
Continuing.db1=# SHOW autovacuum;
autovacuum
------------
off
(1 row)db1=# CREATE TABLE t1 (id int primary key, v text);
CREATEdb1=# INSERT INTO t1 SELECT g, md5(g::text) FROM generate_series(1, 10000)
as g;
INSERT 0 10000db1=# CHECKPOINT;
Program received signal SIGUSR1, User defined signal 1.
0x00000036caae91a3 in __epoll_wait_nocancel () from /lib64/libc.so.6
(gdb) c
Continuing.CHECKPOINT
db1=# DELETE FROM t1 WHERE id > 50;
DELETE 9950db1=# VACUUM t1;
Breakpoint 1, FileTruncate (file=59, offset=8192,
wait_event_info=167772175)
at fd.c:2057
2057 {
(gdb) n
2065 returnCode = FileAccess(file);
(gdb) n
2066 if (returnCode < 0)
(gdb) p returnCode = -100
$6 = -100
(gdb) c
Continuing.ERROR: could not truncate file "base/16384/16645" to 1 blocks: Success
db1=# SELECT count(*) FROM t1;
count
-------
9930
(1 row)
(snip)
Attachments:
12stable_dont_drop_buffer.difftext/plain; name=12stable_dont_drop_buffer.diffDownload+0-6
I found moving DropRelFileNodeBuffers() from top to end in function
smgrtruncate() is a proper modification. It passed the regression test
and this reproduction test.
with best regards,
Haruka Takatsuka / SRA OSS, Inc. Japan
On Fri, 20 Dec 2019 10:19:52 +0900
TAKATSUKA Haruka <harukat@sraoss.co.jp> wrote:
I also tested PostgreSQL with the attached patch avoided this data
corruption. The patch just removes DropRelFileNodeBuffers() from
smgrtruncate().On Thu, 19 Dec 2019 07:14:42 +0000
PG Bug reporting form <noreply@postgresql.org> wrote:The following bug has been logged on the website:
Bug reference: 16172
Logged by: TAKATSUKA Haruka
Email address: harukat@sraoss.co.jp
PostgreSQL version: 12.1
Operating system: Windows/Linux
Description:Hello, pgsql hackers,
I found that failure of vacuum file truncation can cause permanent data
corruption.
I am reporting the reproduce steps below.In Windows installation, the truncation sometime fails by permission
denied error because of anti-virus software. It has caused just ERROR
and people have offen dismissed it.Truncation failure can also make the standby panic with the following
messages when replaying Heap2/VISIBLE or Heap2/CLEAN, because truncation
wal is emitted even if it doesn't complete actually in the primary.WARNING: page .. of relation base/..../.... does not exist
CONTEXT: WAL redo at ..... for ....: cutoff xid ... flags ...
PANIC: WAL contains references to invalid pagesI think truncation failure is to be handled as more severe level.
Any thoughts?with best regards,
Haruka Takatsuka / SRA OSS, Inc. Japanreproduce steps (PG12)
======================$ psql -U postgres -d db1
Pager usage is off.
psql (12.1)
Type "help" for help.db1=#
$ gdb -p {its backend process}
(gdb) b FileTruncate
Breakpoint 1 at 0x73d320: file fd.c, line 2057.
(gdb) c
Continuing.db1=# SHOW autovacuum;
autovacuum
------------
off
(1 row)db1=# CREATE TABLE t1 (id int primary key, v text);
CREATEdb1=# INSERT INTO t1 SELECT g, md5(g::text) FROM generate_series(1, 10000)
as g;
INSERT 0 10000db1=# CHECKPOINT;
Program received signal SIGUSR1, User defined signal 1.
0x00000036caae91a3 in __epoll_wait_nocancel () from /lib64/libc.so.6
(gdb) c
Continuing.CHECKPOINT
db1=# DELETE FROM t1 WHERE id > 50;
DELETE 9950db1=# VACUUM t1;
Breakpoint 1, FileTruncate (file=59, offset=8192,
wait_event_info=167772175)
at fd.c:2057
2057 {
(gdb) n
2065 returnCode = FileAccess(file);
(gdb) n
2066 if (returnCode < 0)
(gdb) p returnCode = -100
$6 = -100
(gdb) c
Continuing.ERROR: could not truncate file "base/16384/16645" to 1 blocks: Success
db1=# SELECT count(*) FROM t1;
count
-------
9930
(1 row)(snip)
______________________________________________________________________
高塚 遥 harukat@sraoss.co.jp SRA OSS, Inc. http://www.sraoss.co.jp
〒171-0022 東京都豊島区南池袋2-32-8
TEL: 03-5979-2701 FAX: 03-5979-2702 CellPhone: 080-1292-3396
Attachments:
12stable_move_bufferdrop.difftext/plain; name=12stable_move_bufferdrop.diffDownload+6-6
Hello,
At Fri, 20 Dec 2019 11:00:28 +0900, TAKATSUKA Haruka <harukat@sraoss.co.jp> wrote in
I found moving DropRelFileNodeBuffers() from top to end in function
smgrtruncate() is a proper modification. It passed the regression test
and this reproduction test.
I don't recall clealy but I think there was a thread similar to this
issue. Assume that checkpoint was running concurrently, the
checkpoint can revive the just truncated blocks inadvertently with
bogus content.
reagards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Horiguchi-san
Thanks for pointing out it.
I wasn't looking for past arguments enough.
You may mean this links are:
/messages/by-id/2348.1544474335@sss.pgh.pa.us
/messages/by-id/15667-8d3fca4eba25174f@postgresql.org
regards,
TAKATSUKA Haruka
On Fri, 20 Dec 2019 14:22:22 +0900 (JST)
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
Show quoted text
Hello,
At Fri, 20 Dec 2019 11:00:28 +0900, TAKATSUKA Haruka <harukat@sraoss.co.jp> wrote in
I found moving DropRelFileNodeBuffers() from top to end in function
smgrtruncate() is a proper modification. It passed the regression test
and this reproduction test.I don't recall clealy but I think there was a thread similar to this
issue. Assume that checkpoint was running concurrently, the
checkpoint can revive the just truncated blocks inadvertently with
bogus content.reagards.
--
Kyotaro Horiguchi
NTT Open Source Software Center