found xmin from before relfrozenxid on pg_catalog.pg_authid
Getting some concerning errors in one of our databases that is on 9.5.11,
on autovacuum from template0 database pg_authid and pg_auth_members. I
only saw some notes on the list about this error related to materialized
views. FWIW, we did use pg_upgrade to upgrade this database from 9.4 to
9.5. Here is an example:
2018-03-19 12:08:33.946 CDT,,,14892,,5aafee91.3a2c,1,,2018-03-19 12:08:33
CDT,59/340951,0,ERROR,XX001,"found xmin 2906288382 from before relfrozenxid
740087784",,,,,"automatic vacuum of table
""template0.pg_catalog.pg_authid""",,,,""
2018-03-19 12:08:33.957 CDT,,,14892,,5aafee91.3a2c,2,,2018-03-19 12:08:33
CDT,59/340953,0,ERROR,XX001,"found xmin 2906288383 from before relfrozenxid
740087784",,,,,"automatic vacuum of table
""template0.pg_catalog.pg_auth_members""",,,,""
Any insight would be much appreciated.
Thanks,
Jeremy
Jeremy Finzel wrote:
Getting some concerning errors in one of our databases that is on 9.5.11,
on autovacuum from template0 database pg_authid and pg_auth_members. I
only saw some notes on the list about this error related to materialized
views. FWIW, we did use pg_upgrade to upgrade this database from 9.4 to
9.5. Here is an example:2018-03-19 12:08:33.946 CDT,,,14892,,5aafee91.3a2c,1,,2018-03-19 12:08:33
CDT,59/340951,0,ERROR,XX001,"found xmin 2906288382 from before relfrozenxid
740087784",,,,,"automatic vacuum of table
""template0.pg_catalog.pg_authid""",,,,""
Can you please supply output of pg_controldata?
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pg_control version number: 942
Catalog version number: 201510051
Database system identifier: 6351536019599012028
Database cluster state: in production
pg_control last modified: Mon 19 Mar 2018 12:56:10 PM CDT
Latest checkpoint location: 262BE/FE96240
Prior checkpoint location: 262BA/623D5E40
Latest checkpoint's REDO location: 262BA/F5499E98
Latest checkpoint's REDO WAL file: 00000001000262BA000000F5
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 16/3132524419
Latest checkpoint's NextOID: 1090653331
Latest checkpoint's NextMultiXactId: 2142
Latest checkpoint's NextMultiOffset: 5235
Latest checkpoint's oldestXID: 1829964553
Latest checkpoint's oldestXID's DB: 12376
Latest checkpoint's oldestActiveXID: 3131774441
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 16400
Latest checkpoint's oldestCommitTsXid:1829964553
Latest checkpoint's newestCommitTsXid:3132524418
Time of latest checkpoint: Mon 19 Mar 2018 12:54:08 PM CDT
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: logical
wal_log_hints setting: off
max_connections setting: 2000
max_worker_processes setting: 10
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: on
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0
On Mon, Mar 19, 2018 at 12:46 PM, Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:
Show quoted text
Jeremy Finzel wrote:
Getting some concerning errors in one of our databases that is on 9.5.11,
on autovacuum from template0 database pg_authid and pg_auth_members. I
only saw some notes on the list about this error related to materialized
views. FWIW, we did use pg_upgrade to upgrade this database from 9.4 to
9.5. Here is an example:2018-03-19 12:08:33.946 CDT,,,14892,,5aafee91.3a2c,1,,2018-03-19
12:08:33
CDT,59/340951,0,ERROR,XX001,"found xmin 2906288382 from before
relfrozenxid
740087784",,,,,"automatic vacuum of table
""template0.pg_catalog.pg_authid""",,,,""Can you please supply output of pg_controldata?
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Mar 19, 2018 at 12:46 PM, Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:
Jeremy Finzel wrote:
Getting some concerning errors in one of our databases that is on 9.5.11,
on autovacuum from template0 database pg_authid and pg_auth_members. I
only saw some notes on the list about this error related to materialized
views. FWIW, we did use pg_upgrade to upgrade this database from 9.4 to
9.5. Here is an example:2018-03-19 12:08:33.946 CDT,,,14892,,5aafee91.3a2c,1,,2018-03-19
12:08:33
CDT,59/340951,0,ERROR,XX001,"found xmin 2906288382 from before
relfrozenxid
740087784",,,,,"automatic vacuum of table
""template0.pg_catalog.pg_authid""",,,,""Can you please supply output of pg_controldata?
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Please forgive my accidental top-post. Here:
pg_control version number: 942
Catalog version number: 201510051
Database system identifier: 6351536019599012028
Database cluster state: in production
pg_control last modified: Mon 19 Mar 2018 12:56:10 PM CDT
Latest checkpoint location: 262BE/FE96240
Prior checkpoint location: 262BA/623D5E40
Latest checkpoint's REDO location: 262BA/F5499E98
Latest checkpoint's REDO WAL file: 00000001000262BA000000F5
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 16/3132524419
Latest checkpoint's NextOID: 1090653331
Latest checkpoint's NextMultiXactId: 2142
Latest checkpoint's NextMultiOffset: 5235
Latest checkpoint's oldestXID: 1829964553
Latest checkpoint's oldestXID's DB: 12376
Latest checkpoint's oldestActiveXID: 3131774441
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 16400
Latest checkpoint's oldestCommitTsXid:1829964553
Latest checkpoint's newestCommitTsXid:3132524418
Time of latest checkpoint: Mon 19 Mar 2018 12:54:08 PM CDT
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: logical
wal_log_hints setting: off
max_connections setting: 2000
max_worker_processes setting: 10
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: on
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0
Thanks,
Jeremy
Hi Jeremy, Alvaro,
On 2018-03-19 13:00:13 -0500, Jeremy Finzel wrote:
On Mon, Mar 19, 2018 at 12:46 PM, Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:Jeremy Finzel wrote:
Getting some concerning errors in one of our databases that is on 9.5.11,
on autovacuum from template0 database pg_authid and pg_auth_members. I
only saw some notes on the list about this error related to materialized
views. FWIW, we did use pg_upgrade to upgrade this database from 9.4 to
9.5. Here is an example:2018-03-19 12:08:33.946 CDT,,,14892,,5aafee91.3a2c,1,,2018-03-19
12:08:33
CDT,59/340951,0,ERROR,XX001,"found xmin 2906288382 from before
relfrozenxid
740087784",,,,,"automatic vacuum of table
""template0.pg_catalog.pg_authid""",,,,""Can you please supply output of pg_controldata?
Latest checkpoint's NextXID: 16/3132524419
Latest checkpoint's NextMultiXactId: 2142
Latest checkpoint's NextMultiOffset: 5235
Latest checkpoint's oldestXID: 1829964553
Latest checkpoint's oldestXID's DB: 12376
Latest checkpoint's oldestActiveXID: 3131774441
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 16400
Hm, based on these it doesn't look like multixacts were involved (based
on oldestMultiXid it's highly unlikley there've multi wraparound, and
there's not much multixact usage on system tables anyway). Which
suggests that there might have been actual corrpution here.
Jeremy:
- which version of 9.4 and 9.5 ran on this? Do you know?
- Can you install the pageinspect extension? If so, it might be a
CREATE EXTENSION pageinspect;
CREATE OR REPLACE FUNCTION check_rel(rel regclass, OUT blockno int8, OUT lp int2, OUT xmin xid)
RETURNS SETOF RECORD
LANGUAGE SQL
AS $$
SELECT blockno, lp, t_xmin
FROM
generate_series(0, pg_relation_size($1::text) / 8192 - 1) blockno, -- every block in the relation
heap_page_items(get_raw_page($1::text, blockno::int4)) -- every item on the page
WHERE
t_xmin IS NOT NULL -- filter out empty items
AND t_xmin != 1 -- filter out bootstrap
AND t_xmin != 2 -- filter out frozen transaction id
AND (t_infomask & ((x'0100' | x'0200')::int)) != ((x'0100' | x'0200')::int) -- filter out frozen rows with xid present
AND age(t_xmin) > age((SELECT relfrozenxid FROM pg_class WHERE oid = $1)) -- xid cutoff filter
$$;
SELECT * FROM check_rel('pg_authid') LIMIT 100;
and then display all items for one of the affected pages like
SELECT * FROM heap_page_items(get_raw_page('pg_authid', 34343));
Alvaro:
- Hm, we talked about code adding context for these kind of errors,
right? Is that just skipped for csvlog?
- Alvaro, does the above check_rel() function make sense?
Greetings,
Andres Freund
On Mon, Mar 19, 2018 at 1:17 PM, Andres Freund <andres@anarazel.de> wrote:
Hi Jeremy, Alvaro,
On 2018-03-19 13:00:13 -0500, Jeremy Finzel wrote:
On Mon, Mar 19, 2018 at 12:46 PM, Alvaro Herrera <
alvherre@alvh.no-ip.org>
wrote:
Jeremy Finzel wrote:
Getting some concerning errors in one of our databases that is on
9.5.11,
on autovacuum from template0 database pg_authid and
pg_auth_members. I
only saw some notes on the list about this error related to
materialized
views. FWIW, we did use pg_upgrade to upgrade this database from
9.4 to
9.5. Here is an example:
2018-03-19 12:08:33.946 CDT,,,14892,,5aafee91.3a2c,1,,2018-03-19
12:08:33
CDT,59/340951,0,ERROR,XX001,"found xmin 2906288382 from before
relfrozenxid
740087784",,,,,"automatic vacuum of table
""template0.pg_catalog.pg_authid""",,,,""Can you please supply output of pg_controldata?
Latest checkpoint's NextXID: 16/3132524419
Latest checkpoint's NextMultiXactId: 2142
Latest checkpoint's NextMultiOffset: 5235
Latest checkpoint's oldestXID: 1829964553
Latest checkpoint's oldestXID's DB: 12376
Latest checkpoint's oldestActiveXID: 3131774441
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 16400Hm, based on these it doesn't look like multixacts were involved (based
on oldestMultiXid it's highly unlikley there've multi wraparound, and
there's not much multixact usage on system tables anyway). Which
suggests that there might have been actual corrpution here.Jeremy:
- which version of 9.4 and 9.5 ran on this? Do you know?
We upgraded to 9.5.5, and today we are running 9.5.11. And actually we
upgraded from 9.3, not 9.4. We are still trying to figure out which point
release we were on at 9.3.
- Can you install the pageinspect extension? If so, it might be a
CREATE EXTENSION pageinspect;
CREATE OR REPLACE FUNCTION check_rel(rel regclass, OUT blockno int8, OUT
lp int2, OUT xmin xid)
RETURNS SETOF RECORD
LANGUAGE SQL
AS $$
SELECT blockno, lp, t_xmin
FROM
generate_series(0, pg_relation_size($1::text) / 8192 - 1) blockno,
-- every block in the relation
heap_page_items(get_raw_page($1::text, blockno::int4)) -- every
item on the page
WHERE
t_xmin IS NOT NULL -- filter out empty items
AND t_xmin != 1 -- filter out bootstrap
AND t_xmin != 2 -- filter out frozen transaction id
AND (t_infomask & ((x'0100' | x'0200')::int)) != ((x'0100' |
x'0200')::int) -- filter out frozen rows with xid present
AND age(t_xmin) > age((SELECT relfrozenxid FROM pg_class WHERE oid
= $1)) -- xid cutoff filter
$$;
SELECT * FROM check_rel('pg_authid') LIMIT 100;
Small note - Needs to be this because != is not supported for xid:
AND NOT t_xmin = 1 -- filter out bootstrap
AND NOT t_xmin = 2 -- filter out frozen transaction id
and then display all items for one of the affected pages like
SELECT * FROM heap_page_items(get_raw_page('pg_authid', 34343));Alvaro:
- Hm, we talked about code adding context for these kind of errors,
right? Is that just skipped for csvlog?
- Alvaro, does the above check_rel() function make sense?Greetings,
Andres Freund
The function does NOT show any issue with either of those tables.
One very interesting thing that is puzzling us - we have taken several san
snapshots of the system real time that are running on the exact same
version 9.5.11, and they do NOT show the same error when we vacuum these
tables. It makes us wonder if simply a db restart would solve the issue.
We will continue to investigate but interested in your feedback about what
we have seen thus far.
Thanks,
Jeremy
On 2018-03-19 14:37:24 -0500, Jeremy Finzel wrote:
We upgraded to 9.5.5, and today we are running 9.5.11. And actually we
upgraded from 9.3, not 9.4. We are still trying to figure out which point
release we were on at 9.3.
Ok. IIRC there used to be a bug a few years back that sometimes lead to
highly contended pages being skipped during vacuum, and we'd still
update relfrozenxid. IIRC it required the table to be extended at the
same time or something?
- Can you install the pageinspect extension? If so, it might be a
CREATE EXTENSION pageinspect;
CREATE OR REPLACE FUNCTION check_rel(rel regclass, OUT blockno int8, OUT
lp int2, OUT xmin xid)
RETURNS SETOF RECORD
LANGUAGE SQL
AS $$
SELECT blockno, lp, t_xmin
FROM
generate_series(0, pg_relation_size($1::text) / 8192 - 1) blockno,
-- every block in the relation
heap_page_items(get_raw_page($1::text, blockno::int4)) -- every
item on the page
WHERE
t_xmin IS NOT NULL -- filter out empty items
AND t_xmin != 1 -- filter out bootstrap
AND t_xmin != 2 -- filter out frozen transaction id
AND (t_infomask & ((x'0100' | x'0200')::int)) != ((x'0100' |
x'0200')::int) -- filter out frozen rows with xid present
AND age(t_xmin) > age((SELECT relfrozenxid FROM pg_class WHERE oid
= $1)) -- xid cutoff filter
$$;
SELECT * FROM check_rel('pg_authid') LIMIT 100;Small note - Needs to be this because != is not supported for xid:
AND NOT t_xmin = 1 -- filter out bootstrap
AND NOT t_xmin = 2 -- filter out frozen transaction id
Only on older releases ;). But yea, that looks right.
and then display all items for one of the affected pages like
SELECT * FROM heap_page_items(get_raw_page('pg_authid', 34343));Alvaro:
- Hm, we talked about code adding context for these kind of errors,
right? Is that just skipped for csvlog?
- Alvaro, does the above check_rel() function make sense?Greetings,
Andres Freund
The function does NOT show any issue with either of those tables.
Uh, huh? Alvaro, do you see a bug in my query?
Greetings,
Andres Freund
On Mon, Mar 19, 2018 at 2:41 PM, Andres Freund <andres@anarazel.de> wrote:
On 2018-03-19 14:37:24 -0500, Jeremy Finzel wrote:
We upgraded to 9.5.5, and today we are running 9.5.11. And actually we
upgraded from 9.3, not 9.4. We are still trying to figure out whichpoint
release we were on at 9.3.
Ok. IIRC there used to be a bug a few years back that sometimes lead to
highly contended pages being skipped during vacuum, and we'd still
update relfrozenxid. IIRC it required the table to be extended at the
same time or something?- Can you install the pageinspect extension? If so, it might be a
CREATE EXTENSION pageinspect;
CREATE OR REPLACE FUNCTION check_rel(rel regclass, OUT blockno int8,OUT
lp int2, OUT xmin xid)
RETURNS SETOF RECORD
LANGUAGE SQL
AS $$
SELECT blockno, lp, t_xmin
FROM
generate_series(0, pg_relation_size($1::text) / 8192 - 1)blockno,
-- every block in the relation
heap_page_items(get_raw_page($1::text, blockno::int4)) --every
item on the page
WHERE
t_xmin IS NOT NULL -- filter out empty items
AND t_xmin != 1 -- filter out bootstrap
AND t_xmin != 2 -- filter out frozen transaction id
AND (t_infomask & ((x'0100' | x'0200')::int)) != ((x'0100' |
x'0200')::int) -- filter out frozen rows with xid present
AND age(t_xmin) > age((SELECT relfrozenxid FROM pg_class WHEREoid
= $1)) -- xid cutoff filter
$$;
SELECT * FROM check_rel('pg_authid') LIMIT 100;Small note - Needs to be this because != is not supported for xid:
AND NOT t_xmin = 1 -- filter out bootstrap
AND NOT t_xmin = 2 -- filter out frozen transaction idOnly on older releases ;). But yea, that looks right.
and then display all items for one of the affected pages like
SELECT * FROM heap_page_items(get_raw_page('pg_authid', 34343));Alvaro:
- Hm, we talked about code adding context for these kind of errors,
right? Is that just skipped for csvlog?
- Alvaro, does the above check_rel() function make sense?Greetings,
Andres Freund
The function does NOT show any issue with either of those tables.
Uh, huh? Alvaro, do you see a bug in my query?
Greetings,
Andres Freund
FWIW, if I remove the last filter, I get these rows and I believe row 7/57/
2906288382 is the one generating error:
SELECT * FROM check_rel('pg_authid') LIMIT 100;
blockno | lp | xmin
---------+----+------------
7 | 4 | 2040863716
7 | 5 | 2040863716
7 | 8 | 2041172882
7 | 9 | 2041172882
7 | 12 | 2041201779
7 | 13 | 2041201779
7 | 16 | 2089742733
7 | 17 | 2090021318
7 | 18 | 2090021318
7 | 47 | 2090021898
7 | 48 | 2090021898
7 | 49 | 2102749003
7 | 50 | 2103210571
7 | 51 | 2103210571
7 | 54 | 2154640913
7 | 55 | 2163849781
7 | 56 | 2295315714
7 | 57 | 2906288382
7 | 58 | 2906329443
7 | 60 | 3131766386
8 | 1 | 2089844462
8 | 2 | 2089844462
8 | 3 | 2089844463
8 | 6 | 2089844463
8 | 9 | 2295318868
(25 rows)
Hi,
On 2018-03-19 14:53:58 -0500, Jeremy Finzel wrote:
FWIW, if I remove the last filter, I get these rows and I believe row 7/57/
2906288382 is the one generating error:
Oh, yea, that makes sense. It's wrapped around and looks like it's from
the future.
SELECT * FROM check_rel('pg_authid') LIMIT 100;
blockno | lp | xmin
---------+----+------------
7 | 4 | 2040863716
7 | 5 | 2040863716
7 | 8 | 2041172882
7 | 9 | 2041172882
7 | 12 | 2041201779
7 | 13 | 2041201779
7 | 16 | 2089742733
7 | 17 | 2090021318
7 | 18 | 2090021318
7 | 47 | 2090021898
7 | 48 | 2090021898
7 | 49 | 2102749003
7 | 50 | 2103210571
7 | 51 | 2103210571
7 | 54 | 2154640913
7 | 55 | 2163849781
7 | 56 | 2295315714
7 | 57 | 2906288382
7 | 58 | 2906329443
7 | 60 | 3131766386
8 | 1 | 2089844462
8 | 2 | 2089844462
8 | 3 | 2089844463
8 | 6 | 2089844463
8 | 9 | 2295318868
(25 rows)
Could you show the contents of those two pages with a query like I had
in an earlier email?
Greetings,
Andres Freund
On Mon, Mar 19, 2018 at 2:56 PM, Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2018-03-19 14:53:58 -0500, Jeremy Finzel wrote:
FWIW, if I remove the last filter, I get these rows and I believe row
7/57/
2906288382 is the one generating error:
Oh, yea, that makes sense. It's wrapped around and looks like it's from
the future.SELECT * FROM check_rel('pg_authid') LIMIT 100;
blockno | lp | xmin
---------+----+------------
7 | 4 | 2040863716
7 | 5 | 2040863716
7 | 8 | 2041172882
7 | 9 | 2041172882
7 | 12 | 2041201779
7 | 13 | 2041201779
7 | 16 | 2089742733
7 | 17 | 2090021318
7 | 18 | 2090021318
7 | 47 | 2090021898
7 | 48 | 2090021898
7 | 49 | 2102749003
7 | 50 | 2103210571
7 | 51 | 2103210571
7 | 54 | 2154640913
7 | 55 | 2163849781
7 | 56 | 2295315714
7 | 57 | 2906288382
7 | 58 | 2906329443
7 | 60 | 3131766386
8 | 1 | 2089844462
8 | 2 | 2089844462
8 | 3 | 2089844463
8 | 6 | 2089844463
8 | 9 | 2295318868
(25 rows)Could you show the contents of those two pages with a query like I had
in an earlier email?Greetings,
Andres Freund
SELECT heap_page_items(get_raw_page('pg_authid', 7));
heap_page_items
--------------------------------------------------------------------------------------------------------------------------------------------
(1,4720,1,108,1897434979,0,0,"(7,1)",11,2825,32,111111111000000000000000000000000000000001011000010011111100001001111000,507769370)
(2,4608,1,108,1897442758,0,18,"(7,2)",11,2825,32,111111111000000000000000000000000000000011000011101100000010001001111000,507776451)
(3,4496,1,108,1897442758,0,20,"(7,3)",11,2825,32,111111111000000000000000000000000000000000100011101100000010001001111000,507776452)
(4,4384,1,108,2040863716,0,37,"(7,4)",11,2313,32,111111111000000000000000000000000000000000110110111011100011001011111000,525105004)
(5,4272,1,108,2040863716,0,39,"(7,5)",11,2313,32,111111111000000000000000000000000000000010110110111011100011001011111000,525105005)
(6,0,3,0,,,,,,,,,)
(7,0,3,0,,,,,,,,,)
(8,4160,1,108,2041172882,0,49,"(7,8)",11,2313,32,111111111000000000000000000000000000000001110100101011000111001011111000,525219118)
(9,4048,1,108,2041172882,0,51,"(7,9)",11,2313,32,111111111000000000000000000000000000000011110100101011000111001011111000,525219119)
(10,0,3,0,,,,,,,,,)
(11,0,3,0,,,,,,,,,)
(12,3936,1,108,2041201779,0,181,"(7,12)",11,2313,32,111111111000000000000000000000000000000011010100010111100111001011111000,525236779)
(13,3824,1,108,2041201779,0,183,"(7,13)",11,2313,32,111111111000000000000000000000000000000000110100010111100111001011111000,525236780)
(14,0,3,0,,,,,,,,,)
(15,0,3,0,,,,,,,,,)
(16,3712,1,108,2089742733,0,0,"(7,16)",11,2313,32,111111111000000000000000000000000000000001000101110011100000001111111000,532706210)
(17,3600,1,108,2090021318,0,1,"(7,17)",11,2313,32,111111111000000000000000000000000000000001001100001101001000001111111000,532753458)
(18,3488,1,108,2090021318,0,3,"(7,18)",11,2313,32,111111111000000000000000000000000000000011001100001101001000001111111000,532753459)
(19,8080,1,108,753125991,0,2,"(7,19)",11,2825,32,111111111000000000000000000000000000000000110001010111001011100001110000,236796556)
(20,7968,1,108,753125991,0,4,"(7,20)",11,2825,32,111111111000000000000000000000000000000010110001010111001011100001110000,236796557)
(21,7856,1,108,753125992,0,19,"(7,21)",11,2825,32,111111111000000000000000000000000000000011001001010111001011100001110000,236796563)
(22,7744,1,108,753125992,0,21,"(7,22)",11,2825,32,111111111000000000000000000000000000000000101001010111001011100001110000,236796564)
(23,7632,1,108,753125993,0,36,"(7,23)",11,2825,32,111111111000000000000000000000000000000001011001010111001011100001110000,236796570)
(24,7520,1,108,753125993,0,38,"(7,24)",11,2825,32,111111111000000000000000000000000000000011011001010111001011100001110000,236796571)
(25,7408,1,108,753125994,0,53,"(7,25)",11,2825,32,111111111000000000000000000000000000000010000101010111001011100001110000,236796577)
(26,7296,1,108,753125994,0,55,"(7,26)",11,2825,32,111111111000000000000000000000000000000001000101010111001011100001110000,236796578)
(27,7184,1,108,753125995,0,70,"(7,27)",11,2825,32,111111111000000000000000000000000000000000010101010111001011100001110000,236796584)
(28,7072,1,108,753125995,0,72,"(7,28)",11,2825,32,111111111000000000000000000000000000000010010101010111001011100001110000,236796585)
(29,6960,1,108,753125997,0,87,"(7,29)",11,2825,32,111111111000000000000000000000000000000011001101010111001011100001110000,236796595)
(30,6848,1,108,753125997,0,89,"(7,30)",11,2825,32,111111111000000000000000000000000000000000101101010111001011100001110000,236796596)
(31,6736,1,108,753125998,0,104,"(7,31)",11,2825,32,111111111000000000000000000000000000000001011101010111001011100001110000,236796602)
(32,6624,1,108,753125998,0,106,"(7,32)",11,2825,32,111111111000000000000000000000000000000011011101010111001011100001110000,236796603)
(33,6512,1,108,753125999,0,121,"(7,33)",11,2825,32,111111111000000000000000000000000000000010000011010111001011100001110000,236796609)
(34,6400,1,108,753125999,0,123,"(7,34)",11,2825,32,111111111000000000000000000000000000000001000011010111001011100001110000,236796610)
(35,6288,1,108,753126000,0,138,"(7,35)",11,2825,32,111111111000000000000000000000000000000000010011010111001011100001110000,236796616)
(36,6176,1,108,753126000,0,140,"(7,36)",11,2825,32,111111111000000000000000000000000000000010010011010111001011100001110000,236796617)
(37,6064,1,108,753126001,0,155,"(7,37)",11,2825,32,111111111000000000000000000000000000000011110011010111001011100001110000,236796623)
(38,5952,1,108,753126001,0,157,"(7,38)",11,2825,32,111111111000000000000000000000000000000000001011010111001011100001110000,236796624)
(39,5840,1,108,753126002,0,172,"(7,39)",11,2825,32,111111111000000000000000000000000000000001101011010111001011100001110000,236796630)
(40,5728,1,108,753126002,0,174,"(7,40)",11,2825,32,111111111000000000000000000000000000000011101011010111001011100001110000,236796631)
(41,5616,1,108,753126003,0,189,"(7,41)",11,2825,32,111111111000000000000000000000000000000010111011010111001011100001110000,236796637)
(42,5504,1,108,753126003,0,191,"(7,42)",11,2825,32,111111111000000000000000000000000000000001111011010111001011100001110000,236796638)
(43,5392,1,108,753126004,0,206,"(7,43)",11,2825,32,111111111000000000000000000000000000000000100111010111001011100001110000,236796644)
(44,5280,1,108,753126004,0,208,"(7,44)",11,2825,32,111111111000000000000000000000000000000010100111010111001011100001110000,236796645)
(45,5168,1,108,753126005,0,223,"(7,45)",11,2825,32,111111111000000000000000000000000000000011010111010111001011100001110000,236796651)
(46,5056,1,108,753126005,0,225,"(7,46)",11,2825,32,111111111000000000000000000000000000000000110111010111001011100001110000,236796652)
(47,3376,1,108,2090021898,0,18,"(7,47)",11,2313,32,111111111000000000000000000000000000000001000111001101001000001111111000,532753634)
(48,3264,1,108,2090021898,0,20,"(7,48)",11,2313,32,111111111000000000000000000000000000000000100111001101001000001111111000,532753636)
(49,3152,1,108,2102749003,0,0,"(7,49)",11,2313,32,111111111000000000000000000000000000000001111011101110010000110100000100,548445662)
(50,3040,1,108,2103210571,0,107,"(7,50)",11,2313,32,111111111000000000000000000000000000000000010110101100100100110100000100,548556136)
(51,2928,1,108,2103210571,0,109,"(7,51)",11,2313,32,111111111000000000000000000000000000000010010110101100100100110100000100,548556137)
(52,0,3,0,,,,,,,,,)
(53,0,3,0,,,,,,,,,)
(54,2816,1,108,2154640913,0,0,"(7,54)",11,2313,32,111111111000000000000000000000000000000010111111010010000000100010000100,554701565)
(55,2704,1,108,2163849781,0,0,"(7,55)",11,2313,32,111111111000000000000000000000000000000000110000001011101001100010000100,555316236)
(56,2592,1,108,2295315714,0,0,"(7,56)",11,2313,32,111111111000000000000000000000000000000000100101010100001000010001000100,572590756)
(57,2480,1,108,2906288382,0,0,"(7,57)",11,2313,32,111111111000000000000000000000000000000000001000101101000100111001111100,1047670032)
(58,2368,1,108,2906329443,0,0,"(7,58)",11,2313,32,111111111000000000000000000000000000000001111001101101000100111001111100,1047670174)
(59,60,2,0,,,,,,,,,)
(60,2224,1,144,3131766386,0,0,"(7,60)",32779,10507,32,111111111100000000000000000000000000000000001101010001010011111100000010,1090298544)
(61,0,0,0,,,,,,,,,)
(62,0,0,0,,,,,,,,,)
(63,0,0,0,,,,,,,,,)
(64,0,0,0,,,,,,,,,)
(65,0,0,0,,,,,,,,,)
(66,0,0,0,,,,,,,,,)
(67,0,0,0,,,,,,,,,)
(68,0,0,0,,,,,,,,,)
(69,4944,1,108,1034607755,0,1,"(7,69)",11,2825,32,111111111000000000000000000000000000000000010111011000010010010100001000,279217896)
(70,4832,1,108,1034607755,0,3,"(7,70)",11,2825,32,111111111000000000000000000000000000000010010111011000010010010100001000,279217897)
(70 rows)
On Mon, Mar 19, 2018 at 3:01 PM, Jeremy Finzel <finzelj@gmail.com> wrote:
On Mon, Mar 19, 2018 at 2:56 PM, Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2018-03-19 14:53:58 -0500, Jeremy Finzel wrote:
FWIW, if I remove the last filter, I get these rows and I believe row
7/57/
2906288382 is the one generating error:
Oh, yea, that makes sense. It's wrapped around and looks like it's from
the future.SELECT * FROM check_rel('pg_authid') LIMIT 100;
blockno | lp | xmin
---------+----+------------
7 | 4 | 2040863716
7 | 5 | 2040863716
7 | 8 | 2041172882
7 | 9 | 2041172882
7 | 12 | 2041201779
7 | 13 | 2041201779
7 | 16 | 2089742733 <(208)%20974-2733>
7 | 17 | 2090021318
7 | 18 | 2090021318
7 | 47 | 2090021898
7 | 48 | 2090021898
7 | 49 | 2102749003 <(210)%20274-9003>
7 | 50 | 2103210571 <(210)%20321-0571>
7 | 51 | 2103210571 <(210)%20321-0571>
7 | 54 | 2154640913 <(215)%20464-0913>
7 | 55 | 2163849781 <(216)%20384-9781>
7 | 56 | 2295315714 <(229)%20531-5714>
7 | 57 | 2906288382
7 | 58 | 2906329443
7 | 60 | 3131766386
8 | 1 | 2089844462 <(208)%20984-4462>
8 | 2 | 2089844462 <(208)%20984-4462>
8 | 3 | 2089844463 <(208)%20984-4463>
8 | 6 | 2089844463 <(208)%20984-4463>
8 | 9 | 2295318868 <(229)%20531-8868>
(25 rows)Could you show the contents of those two pages with a query like I had
in an earlier email?Greetings,
Andres Freund
SELECT heap_page_items(get_raw_page('pg_authid', 7));
heap_page_items
------------------------------------------------------------
------------------------------------------------------------
--------------------
(1,4720,1,108,1897434979,0,0,"(7,1)",11,2825,32,
111111111000000000000000000000000000000001011000010011111100
001001111000,507769370)
(2,4608,1,108,1897442758,0,18,"(7,2)",11,2825,32,
111111111000000000000000000000000000000011000011101100000010
001001111000,507776451)
(3,4496,1,108,1897442758,0,20,"(7,3)",11,2825,32,
111111111000000000000000000000000000000000100011101100000010
001001111000,507776452)
(4,4384,1,108,2040863716,0,37,"(7,4)",11,2313,32,
111111111000000000000000000000000000000000110110111011100011
001011111000,525105004)
(5,4272,1,108,2040863716,0,39,"(7,5)",11,2313,32,
111111111000000000000000000000000000000010110110111011100011
001011111000,525105005)
(6,0,3,0,,,,,,,,,)
(7,0,3,0,,,,,,,,,)
(8,4160,1,108,2041172882,0,49,"(7,8)",11,2313,32,
111111111000000000000000000000000000000001110100101011000111
001011111000,525219118)
(9,4048,1,108,2041172882,0,51,"(7,9)",11,2313,32,
111111111000000000000000000000000000000011110100101011000111
001011111000,525219119)
(10,0,3,0,,,,,,,,,)
(11,0,3,0,,,,,,,,,)
(12,3936,1,108,2041201779,0,181,"(7,12)",11,2313,32,
111111111000000000000000000000000000000011010100010111100111
001011111000,525236779)
(13,3824,1,108,2041201779,0,183,"(7,13)",11,2313,32,
111111111000000000000000000000000000000000110100010111100111
001011111000,525236780)
(14,0,3,0,,,,,,,,,)
(15,0,3,0,,,,,,,,,)
(16,3712,1,108,2089742733 <(208)%20974-2733>,0,0,"(7,16)",11,2313,32,
111111111000000000000000000000000000000001000101110011100000
001111111000,532706210)
(17,3600,1,108,2090021318,0,1,"(7,17)",11,2313,32,
111111111000000000000000000000000000000001001100001101001000
001111111000,532753458)
(18,3488,1,108,2090021318,0,3,"(7,18)",11,2313,32,
111111111000000000000000000000000000000011001100001101001000
001111111000,532753459)
(19,8080,1,108,753125991,0,2,"(7,19)",11,2825,32,
111111111000000000000000000000000000000000110001010111001011
100001110000,236796556)
(20,7968,1,108,753125991,0,4,"(7,20)",11,2825,32,
111111111000000000000000000000000000000010110001010111001011
100001110000,236796557)
(21,7856,1,108,753125992,0,19,"(7,21)",11,2825,32,
111111111000000000000000000000000000000011001001010111001011
100001110000,236796563)
(22,7744,1,108,753125992,0,21,"(7,22)",11,2825,32,
111111111000000000000000000000000000000000101001010111001011
100001110000,236796564)
(23,7632,1,108,753125993,0,36,"(7,23)",11,2825,32,
111111111000000000000000000000000000000001011001010111001011
100001110000,236796570)
(24,7520,1,108,753125993,0,38,"(7,24)",11,2825,32,
111111111000000000000000000000000000000011011001010111001011
100001110000,236796571)
(25,7408,1,108,753125994,0,53,"(7,25)",11,2825,32,
111111111000000000000000000000000000000010000101010111001011
100001110000,236796577)
(26,7296,1,108,753125994,0,55,"(7,26)",11,2825,32,
111111111000000000000000000000000000000001000101010111001011
100001110000,236796578)
(27,7184,1,108,753125995,0,70,"(7,27)",11,2825,32,
111111111000000000000000000000000000000000010101010111001011
100001110000,236796584)
(28,7072,1,108,753125995,0,72,"(7,28)",11,2825,32,
111111111000000000000000000000000000000010010101010111001011
100001110000,236796585)
(29,6960,1,108,753125997,0,87,"(7,29)",11,2825,32,
111111111000000000000000000000000000000011001101010111001011
100001110000,236796595)
(30,6848,1,108,753125997,0,89,"(7,30)",11,2825,32,
111111111000000000000000000000000000000000101101010111001011
100001110000,236796596)
(31,6736,1,108,753125998,0,104,"(7,31)",11,2825,32,
111111111000000000000000000000000000000001011101010111001011
100001110000,236796602)
(32,6624,1,108,753125998,0,106,"(7,32)",11,2825,32,
111111111000000000000000000000000000000011011101010111001011
100001110000,236796603)
(33,6512,1,108,753125999,0,121,"(7,33)",11,2825,32,
111111111000000000000000000000000000000010000011010111001011
100001110000,236796609)
(34,6400,1,108,753125999,0,123,"(7,34)",11,2825,32,
111111111000000000000000000000000000000001000011010111001011
100001110000,236796610)
(35,6288,1,108,753126000,0,138,"(7,35)",11,2825,32,
111111111000000000000000000000000000000000010011010111001011
100001110000,236796616)
(36,6176,1,108,753126000,0,140,"(7,36)",11,2825,32,
111111111000000000000000000000000000000010010011010111001011
100001110000,236796617)
(37,6064,1,108,753126001,0,155,"(7,37)",11,2825,32,
111111111000000000000000000000000000000011110011010111001011
100001110000,236796623)
(38,5952,1,108,753126001,0,157,"(7,38)",11,2825,32,
111111111000000000000000000000000000000000001011010111001011
100001110000,236796624)
(39,5840,1,108,753126002,0,172,"(7,39)",11,2825,32,
111111111000000000000000000000000000000001101011010111001011
100001110000,236796630)
(40,5728,1,108,753126002,0,174,"(7,40)",11,2825,32,
111111111000000000000000000000000000000011101011010111001011
100001110000,236796631)
(41,5616,1,108,753126003,0,189,"(7,41)",11,2825,32,
111111111000000000000000000000000000000010111011010111001011
100001110000,236796637)
(42,5504,1,108,753126003,0,191,"(7,42)",11,2825,32,
111111111000000000000000000000000000000001111011010111001011
100001110000,236796638)
(43,5392,1,108,753126004,0,206,"(7,43)",11,2825,32,
111111111000000000000000000000000000000000100111010111001011
100001110000,236796644)
(44,5280,1,108,753126004,0,208,"(7,44)",11,2825,32,
111111111000000000000000000000000000000010100111010111001011
100001110000,236796645)
(45,5168,1,108,753126005,0,223,"(7,45)",11,2825,32,
111111111000000000000000000000000000000011010111010111001011
100001110000,236796651)
(46,5056,1,108,753126005,0,225,"(7,46)",11,2825,32,
111111111000000000000000000000000000000000110111010111001011
100001110000,236796652)
(47,3376,1,108,2090021898,0,18,"(7,47)",11,2313,32,
111111111000000000000000000000000000000001000111001101001000
001111111000,532753634)
(48,3264,1,108,2090021898,0,20,"(7,48)",11,2313,32,
111111111000000000000000000000000000000000100111001101001000
001111111000,532753636)
(49,3152,1,108,2102749003 <(210)%20274-9003>,0,0,"(7,49)",11,2313,32,
111111111000000000000000000000000000000001111011101110010000
110100000100,548445662)
(50,3040,1,108,2103210571 <(210)%20321-0571>,0,107,"(7,50)",11,2313,32,
111111111000000000000000000000000000000000010110101100100100
110100000100,548556136)
(51,2928,1,108,2103210571 <(210)%20321-0571>,0,109,"(7,51)",11,2313,32,
111111111000000000000000000000000000000010010110101100100100
110100000100,548556137)
(52,0,3,0,,,,,,,,,)
(53,0,3,0,,,,,,,,,)
(54,2816,1,108,2154640913 <(215)%20464-0913>,0,0,"(7,54)",11,2313,32,
111111111000000000000000000000000000000010111111010010000000
100010000100,554701565)
(55,2704,1,108,2163849781 <(216)%20384-9781>,0,0,"(7,55)",11,2313,32,
111111111000000000000000000000000000000000110000001011101001
100010000100,555316236)
(56,2592,1,108,2295315714 <(229)%20531-5714>,0,0,"(7,56)",11,2313,32,
111111111000000000000000000000000000000000100101010100001000
010001000100,572590756)
(57,2480,1,108,2906288382,0,0,"(7,57)",11,2313,32,
111111111000000000000000000000000000000000001000101101000100
111001111100,1047670032)
(58,2368,1,108,2906329443,0,0,"(7,58)",11,2313,32,
111111111000000000000000000000000000000001111001101101000100
111001111100,1047670174)
(59,60,2,0,,,,,,,,,)
(60,2224,1,144,3131766386,0,0,"(7,60)",32779,10507,32,
111111111100000000000000000000000000000000001101010001010011
111100000010,1090298544)
(61,0,0,0,,,,,,,,,)
(62,0,0,0,,,,,,,,,)
(63,0,0,0,,,,,,,,,)
(64,0,0,0,,,,,,,,,)
(65,0,0,0,,,,,,,,,)
(66,0,0,0,,,,,,,,,)
(67,0,0,0,,,,,,,,,)
(68,0,0,0,,,,,,,,,)
(69,4944,1,108,1034607755,0,1,"(7,69)",11,2825,32,
111111111000000000000000000000000000000000010111011000010010
010100001000,279217896)
(70,4832,1,108,1034607755,0,3,"(7,70)",11,2825,32,
111111111000000000000000000000000000000010010111011000010010
010100001000,279217897)
(70 rows)
Does the fact that a snapshot does not have this issue suggest it could be
memory-related corruption and a db restart could clear it up?
Thanks,
Jeremy
Hi,
On 2018-03-19 15:37:51 -0500, Jeremy Finzel wrote:
Does the fact that a snapshot does not have this issue suggest it could be
memory-related corruption and a db restart could clear it up?
Could you show the page from the snapshot? I suspect it might just be a
problem that's temporarily not visible as corrupted.
Greetings,
Andres Freund
On Mon, Mar 19, 2018 at 1:01 PM, Jeremy Finzel <finzelj@gmail.com> wrote:
SELECT heap_page_items(get_raw_page('pg_authid', 7));
Can you post this?
SELECT * FROM page_header(get_raw_page('pg_authid', 7));
--
Peter Geoghegan
On Mon, Mar 19, 2018 at 3:53 PM, Peter Geoghegan <pg@bowt.ie> wrote:
On Mon, Mar 19, 2018 at 1:01 PM, Jeremy Finzel <finzelj@gmail.com> wrote:
SELECT heap_page_items(get_raw_page('pg_authid', 7));
Can you post this?
SELECT * FROM page_header(get_raw_page('pg_authid', 7));
--
Peter Geoghegan
@Peter :
staging=# SELECT * FROM page_header(get_raw_page('pg_authid', 7));
lsn | checksum | flags | lower | upper | special | pagesize |
version | prune_xid
----------------+----------+-------+-------+-------+---------+----------+---------+-----------
262B4/10FDC478 | 0 | 1 | 304 | 2224 | 8192 | 8192 |
4 | 0
(1 row)
@Andres :
This is from snapshot (on 9.5.12, but we didn't have the error either on a
9.5.11 snap):
heap_page_items
-----------------------------------------------------------------------------------------------------------------------------------------
(1,0,0,0,,,,,,,,,)
(2,0,0,0,,,,,,,,,)
(3,0,0,0,,,,,,,,,)
(4,0,0,0,,,,,,,,,)
(5,0,0,0,,,,,,,,,)
(6,8080,1,108,3137434815,0,0,"(7,6)",11,10505,32,111111111000000000000000000000000000000001111110111100001000010010000001,2166427518)
(7,7936,1,144,3137434816,0,0,"(7,7)",11,10507,32,111111111100000000000000000000000000000001110111000001110101101100001011,3504005358)
(8,0,0,0,,,,,,,,,)
(9,0,0,0,,,,,,,,,)
(10,7792,1,144,3137434817,0,0,"(7,10)",11,10507,32,111111111100000000000000000000000000000001010110001101000011011111101000,401353834)
(11,7680,1,108,3137434818,0,0,"(7,11)",11,10505,32,111111111000000000000000000000000000000001100011010010010001000001100001,2248708806)
(12,0,0,0,,,,,,,,,)
(13,0,0,0,,,,,,,,,)
(14,7568,1,108,3137434819,0,0,"(7,14)",11,10505,32,111111111000000000000000000000000000000001110011110011011011100010100101,2770187214)
(15,7456,1,108,3137434820,0,0,"(7,15)",11,10505,32,111111111000000000000000000000000000000011110001010001010011110010100001,2235343503)
(16,0,0,0,,,,,,,,,)
(17,0,0,0,,,,,,,,,)
(18,0,0,0,,,,,,,,,)
(19,0,0,0,,,,,,,,,)
(20,0,0,0,,,,,,,,,)
(21,0,0,0,,,,,,,,,)
(22,0,0,0,,,,,,,,,)
(23,0,0,0,,,,,,,,,)
(24,0,0,0,,,,,,,,,)
(25,0,0,0,,,,,,,,,)
(26,0,0,0,,,,,,,,,)
(27,0,0,0,,,,,,,,,)
(28,0,0,0,,,,,,,,,)
(29,0,0,0,,,,,,,,,)
(30,0,0,0,,,,,,,,,)
(31,0,0,0,,,,,,,,,)
(32,0,0,0,,,,,,,,,)
(33,0,0,0,,,,,,,,,)
(34,0,0,0,,,,,,,,,)
(35,0,0,0,,,,,,,,,)
(36,0,0,0,,,,,,,,,)
(37,0,0,0,,,,,,,,,)
(38,0,0,0,,,,,,,,,)
(39,0,0,0,,,,,,,,,)
(40,0,0,0,,,,,,,,,)
(41,0,0,0,,,,,,,,,)
(42,0,0,0,,,,,,,,,)
(43,0,0,0,,,,,,,,,)
(44,0,0,0,,,,,,,,,)
(45,0,0,0,,,,,,,,,)
(46,0,0,0,,,,,,,,,)
(47,0,0,0,,,,,,,,,)
(48,0,0,0,,,,,,,,,)
(49,0,0,0,,,,,,,,,)
(50,0,0,0,,,,,,,,,)
(51,0,0,0,,,,,,,,,)
(52,7344,1,108,3137434821,0,0,"(7,52)",11,10505,32,111111111000000000000000000000000000000011011011111110001010010101000001,2191859675)
(53,7232,1,108,3137434822,0,0,"(7,53)",11,10505,32,111111111000000000000000000000000000000001101011010111100110011011100100,661027542)
(54,0,0,0,,,,,,,,,)
(55,0,0,0,,,,,,,,,)
(56,0,0,0,,,,,,,,,)
(57,0,0,0,,,,,,,,,)
(58,0,0,0,,,,,,,,,)
(59,0,0,0,,,,,,,,,)
(60,0,0,0,,,,,,,,,)
(61,7120,1,108,3137434823,0,0,"(7,61)",11,10505,32,111111111000000000000000000000000000000000010111010110000101010111010100,732568296)
(62,6976,1,144,3137434824,0,0,"(7,62)",11,10507,32,111111111100000000000000000000000000000010100100001001001010110000010100,674571301)
(63,6864,1,108,3137434825,0,0,"(7,63)",11,10505,32,111111111000000000000000000000000000000010011011011111011100001001010001,2319695577)
(64,6720,1,144,3137434826,0,0,"(7,64)",11,10507,32,111111111100000000000000000000000000000001000010011001111011100100101000,345892418)
(65,6608,1,108,3137434827,0,0,"(7,65)",11,10505,32,111111111000000000000000000000000000000001000001010100010010101011001010,1398049410)
(66,6496,1,108,3137434828,0,0,"(7,66)",11,10505,32,111111111000000000000000000000000000000000101000111100110101001000101010,1414188820)
(67,6384,1,108,3137434829,0,0,"(7,67)",11,10505,32,111111111000000000000000000000000000000000110101011101111011001110101001,2513301164)
(68,0,0,0,,,,,,,,,)
(69,0,0,0,,,,,,,,,)
(70,0,0,0,,,,,,,,,)
(70 rows)
Thanks,
Jeremy
On Mon, Mar 19, 2018 at 1:55 PM, Jeremy Finzel <finzelj@gmail.com> wrote:
@Peter :
staging=# SELECT * FROM page_header(get_raw_page('pg_authid', 7));
lsn | checksum | flags | lower | upper | special | pagesize |
version | prune_xid
----------------+----------+-------+-------+-------+---------+----------+---------+-----------
262B4/10FDC478 | 0 | 1 | 304 | 2224 | 8192 | 8192 |
4 | 0
(1 row)
Thanks.
That looks normal. I wonder if the contents of that page looks
consistent with the rest of the table following manual inspection,
though. I recently saw system catalog corruption on a 9.5 instance
where an entirely different relation's page ended up in pg_attribute
and pg_depend. They were actually pristine index pages from an
application index. I still have no idea why this happened.
This is very much a guess, but it can't hurt to check if the contents
of the tuples themselves are actually sane by inspecting them with
"SELECT * FROM pg_authid". heap_page_items() doesn't actually care
about the shape of the tuples in the page, so this might have been
missed.
--
Peter Geoghegan
On Mon, Mar 19, 2018 at 4:12 PM, Peter Geoghegan <pg@bowt.ie> wrote:
On Mon, Mar 19, 2018 at 1:55 PM, Jeremy Finzel <finzelj@gmail.com> wrote:
@Peter :
staging=# SELECT * FROM page_header(get_raw_page('pg_authid', 7));
lsn | checksum | flags | lower | upper | special | pagesize |
version | prune_xid
----------------+----------+-------+-------+-------+---------+----------+---------+-----------
262B4/10FDC478 | 0 | 1 | 304 | 2224 | 8192 | 8192 |
4 | 0
(1 row)Thanks.
That looks normal. I wonder if the contents of that page looks
consistent with the rest of the table following manual inspection,
though. I recently saw system catalog corruption on a 9.5 instance
where an entirely different relation's page ended up in pg_attribute
and pg_depend. They were actually pristine index pages from an
application index. I still have no idea why this happened.This is very much a guess, but it can't hurt to check if the contents
of the tuples themselves are actually sane by inspecting them with
"SELECT * FROM pg_authid". heap_page_items() doesn't actually care
about the shape of the tuples in the page, so this might have been
missed.--
Peter Geoghegan
The data all looks fine. I even forced the index scan on both indexes
which also looks fine.
Thanks,
Jeremy
On Mon, Mar 19, 2018 at 3:55 PM, Jeremy Finzel <finzelj@gmail.com> wrote:
On Mon, Mar 19, 2018 at 3:53 PM, Peter Geoghegan <pg@bowt.ie> wrote:
On Mon, Mar 19, 2018 at 1:01 PM, Jeremy Finzel <finzelj@gmail.com> wrote:
SELECT heap_page_items(get_raw_page('pg_authid', 7));
Can you post this?
SELECT * FROM page_header(get_raw_page('pg_authid', 7));
--
Peter Geoghegan@Peter :
staging=# SELECT * FROM page_header(get_raw_page('pg_authid', 7));
lsn | checksum | flags | lower | upper | special | pagesize |
version | prune_xid
----------------+----------+-------+-------+-------+--------
-+----------+---------+-----------
262B4/10FDC478 | 0 | 1 | 304 | 2224 | 8192 | 8192 |
4 | 0
(1 row)@Andres :
This is from snapshot (on 9.5.12, but we didn't have the error either on a
9.5.11 snap):heap_page_items
------------------------------------------------------------
------------------------------------------------------------
-----------------
(1,0,0,0,,,,,,,,,)
(2,0,0,0,,,,,,,,,)
(3,0,0,0,,,,,,,,,)
(4,0,0,0,,,,,,,,,)
(5,0,0,0,,,,,,,,,)
(6,8080,1,108,3137434815 <(313)%20743-4815>,0,0,"(7,6)",11,10505,32,
111111111000000000000000000000000000000001111110111100001000
010010000001,2166427518)
(7,7936,1,144,3137434816 <(313)%20743-4816>,0,0,"(7,7)",11,10507,32,
111111111100000000000000000000000000000001110111000001110101
101100001011,3504005358)
(8,0,0,0,,,,,,,,,)
(9,0,0,0,,,,,,,,,)
(10,7792,1,144,3137434817 <(313)%20743-4817>,0,0,"(7,10)",11,10507,32,
111111111100000000000000000000000000000001010110001101000011
011111101000,401353834)
(11,7680,1,108,3137434818 <(313)%20743-4818>,0,0,"(7,11)",11,10505,32,
111111111000000000000000000000000000000001100011010010010001
000001100001,2248708806)
(12,0,0,0,,,,,,,,,)
(13,0,0,0,,,,,,,,,)
(14,7568,1,108,3137434819 <(313)%20743-4819>,0,0,"(7,14)",11,10505,32,
111111111000000000000000000000000000000001110011110011011011
100010100101,2770187214)
(15,7456,1,108,3137434820 <(313)%20743-4820>,0,0,"(7,15)",11,10505,32,
111111111000000000000000000000000000000011110001010001010011
110010100001,2235343503)
(16,0,0,0,,,,,,,,,)
(17,0,0,0,,,,,,,,,)
(18,0,0,0,,,,,,,,,)
(19,0,0,0,,,,,,,,,)
(20,0,0,0,,,,,,,,,)
(21,0,0,0,,,,,,,,,)
(22,0,0,0,,,,,,,,,)
(23,0,0,0,,,,,,,,,)
(24,0,0,0,,,,,,,,,)
(25,0,0,0,,,,,,,,,)
(26,0,0,0,,,,,,,,,)
(27,0,0,0,,,,,,,,,)
(28,0,0,0,,,,,,,,,)
(29,0,0,0,,,,,,,,,)
(30,0,0,0,,,,,,,,,)
(31,0,0,0,,,,,,,,,)
(32,0,0,0,,,,,,,,,)
(33,0,0,0,,,,,,,,,)
(34,0,0,0,,,,,,,,,)
(35,0,0,0,,,,,,,,,)
(36,0,0,0,,,,,,,,,)
(37,0,0,0,,,,,,,,,)
(38,0,0,0,,,,,,,,,)
(39,0,0,0,,,,,,,,,)
(40,0,0,0,,,,,,,,,)
(41,0,0,0,,,,,,,,,)
(42,0,0,0,,,,,,,,,)
(43,0,0,0,,,,,,,,,)
(44,0,0,0,,,,,,,,,)
(45,0,0,0,,,,,,,,,)
(46,0,0,0,,,,,,,,,)
(47,0,0,0,,,,,,,,,)
(48,0,0,0,,,,,,,,,)
(49,0,0,0,,,,,,,,,)
(50,0,0,0,,,,,,,,,)
(51,0,0,0,,,,,,,,,)
(52,7344,1,108,3137434821 <(313)%20743-4821>,0,0,"(7,52)",11,10505,32,
111111111000000000000000000000000000000011011011111110001010
010101000001,2191859675)
(53,7232,1,108,3137434822 <(313)%20743-4822>,0,0,"(7,53)",11,10505,32,
111111111000000000000000000000000000000001101011010111100110
011011100100,661027542)
(54,0,0,0,,,,,,,,,)
(55,0,0,0,,,,,,,,,)
(56,0,0,0,,,,,,,,,)
(57,0,0,0,,,,,,,,,)
(58,0,0,0,,,,,,,,,)
(59,0,0,0,,,,,,,,,)
(60,0,0,0,,,,,,,,,)
(61,7120,1,108,3137434823 <(313)%20743-4823>,0,0,"(7,61)",11,10505,32,
111111111000000000000000000000000000000000010111010110000101
010111010100,732568296)
(62,6976,1,144,3137434824 <(313)%20743-4824>,0,0,"(7,62)",11,10507,32,
111111111100000000000000000000000000000010100100001001001010
110000010100,674571301)
(63,6864,1,108,3137434825 <(313)%20743-4825>,0,0,"(7,63)",11,10505,32,
111111111000000000000000000000000000000010011011011111011100
001001010001,2319695577)
(64,6720,1,144,3137434826 <(313)%20743-4826>,0,0,"(7,64)",11,10507,32,
111111111100000000000000000000000000000001000010011001111011
100100101000,345892418)
(65,6608,1,108,3137434827 <(313)%20743-4827>,0,0,"(7,65)",11,10505,32,
111111111000000000000000000000000000000001000001010100010010
101011001010,1398049410)
(66,6496,1,108,3137434828 <(313)%20743-4828>,0,0,"(7,66)",11,10505,32,
111111111000000000000000000000000000000000101000111100110101
001000101010,1414188820)
(67,6384,1,108,3137434829 <(313)%20743-4829>,0,0,"(7,67)",11,10505,32,
111111111000000000000000000000000000000000110101011101111011
001110101001,2513301164)
(68,0,0,0,,,,,,,,,)
(69,0,0,0,,,,,,,,,)
(70,0,0,0,,,,,,,,,)
(70 rows)Thanks,
Jeremy
Any suggestions as to what I can do from this point? Is it feasible that a
server restart would fix this, or are you saying it would just mask the
problem? I can't reproduce it on a snapshot which is quite odd.
FWIW, maybe this is obvious, but the pages look identical on the streamers
we have to the master.
Thanks,
Jeremy
On Tue, Mar 20, 2018 at 11:19 AM, Jeremy Finzel <finzelj@gmail.com> wrote:
On Mon, Mar 19, 2018 at 3:55 PM, Jeremy Finzel <finzelj@gmail.com> wrote:
On Mon, Mar 19, 2018 at 3:53 PM, Peter Geoghegan <pg@bowt.ie> wrote:
On Mon, Mar 19, 2018 at 1:01 PM, Jeremy Finzel <finzelj@gmail.com>
wrote:SELECT heap_page_items(get_raw_page('pg_authid', 7));
Can you post this?
SELECT * FROM page_header(get_raw_page('pg_authid', 7));
--
Peter Geoghegan@Peter :
staging=# SELECT * FROM page_header(get_raw_page('pg_authid', 7));
lsn | checksum | flags | lower | upper | special | pagesize |
version | prune_xid
----------------+----------+-------+-------+-------+--------
-+----------+---------+-----------
262B4/10FDC478 | 0 | 1 | 304 | 2224 | 8192 | 8192
| 4 | 0
(1 row)@Andres :
This is from snapshot (on 9.5.12, but we didn't have the error either on
a 9.5.11 snap):heap_page_items
------------------------------------------------------------
------------------------------------------------------------
-----------------
(1,0,0,0,,,,,,,,,)
(2,0,0,0,,,,,,,,,)
(3,0,0,0,,,,,,,,,)
(4,0,0,0,,,,,,,,,)
(5,0,0,0,,,,,,,,,)
(6,8080,1,108,3137434815 <(313)%20743-4815>,0,0,
"(7,6)",11,10505,32,1111111110000000000000000000000000000000
01111110111100001000010010000001,2166427518 <(216)%20642-7518>)
(7,7936,1,144,3137434816 <(313)%20743-4816>,0,0,
"(7,7)",11,10507,32,1111111111000000000000000000000000000000
01110111000001110101101100001011,3504005358)
(8,0,0,0,,,,,,,,,)
(9,0,0,0,,,,,,,,,)
(10,7792,1,144,3137434817 <(313)%20743-4817>,0,0
,"(7,10)",11,10507,32,11111111110000000000000000000000000000
0001010110001101000011011111101000,401353834)
(11,7680,1,108,3137434818 <(313)%20743-4818>,0,0
,"(7,11)",11,10505,32,11111111100000000000000000000000000000
0001100011010010010001000001100001,2248708806 <(224)%20870-8806>)
(12,0,0,0,,,,,,,,,)
(13,0,0,0,,,,,,,,,)
(14,7568,1,108,3137434819 <(313)%20743-4819>,0,0
,"(7,14)",11,10505,32,11111111100000000000000000000000000000
0001110011110011011011100010100101,2770187214)
(15,7456,1,108,3137434820 <(313)%20743-4820>,0,0
,"(7,15)",11,10505,32,11111111100000000000000000000000000000
0011110001010001010011110010100001,2235343503 <(223)%20534-3503>)
(16,0,0,0,,,,,,,,,)
(17,0,0,0,,,,,,,,,)
(18,0,0,0,,,,,,,,,)
(19,0,0,0,,,,,,,,,)
(20,0,0,0,,,,,,,,,)
(21,0,0,0,,,,,,,,,)
(22,0,0,0,,,,,,,,,)
(23,0,0,0,,,,,,,,,)
(24,0,0,0,,,,,,,,,)
(25,0,0,0,,,,,,,,,)
(26,0,0,0,,,,,,,,,)
(27,0,0,0,,,,,,,,,)
(28,0,0,0,,,,,,,,,)
(29,0,0,0,,,,,,,,,)
(30,0,0,0,,,,,,,,,)
(31,0,0,0,,,,,,,,,)
(32,0,0,0,,,,,,,,,)
(33,0,0,0,,,,,,,,,)
(34,0,0,0,,,,,,,,,)
(35,0,0,0,,,,,,,,,)
(36,0,0,0,,,,,,,,,)
(37,0,0,0,,,,,,,,,)
(38,0,0,0,,,,,,,,,)
(39,0,0,0,,,,,,,,,)
(40,0,0,0,,,,,,,,,)
(41,0,0,0,,,,,,,,,)
(42,0,0,0,,,,,,,,,)
(43,0,0,0,,,,,,,,,)
(44,0,0,0,,,,,,,,,)
(45,0,0,0,,,,,,,,,)
(46,0,0,0,,,,,,,,,)
(47,0,0,0,,,,,,,,,)
(48,0,0,0,,,,,,,,,)
(49,0,0,0,,,,,,,,,)
(50,0,0,0,,,,,,,,,)
(51,0,0,0,,,,,,,,,)
(52,7344,1,108,3137434821 <(313)%20743-4821>,0,0
,"(7,52)",11,10505,32,11111111100000000000000000000000000000
0011011011111110001010010101000001,2191859675)
(53,7232,1,108,3137434822 <(313)%20743-4822>,0,0
,"(7,53)",11,10505,32,11111111100000000000000000000000000000
0001101011010111100110011011100100,661027542)
(54,0,0,0,,,,,,,,,)
(55,0,0,0,,,,,,,,,)
(56,0,0,0,,,,,,,,,)
(57,0,0,0,,,,,,,,,)
(58,0,0,0,,,,,,,,,)
(59,0,0,0,,,,,,,,,)
(60,0,0,0,,,,,,,,,)
(61,7120,1,108,3137434823 <(313)%20743-4823>,0,0
,"(7,61)",11,10505,32,11111111100000000000000000000000000000
0000010111010110000101010111010100,732568296)
(62,6976,1,144,3137434824 <(313)%20743-4824>,0,0
,"(7,62)",11,10507,32,11111111110000000000000000000000000000
0010100100001001001010110000010100,674571301)
(63,6864,1,108,3137434825 <(313)%20743-4825>,0,0
,"(7,63)",11,10505,32,11111111100000000000000000000000000000
0010011011011111011100001001010001,2319695577 <(231)%20969-5577>)
(64,6720,1,144,3137434826 <(313)%20743-4826>,0,0
,"(7,64)",11,10507,32,11111111110000000000000000000000000000
0001000010011001111011100100101000,345892418)
(65,6608,1,108,3137434827 <(313)%20743-4827>,0,0
,"(7,65)",11,10505,32,11111111100000000000000000000000000000
0001000001010100010010101011001010,1398049410)
(66,6496,1,108,3137434828 <(313)%20743-4828>,0,0
,"(7,66)",11,10505,32,11111111100000000000000000000000000000
0000101000111100110101001000101010,1414188820)
(67,6384,1,108,3137434829 <(313)%20743-4829>,0,0
,"(7,67)",11,10505,32,11111111100000000000000000000000000000
0000110101011101111011001110101001,2513301164 <(251)%20330-1164>)
(68,0,0,0,,,,,,,,,)
(69,0,0,0,,,,,,,,,)
(70,0,0,0,,,,,,,,,)
(70 rows)Thanks,
JeremyAny suggestions as to what I can do from this point? Is it feasible that
a server restart would fix this, or are you saying it would just mask the
problem? I can't reproduce it on a snapshot which is quite odd.FWIW, maybe this is obvious, but the pages look identical on the streamers
we have to the master.Thanks,
Jeremy
A server restart and upgrade to 9.5.12 (at the same time), as expected,
made the issue go away. Still doesn't give us any answers as to what
happened or if it would happen again! Thanks for the feeback.
Jeremy
On Wed, Mar 21, 2018 at 1:38 PM, Jeremy Finzel <finzelj@gmail.com> wrote:
A server restart and upgrade to 9.5.12 (at the same time), as expected, made
the issue go away. Still doesn't give us any answers as to what happened or
if it would happen again! Thanks for the feeback.
You may still want to use amcheck to look for problems. The version on
Github works with 9.5, and there are Redhat and Debian pgdg packages.
See:
https://github.com/petergeoghegan/amcheck
The "heapallindexed" option will be of particular interest to you -
that option verifies that the table has matching rows for a target
index (in addition to testing the structure of a target B-Tree index
itself). This is probably the best general test for corruption that is
available. There is a fair chance that this will reveal new
information.
--
Peter Geoghegan
On Wed, Mar 21, 2018 at 4:29 PM, Peter Geoghegan <pg@bowt.ie> wrote:
On Wed, Mar 21, 2018 at 1:38 PM, Jeremy Finzel <finzelj@gmail.com> wrote:
A server restart and upgrade to 9.5.12 (at the same time), as expected,
made
the issue go away. Still doesn't give us any answers as to what
happened or
if it would happen again! Thanks for the feeback.
You may still want to use amcheck to look for problems. The version on
Github works with 9.5, and there are Redhat and Debian pgdg packages.
See:https://github.com/petergeoghegan/amcheck
The "heapallindexed" option will be of particular interest to you -
that option verifies that the table has matching rows for a target
index (in addition to testing the structure of a target B-Tree index
itself). This is probably the best general test for corruption that is
available. There is a fair chance that this will reveal new
information.--
Peter Geoghegan
Thank you for the recommendation. I ran both amcheck functions on all 4
indexes of those 2 tables with heapallindexed = true, but no issues were
found.
Thanks,
Jeremy