ERROR invalid page header in block xxx of relation base/xxxxx/xxxxx/

Started by Sreekanth Palluruover 9 years ago4 messagesgeneral
Jump to latest
#1Sreekanth Palluru
sree4pg@gmail.com

Hi ,

I am getting invalid page header error and what I could observe is when I
select the table I get this error , where as if I select table and order by
primary key I can retrieve the rows from table.

And I don't see any dataloss ( based on total number records) after fixing
the blocks using zero_damaged_pages=on and then vacuum full on the table

Please note I have renamed few tables to avoid giving actual table names

We run Postgres 9.2 version on Windows and Dell Optiplex XE2 server and we
suspect that block corruption would have happen due to
hardware/memory/power failures reasons and I have gone through wikik
https://wiki.postgresql.org/wiki/Reliable_Writes.

I want to understand why we can read the table through where there exists
index and explain plan shows Index scan and with high cost compare to seq
scan .
I assume that since there no rows/data present in these corrupted blocks
index scan skips these blocks and hence it is not throwing the error .

Also , I want to know what would have caused the postgres to create these
corrupted blocks and
can I reproduce this error ? appreciate if you share any pointers to
blogs/mailing lists if this type of issue is already discussed ?

create table a.parametertable_bak as select * from a.parametertable order
by id;

labs=# select count(*) from a.parametertable_bak ;
count
-------
31415
(1 row)

labs=#
labs=# checkpoint;
CHECKPOINT
labs=# set zero_damaged_pages=on;
SET
labs=# vacuum full a.parametertable;
WARNING: invalid page header in block 204 of relation base/16413/16900;
zeroing out page
WARNING: invalid page header in block 205 of relation base/16413/16900;
zeroing out page
VACUUM
labs=# select count(*) from a.parametertable ;
count
-------
31415
(1 row)

labs=#
-Sreekanth

#2Jorge Torralba
jorge.torralba@gmail.com
In reply to: Sreekanth Palluru (#1)
Re: ERROR invalid page header in block xxx of relation base/xxxxx/xxxxx/

Look into this setting.

zero_damaged_pages = on;

The docs should explain it.

On Wed, Dec 7, 2016 at 2:47 PM, sreekanth Palluru <sree4pg@gmail.com> wrote:

Hi ,

I am getting invalid page header error and what I could observe is when I
select the table I get this error , where as if I select table and order by
primary key I can retrieve the rows from table.

And I don't see any dataloss ( based on total number records) after fixing
the blocks using zero_damaged_pages=on and then vacuum full on the table

Please note I have renamed few tables to avoid giving actual table names

We run Postgres 9.2 version on Windows and Dell Optiplex XE2 server and we
suspect that block corruption would have happen due to
hardware/memory/power failures reasons and I have gone through wikik
https://wiki.postgresql.org/wiki/Reliable_Writes.

I want to understand why we can read the table through where there exists
index and explain plan shows Index scan and with high cost compare to seq
scan .
I assume that since there no rows/data present in these corrupted blocks
index scan skips these blocks and hence it is not throwing the error .

Also , I want to know what would have caused the postgres to create these
corrupted blocks and
can I reproduce this error ? appreciate if you share any pointers to
blogs/mailing lists if this type of issue is already discussed ?

create table a.parametertable_bak as select * from a.parametertable order
by id;

labs=# select count(*) from a.parametertable_bak ;
count
-------
31415
(1 row)

labs=#
labs=# checkpoint;
CHECKPOINT
labs=# set zero_damaged_pages=on;
SET
labs=# vacuum full a.parametertable;
WARNING: invalid page header in block 204 of relation base/16413/16900;
zeroing out page
WARNING: invalid page header in block 205 of relation base/16413/16900;
zeroing out page
VACUUM
labs=# select count(*) from a.parametertable ;
count
-------
31415
(1 row)

labs=#
-Sreekanth

--
Thanks,

Jorge Torralba
----------------------------

Note: This communication may contain privileged or other confidential
information. If you are not the intended recipient, please do not print,
copy, retransmit, disseminate or otherwise use the information. Please
indicate to the sender that you have received this email in error and
delete the copy you received. Thank You.

#3Sreekanth Palluru
sree4pg@gmail.com
In reply to: Sreekanth Palluru (#1)
Fwd: ERROR invalid page header in block xxx of relation base/xxxxx/xxxxx/

Forgot to loop community

Hi Jorge/All,
Thanks for the reply.
As per the documentation , I think backend reads the this page header and
reports that it is damaged.
I am looking at ways we re-create this scenario of creation this blank
corrupted page ? do I have any control over Backend and after it
initializes a new page using function PageInit
<https://doxygen.postgresql.org/bufpage_8c.html#ab871202326b101c6ec24b7f628157c2c&gt;
(Page
<https://doxygen.postgresql.org/bufpage_8h.html#a2010e3258a7075b32ad5750134ab9c5c&gt;
page, Size
<https://doxygen.postgresql.org/c_8h.html#af9ecec2d692138fab9167164a457cbd4&gt;
pageSize, Size
<https://doxygen.postgresql.org/c_8h.html#af9ecec2d692138fab9167164a457cbd4&gt;
specialSize) and I want to halt/crash the backend by stopping postgres
database or through Hardware/system crash ?
Is this possible ?

Also I see that based on below comments from Source code, having such pages
in database is normal . is my understanding correct ?

/*
63 * PageIsVerified
64 * Check that the page header and checksum (if any) appear valid.
65 *
66 * This is called when a page has just been read in from disk. The idea
is
67 * to cheaply detect trashed pages before we go nuts following bogus item
68 * pointers, testing invalid transaction identifiers, etc.
69 *
70 * It turns out to be necessary to allow zeroed pages here too. Even
though
71 * this routine is *not* called when deliberately adding a page to a
relation,
72 * there are scenarios in which a zeroed page might be found in a table.
73 * (Example: a backend extends a relation, then crashes before it can
write
74 * any WAL entry about the new page. The kernel will already have the
75 * zeroed page in the file, and it will stay that way after restart.) So
we
76 * allow zeroed pages here, and are careful that the page access macros
77 * treat such a page as empty and without free space. Eventually, VACUUM
78 * will clean up such a page and make it usable.
79 */
zero_damaged_pages (boolean)

Detection of a damaged page header normally causes PostgreSQL to report an
error, aborting the current transaction. Setting zero_damaged_pages to on
causes the system to instead report a warning, zero out the damaged page in
memory, and continue processing. This behavior will destroy data, namely
all the rows on the damaged page. However, it does allow you to get past
the error and retrieve rows from any undamaged pages that might be present
in the table. It is useful for recovering data if corruption has occurred
due to a hardware or software error. You should generally not set this on
until you have given up hope of recovering data from the damaged pages of a
table. Zeroed-out pages are not forced to disk so it is recommended to
recreate the table or the index before turning this parameter off again.
The default setting is off, and it can only be changed by a superuser.

On Thu, Dec 8, 2016 at 9:50 AM, Jorge Torralba <jorge.torralba@gmail.com>
wrote:

Look into this setting.

zero_damaged_pages = on;

The docs should explain it.

On Wed, Dec 7, 2016 at 2:47 PM, sreekanth Palluru <sree4pg@gmail.com>
wrote:

Hi ,

I am getting invalid page header error and what I could observe is when I
select the table I get this error , where as if I select table and order by
primary key I can retrieve the rows from table.

And I don't see any dataloss ( based on total number records) after
fixing the blocks using zero_damaged_pages=on and then vacuum full on the
table

Please note I have renamed few tables to avoid giving actual table names

We run Postgres 9.2 version on Windows and Dell Optiplex XE2 server and
we suspect that block corruption would have happen due to
hardware/memory/power failures reasons and I have gone through wikik
https://wiki.postgresql.org/wiki/Reliable_Writes.

I want to understand why we can read the table through where there exists
index and explain plan shows Index scan and with high cost compare to seq
scan .
I assume that since there no rows/data present in these corrupted blocks
index scan skips these blocks and hence it is not throwing the error .

Also , I want to know what would have caused the postgres to create these
corrupted blocks and
can I reproduce this error ? appreciate if you share any pointers to
blogs/mailing lists if this type of issue is already discussed ?

create table a.parametertable_bak as select * from a.parametertable order
by id;

labs=# select count(*) from a.parametertable_bak ;
count
-------
31415
(1 row)

labs=#
labs=# checkpoint;
CHECKPOINT
labs=# set zero_damaged_pages=on;
SET
labs=# vacuum full a.parametertable;
WARNING: invalid page header in block 204 of relation base/16413/16900;
zeroing out page
WARNING: invalid page header in block 205 of relation base/16413/16900;
zeroing out page
VACUUM
labs=# select count(*) from a.parametertable ;
count
-------
31415
(1 row)

labs=#
-Sreekanth

--
Thanks,

Jorge Torralba
----------------------------

Note: This communication may contain privileged or other confidential
information. If you are not the intended recipient, please do not print,
copy, retransmit, disseminate or otherwise use the information. Please
indicate to the sender that you have received this email in error and
delete the copy you received. Thank You.

--
Regards
Sreekanth

--
Regards
Sreekanth

#4Sreekanth Palluru
sree4pg@gmail.com
In reply to: Sreekanth Palluru (#3)
Re: [ADMIN] ERROR invalid page header in block xxx of relation base/xxxxx/xxxxx/

Looping pgsql-general mail list

I see I am not clear in my question , below are the order of events we see
when we get a invalid page header in block corruption

-Windows server crashed/restarted due to power failure ( we believe) ( I
see that write-cache / write back cache / Disk cache are enabled and we
don't have battery backup )
-Posgres DB crashes (LOG: database system was not properly shut down;
automatic recovery in progress )
- After around 3hrs after crash we see below messages in log

2016-03-03 20:13:18 GMT ERROR: invalid page header in block 204 of
relation base/16413/16900
2016-03-03 20:13:18 GMT CONTEXT: automatic analyze of table "
a.parametertable" => this is a relation not Index

- select count(*) on table gives invalid page header message , where as
select count(*) order by primary key does not give any issue
- After clearing the blocks using zero_damaged_pages , total count of table
rows using the table and order by primary key matches which means there is
no dataloss ( I havent verified each record by record , I assume there is
no dataloss) please correct ??

I would like to know what was the cause of invalid page header and is there
any way I can reproduce this error ?

Appreciate your comments and suggestions on this ?

-Sreekanth

On Thu, Dec 8, 2016 at 12:55 PM, Sreekanth Palluru <sree4pg@gmail.com>
wrote:

Forgot to loop community

Hi Jorge/All,
Thanks for the reply.
As per the documentation , I think backend reads the this page header and
reports that it is damaged.
I am looking at ways we re-create this scenario of creation this blank
corrupted page ? do I have any control over Backend and after it
initializes a new page using function PageInit
<https://doxygen.postgresql.org/bufpage_8c.html#ab871202326b101c6ec24b7f628157c2c&gt;
(Page
<https://doxygen.postgresql.org/bufpage_8h.html#a2010e3258a7075b32ad5750134ab9c5c&gt;
page, Size
<https://doxygen.postgresql.org/c_8h.html#af9ecec2d692138fab9167164a457cbd4&gt;
pageSize, Size
<https://doxygen.postgresql.org/c_8h.html#af9ecec2d692138fab9167164a457cbd4&gt;
specialSize) and I want to halt/crash the backend by stopping postgres
database or through Hardware/system crash ?
Is this possible ?

Also I see that based on below comments from Source code, having such
pages in database is normal . is my understanding correct ?

/*
63 * PageIsVerified
64 * Check that the page header and checksum (if any) appear valid.
65 *
66 * This is called when a page has just been read in from disk. The
idea is
67 * to cheaply detect trashed pages before we go nuts following bogus
item
68 * pointers, testing invalid transaction identifiers, etc.
69 *
70 * It turns out to be necessary to allow zeroed pages here too. Even
though
71 * this routine is *not* called when deliberately adding a page to a
relation,
72 * there are scenarios in which a zeroed page might be found in a
table.
73 * (Example: a backend extends a relation, then crashes before it can
write
74 * any WAL entry about the new page. The kernel will already have the
75 * zeroed page in the file, and it will stay that way after restart.)
So we
76 * allow zeroed pages here, and are careful that the page access macros
77 * treat such a page as empty and without free space. Eventually,
VACUUM
78 * will clean up such a page and make it usable.
79 */
zero_damaged_pages (boolean)

Detection of a damaged page header normally causes PostgreSQL to report
an error, aborting the current transaction. Setting zero_damaged_pages to
on causes the system to instead report a warning, zero out the damaged page
in memory, and continue processing. This behavior will destroy data,
namely all the rows on the damaged page. However, it does allow you to get
past the error and retrieve rows from any undamaged pages that might be
present in the table. It is useful for recovering data if corruption has
occurred due to a hardware or software error. You should generally not set
this on until you have given up hope of recovering data from the damaged
pages of a table. Zeroed-out pages are not forced to disk so it is
recommended to recreate the table or the index before turning this
parameter off again. The default setting is off, and it can only be
changed by a superuser.

On Thu, Dec 8, 2016 at 9:50 AM, Jorge Torralba <jorge.torralba@gmail.com>
wrote:

Look into this setting.

zero_damaged_pages = on;

The docs should explain it.

On Wed, Dec 7, 2016 at 2:47 PM, sreekanth Palluru <sree4pg@gmail.com>
wrote:

Hi ,

I am getting invalid page header error and what I could observe is when
I select the table I get this error , where as if I select table and order
by primary key I can retrieve the rows from table.

And I don't see any dataloss ( based on total number records) after
fixing the blocks using zero_damaged_pages=on and then vacuum full on the
table

Please note I have renamed few tables to avoid giving actual table names

We run Postgres 9.2 version on Windows and Dell Optiplex XE2 server and
we suspect that block corruption would have happen due to
hardware/memory/power failures reasons and I have gone through wikik
https://wiki.postgresql.org/wiki/Reliable_Writes.

I want to understand why we can read the table through where there
exists index and explain plan shows Index scan and with high cost compare
to seq scan .
I assume that since there no rows/data present in these corrupted blocks
index scan skips these blocks and hence it is not throwing the error .

Also , I want to know what would have caused the postgres to create
these corrupted blocks and
can I reproduce this error ? appreciate if you share any pointers to
blogs/mailing lists if this type of issue is already discussed ?

create table a.parametertable_bak as select * from a.parametertable
order by id;

labs=# select count(*) from a.parametertable_bak ;
count
-------
31415
(1 row)

labs=#
labs=# checkpoint;
CHECKPOINT
labs=# set zero_damaged_pages=on;
SET
labs=# vacuum full a.parametertable;
WARNING: invalid page header in block 204 of relation base/16413/16900;
zeroing out page
WARNING: invalid page header in block 205 of relation base/16413/16900;
zeroing out page
VACUUM
labs=# select count(*) from a.parametertable ;
count
-------
31415
(1 row)

labs=#
-Sreekanth

--
Thanks,

Jorge Torralba
----------------------------

Note: This communication may contain privileged or other confidential
information. If you are not the intended recipient, please do not print,
copy, retransmit, disseminate or otherwise use the information. Please
indicate to the sender that you have received this email in error and
delete the copy you received. Thank You.

--
Regards
Sreekanth

--
Regards
Sreekanth

--
Regards
Sreekanth