corrupted tuple (header?), pg_filedump output

Started by Eric Paruselabout 21 years ago5 messageshackers
Jump to latest
#1Eric Parusel
lists@globalrelay.net

I've tracked down a row that is failing:

maia=# select id FROM table WHERE id = 1401765;
ERROR: could not access status of transaction 1634148473
DETAIL: could not open file "/data1/pgsql/data/pg_clog/0616": No such
file or directory

db=# vacuum maia_mail;
WARNING: relation "table" TID 28393/2: OID is invalid
ERROR: could not access status of transaction 1634148473
DETAIL: could not open file "/data1/pgsql/data/pg_clog/0616": No such
file or directory

I found the following post on pgsql-hackers and followed some of Tom
Lane's suggestions:
http://tinyurl.com/5bjf9

I ran pg_filedump, and here's a snippet:
(I would assume that TID 28393/2 means that "item" 2 has the problem,
correct?

./pg_filedump -i -f -R 28393 /data1/pgsql/data/base/17760/18004
--snip--
Item 2 -- Length: 1728 Offset: 6464 (0x1940) Flags: USED
XMIN: 12 CMIN: 196608 XMAX: 122552335 CMAX|XVAC: 177664675
Block Id: 0 linp Index: 47241 Attributes: 3692 Size: 0
infomask: 0x0900 (XMIN_COMMITTED|XMAX_INVALID)
Error: Computed header length not equal to header size.
Computed <28> Header: <0>
--snip--

I assume this is where my problem lies.

What do I do now? Can this be repaired?
This tuple is not needed, I can "get rid of it" or somehow dereference
it if need be.

How does an inconsistency like this typically arise?

Thanks very much,
Eric

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Eric Parusel (#1)
Re: corrupted tuple (header?), pg_filedump output

Eric Parusel <lists@globalrelay.net> writes:

./pg_filedump -i -f -R 28393 /data1/pgsql/data/base/17760/18004
--snip--
Item 2 -- Length: 1728 Offset: 6464 (0x1940) Flags: USED
XMIN: 12 CMIN: 196608 XMAX: 122552335 CMAX|XVAC: 177664675
Block Id: 0 linp Index: 47241 Attributes: 3692 Size: 0
infomask: 0x0900 (XMIN_COMMITTED|XMAX_INVALID)
Error: Computed header length not equal to header size.
Computed <28> Header: <0>
--snip--

None of those fields seem very sensible. I suspect what's actually
corrupt is the ItemId in the page header --- ie, it's pointing at
something that's not a tuple header. You might eyeball the ItemId
array and see if any pattern of corruption is visible.

(Oh btw: are you certain you have the right version of pg_filedump?
If it's older than your postmaster it might be giving you wrong
answers.)

What do I do now? Can this be repaired?
This tuple is not needed, I can "get rid of it" or somehow dereference
it if need be.

If you can do without the whole page, the easiest way is to zero out the
entire page with 'dd'. If you just want to kill that one tuple you can
reset the LP_USED bit in its ItemId. Either way the postmaster needs to
be stopped to be sure that it's not got the page buffered while you
change it.

regards, tom lane

#3Eric Parusel
lists@globalrelay.net
In reply to: Tom Lane (#2)
Re: corrupted tuple (header?), pg_filedump output

Tom Lane wrote:

Eric Parusel <lists@globalrelay.net> writes:

./pg_filedump -i -f -R 28393 /data1/pgsql/data/base/17760/18004
--snip--
Item 2 -- Length: 1728 Offset: 6464 (0x1940) Flags: USED
XMIN: 12 CMIN: 196608 XMAX: 122552335 CMAX|XVAC: 177664675
Block Id: 0 linp Index: 47241 Attributes: 3692 Size: 0
infomask: 0x0900 (XMIN_COMMITTED|XMAX_INVALID)
Error: Computed header length not equal to header size.
Computed <28> Header: <0>
--snip--

None of those fields seem very sensible. I suspect what's actually
corrupt is the ItemId in the page header --- ie, it's pointing at
something that's not a tuple header. You might eyeball the ItemId
array and see if any pattern of corruption is visible.

Ok, the other items in the dump seemed sensible to me.

Is this what you're speaking of?

Block 28393 ********************************************************
<Header> -----
Block Offset: 0x0ddd2000 Offsets: Lower 56 (0x0038)
Block: Size 8192 Version 2 Upper 1064 (0x0428)
LSN: logid 242 recoff 0x9387bd78 Special 8192 (0x2000)
Items: 9 Free Space: 1008
Length (including item array): 60

0000: f2000000 78bd8793 01000000 38002804 ....x.......8.(.
0010: 00200220 808cb80d 4099800d a485b80d . . ....@.......
0020: ec847001 28848401 9c984801 b8060000 ..p.(.....H.....
0030: 5c93800a 28040000 00000000 \...(.......

Or, this?:

<Data> ------
Item 1 -- Length: 1756 Offset: 3200 (0x0c80) Flags: USED
Item 2 -- Length: 1728 Offset: 6464 (0x1940) Flags: USED
Item 3 -- Length: 1756 Offset: 1444 (0x05a4) Flags: USED
Item 4 -- Length: 184 Offset: 1260 (0x04ec) Flags: USED
Item 5 -- Length: 194 Offset: 1064 (0x0428) Flags: USED
Item 6 -- Length: 164 Offset: 6300 (0x189c) Flags: USED
Item 7 -- Length: 0 Offset: 1720 (0x06b8) Flags: 0x00
Item 8 -- Length: 1344 Offset: 4956 (0x135c) Flags: USED
Item 9 -- Length: 0 Offset: 1064 (0x0428) Flags: 0x00

(Oh btw: are you certain you have the right version of pg_filedump?
If it's older than your postmaster it might be giving you wrong
answers.)

Yup, I do believe so, I've got the PGDG 8.0.1 rpms installed and I just
downloaded the PGDG 8.0.1 contrib srpm, and compiled pg_filedump
(downloaded from
http://sources.redhat.com/rhdb/tools/pg_filedump-4.0.tar)...

If you can do without the whole page, the easiest way is to zero out the
entire page with 'dd'. If you just want to kill that one tuple you can
reset the LP_USED bit in its ItemId. Either way the postmaster needs to
be stopped to be sure that it's not got the page buffered while you
change it.

Ok, how do I go about resetting the LP_USED bit? I'll make sure the
postmaster stopped for such an operation...

*******
*******
Oh crap, I just noticed that there is more than just that item that's
inconsistent...

Item 6 -- Length: 164 Offset: 6300 (0x189c) Flags: USED
XMIN: 1852400896 CMIN: 1970348133 XMAX: 543424609 CMAX|XVAC:
1869112179
Block Id: 1819244288 linp Index: 29295 Attributes: 2667 Size: 101
infomask: 0x7270
(HASOID|XMIN_INVALID|MARKED_FOR_UPDATE|UPDATED|MOVED_OFF)
Error: Computed header length not equal to header size.
Computed <32> Header: <101>

Item 8 -- Length: 1344 Offset: 4956 (0x135c) Flags: USED
XMIN: 2100285 CMIN: 1644759593 XMAX: 1634148473 CMAX|XVAC: 2003137536
Block Id: 2036411953 linp Index: 103 Attributes: 28524 Size: 108
infomask: 0x6162 (HASVARWIDTH|XMIN_COMMITTED|UPDATED|MOVED_OFF)
Error: Computed header length not equal to header size.
Computed <28> Header: <108>
*******
*******

Thanks again,
Eric

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Eric Parusel (#3)
Re: corrupted tuple (header?), pg_filedump output

Eric Parusel <lists@globalrelay.net> writes:

Is this what you're speaking of?

<Data> ------
Item 1 -- Length: 1756 Offset: 3200 (0x0c80) Flags: USED
Item 2 -- Length: 1728 Offset: 6464 (0x1940) Flags: USED
Item 3 -- Length: 1756 Offset: 1444 (0x05a4) Flags: USED
Item 4 -- Length: 184 Offset: 1260 (0x04ec) Flags: USED
Item 5 -- Length: 194 Offset: 1064 (0x0428) Flags: USED
Item 6 -- Length: 164 Offset: 6300 (0x189c) Flags: USED
Item 7 -- Length: 0 Offset: 1720 (0x06b8) Flags: 0x00
Item 8 -- Length: 1344 Offset: 4956 (0x135c) Flags: USED
Item 9 -- Length: 0 Offset: 1064 (0x0428) Flags: 0x00

Well, scratch that theory: those offsets and sizes all seem perfectly
consistent (eg, one ends where the next begins, modulo some alignment
padding in one case). So it's hard to credit that the itemid got
bombed; the corruption has to be in the tuple header itself. What
does the hex dump of the tuple header and vicinity look like?

Oh crap, I just noticed that there is more than just that item that's
inconsistent...

Item 6 -- Length: 164 Offset: 6300 (0x189c) Flags: USED
XMIN: 1852400896 CMIN: 1970348133 XMAX: 543424609 CMAX|XVAC:
1869112179
Block Id: 1819244288 linp Index: 29295 Attributes: 2667 Size: 101
infomask: 0x7270
(HASOID|XMIN_INVALID|MARKED_FOR_UPDATE|UPDATED|MOVED_OFF)
Error: Computed header length not equal to header size.
Computed <32> Header: <101>

Item 8 -- Length: 1344 Offset: 4956 (0x135c) Flags: USED
XMIN: 2100285 CMIN: 1644759593 XMAX: 1634148473 CMAX|XVAC: 2003137536
Block Id: 2036411953 linp Index: 103 Attributes: 28524 Size: 108
infomask: 0x6162 (HASVARWIDTH|XMIN_COMMITTED|UPDATED|MOVED_OFF)
Error: Computed header length not equal to header size.
Computed <28> Header: <108>

Those look kinda clobbered too. Maybe you should just send along the
whole -i -f page dump?

regards, tom lane

#5Eric Parusel
lists@globalrelay.net
In reply to: Eric Parusel (#1)
Re: corrupted tuple (header?), pg_filedump output

I've brought this back on-list, probably best that way..?

Eric Parusel wrote:

Tom Lane wrote:

What it kinda looks like from here is that you suffered a "page tear":
the itemid pointers at the front of the page may be self-consistent, but
they don't quite match the state of the rest of the page. For instance
the claimed item-2 header is obviously bogus but it looks like there is
a valid header starting a few bytes after where the itemid points.
I suspect that the itemid pointers are one generation earlier or later
than the remainder of the page. Since disks typically write in 512-byte
sectors and there is nothing else in the first 512 bytes except the
itemids, we could imagine that that sector got written and then the rest
of the page did not. Postgres is supposed to protect against this sort
of thing in case of a system crash, but I wouldn't want to swear that
the protections are completely bulletproof. Have you had any power
failures or system crashes lately? What sort of hardware and OS is this
on?

Hmm...
Here is some system information:

Dell PE1750, 2GB ECC ram, 2x73GB 10K scsi attached to Perc4/di
(raid-on-motherboard, LSI megaraid chipset, battery-backed cache,
write-back cache enabled), firmware/drivers is up to date as of a month
ago.

The OS is RHEL3, kept up to date with the newest kernel for it.

PgSQL 8.0.1 installed from RPMs on postgresql.org, it had 8.0.0
installed from DGPG RPMs initially until 8.0.1 came out.

No power failures or crashes since it's been up...

It's been up and running with moderate to heavy load for about 2 months
now.

I don't think there have been any pgsql backend (if that's the word for
them) processes crashing or anything of that sort...

Pretty heavy write load on the box, it will be getting a 14 disk raid10
array plugged into it soon to speed things up.

I can't remember and I couldn't find it, but is there a consistency
checking tool (pg_fsck or something?) for pgsql? Or I suppose a dump of
the whole database (which I do nightly) ensures all the data is readable...

If there's anything else I can do to help figure this out, let me know..

Thanks,
Eric

How would I go about double checking I don't have this problem on other
pages? As above, a successful db dump would verify everything's fine?
I suppose a dump and reload after that point would verify that my
indexes and anything else in base/ is fine?

How would I figure out where and how much to overwrite with dd if I was
to clear this page? Or how would I set the invalid item's itemid to empty?

Obviously, stuff like this tends not to be in the documentation :D

Thanks for the help,
Eric