db corruption/recovery help

Started by Ed L. in June 2005 · 9 messages · general
#1 Ed L.
pgsql@bluepolka.net

Someone flipped a breaker switch, and evidently triggered
corruption in one of our major clusters:

$ cat server_log.Mon
postmaster successfully started
2005-06-06 14:31:11.950 [20124] LOG: database system was interrupted being in recovery at 2005-06-06 14:29:01 EDT
This probably means that some data blocks are corrupted
and you will have to use the last backup for recovery.
2005-06-06 14:31:11.950 [20124] LOG: checkpoint record is at EF/EBB7AFC8
2005-06-06 14:31:11.950 [20124] LOG: redo record is at EF/EBA91EF0; undo record is at 0/0; shutdown FALSE
2005-06-06 14:31:11.950 [20124] LOG: next transaction id: 577477594; next oid: 89750885
2005-06-06 14:31:11.951 [20124] LOG: database system was not properly shut down; automatic recovery in progress
2005-06-06 14:31:11.952 [20124] LOG: redo starts at EF/EBA91EF0
2005-06-06 14:31:11.984 [20124] PANIC: Invalid page header in block 22376 of 79189398
2005-06-06 14:31:12.275 [20121] LOG: startup process (pid 20124) was terminated by signal 6
2005-06-06 14:31:12.275 [20121] LOG: aborting startup due to startup process failure

We have backups from 10 hours earlier, but the obvious
question: what, if anything, can I do now to salvage those
10 hours of data from this?

I guess I could zero the block? I'm a little uncertain since
I don't know what I'm zeroing (pg_database? pg_class?), and
can't start up to see what that relfilenode maps to...

Going to look at it with pg_filedump, maybe oid2filename or
whatever that utility is...
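
(I think the contrib utility is actually oid2name.) If I do end up
zeroing it, the plan is roughly the sketch below; the database name is a
placeholder, NNNN stands for the database's directory under base/, and
the dd invocation assumes a dd that supports conv=notrunc:

$ # once/if the cluster is up, map the filenode from the PANIC to a table
$ psql somedb -c "SELECT relname FROM pg_class WHERE relfilenode = 79189398;"
$ # keep a copy of the damaged file, then zero just the bad 8k page in place
$ cp $PGDATA/base/NNNN/79189398 /scratch/79189398.save
$ dd if=/dev/zero of=$PGDATA/base/NNNN/79189398 bs=8192 seek=22376 count=1 conv=notrunc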

Thanks,
Ed

#2 Ed L.
pgsql@bluepolka.net
In reply to: Ed L. (#1)
Re: db corruption/recovery help

On Monday June 6 2005 2:16 pm, Ed L. wrote:

Someone flipped a breaker switch, and evidently triggered
corruption in one of our major clusters:

BTW, this is a 7.3.4 cluster ...

Ed

#3 Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Ed L. (#1)
Re: db corruption/recovery help

On Mon, 2005-06-06 at 15:16, Ed L. wrote:

Someone flipped a breaker switch, and evidently triggered
corruption in one of our major clusters:

$ cat server_log.Mon
postmaster successfully started
2005-06-06 14:31:11.950 [20124] LOG: database system was interrupted being in recovery at 2005-06-06 14:29:01 EDT
This probably means that some data blocks are corrupted
and you will have to use the last backup for recovery.
2005-06-06 14:31:11.950 [20124] LOG: checkpoint record is at EF/EBB7AFC8
2005-06-06 14:31:11.950 [20124] LOG: redo record is at EF/EBA91EF0; undo record is at 0/0; shutdown FALSE
2005-06-06 14:31:11.950 [20124] LOG: next transaction id: 577477594; next oid: 89750885
2005-06-06 14:31:11.951 [20124] LOG: database system was not properly shut down; automatic recovery in progress
2005-06-06 14:31:11.952 [20124] LOG: redo starts at EF/EBA91EF0
2005-06-06 14:31:11.984 [20124] PANIC: Invalid page header in block 22376 of 79189398
2005-06-06 14:31:12.275 [20121] LOG: startup process (pid 20124) was terminated by signal 6
2005-06-06 14:31:12.275 [20121] LOG: aborting startup due to startup process failure

We have backups from 10 hours earlier, but the obvious
question: what, if anything, can I do now to salvage those
10 hours of data from this?

I guess I could zero the block? I'm a little uncertain since
I don't know what I'm zeroing (pg_database? pg_class?), and
can't start up to see what that relfilenode maps to...

Going to look at it with pg_filedump, maybe oid2filename or
whatever that utility is...

OK, if PostgreSQL is running on hardware that does NOT lie about
fsyncing, and it is set to fsync, this should NEVER happen.

The fact that it happened means either A: the hardware is lying or broken
(like default IDE drives or a RAID controller with a non-battery-backed
cache), or B: the database was set not to fsync. Since the default IS to
fsync, turning it off would have required deliberate effort on your part.
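
Takes ten seconds to check, by the way (database name is a placeholder):

$ psql yourdb -c 'SHOW fsync;'
$ # or just look at what the config file says
$ grep fsync $PGDATA/postgresql.conf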

Now, once the database comes up like this, your data may or may not be
salvageable, and may or may not be coherent. I.e., you may have orphaned
records where you shouldn't, or duplicate entries in some unique index,
or something like that.

Look for pg_resetxlog. There's a man page for it on my machine, but I'm
not sure about 7.3.x.

Before doing anything, make a file system level backup of all your data
so you have a pristine set to play about with.
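
Something along these lines, say; treat it as a sketch (paths are
placeholders, and verify your 7.3 install actually has pg_resetxlog and
its -n dry-run switch before leaning on it):

$ # with the postmaster stopped, snapshot the whole cluster directory
$ tar cf /backup/pgdata-pristine.tar $PGDATA
$ # then look at what pg_resetxlog thinks, without writing anything
$ pg_resetxlog -n $PGDATA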

#4 Ed L.
pgsql@bluepolka.net
In reply to: Scott Marlowe (#3)
Re: db corruption/recovery help

On Monday June 6 2005 3:17 pm, Scott Marlowe wrote:

On Mon, 2005-06-06 at 15:16, Ed L. wrote:

Someone flipped a breaker switch, and evidently triggered
corruption in one of our major clusters:

OK, if PostgreSQL is running on hardware that does NOT lie
about fsyncing, and it is set to fsync, this should NEVER
happen.

This is 7.3.4 running on an HP-UX 11.00 9000/800 PA-RISC box with
fsync = TRUE, built with gcc 3.2.2. Database is entirely on a
SAN.

We got very lucky: the corrupted database was expendable
(essentially a log database). I was able to just move the
data/base/NNNN directory off to the side, restart, drop the
corrupted db, and recreate schema...
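
For the archives, it was roughly the following (database name and schema
file are placeholders; NNNN is the corrupted database's oid directory):

$ # postmaster was already down; park the damaged database's directory
$ mv $PGDATA/base/NNNN /scratch/NNNN.corrupt
$ pg_ctl -D $PGDATA start
$ dropdb logdb
$ createdb logdb
$ psql logdb -f logdb_schema.sql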

Thanks,
Ed

#5 Ed L.
pgsql@bluepolka.net
In reply to: Ed L. (#4)
Re: db corruption/recovery help

On Monday June 6 2005 3:29 pm, Ed L. wrote:

On Monday June 6 2005 3:17 pm, Scott Marlowe wrote:

On Mon, 2005-06-06 at 15:16, Ed L. wrote:

Someone flipped a breaker switch, and evidently triggered
corruption in one of our major clusters:

OK, if PostgreSQL is running on hardware that does NOT lie
about fsyncing, and it is set to fsync, this should NEVER
happen.

This is 7.3.4 running on an HP-UX 11.00 9000/800 PA-RISC box
with fsync = TRUE, built with gcc 3.2.2. Database is entirely
on a SAN.

We got very lucky: the corrupted database was expendable
(essentially a log database). I was able to just move the
data/base/NNNN directory off to the side, restart, drop the
corrupted db, and recreate schema...

The SAN never lost power, only the system itself. I'd really
like to chase this to the root if possible. Ideas?

Ed

#6 Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Ed L. (#5)
Re: db corruption/recovery help

On Mon, 2005-06-06 at 16:39, Ed L. wrote:

On Monday June 6 2005 3:29 pm, Ed L. wrote:

On Monday June 6 2005 3:17 pm, Scott Marlowe wrote:

On Mon, 2005-06-06 at 15:16, Ed L. wrote:

Someone flipped a breaker switch, and evidently triggered
corruption in one of our major clusters:

OK, if PostgreSQL is running on hardware that does NOT lie
about fsyncing, and it is set to fsync, this should NEVER
happen.

This is 7.3.4 running on an HP-UX 11.00 9000/800 PA-RISC box
with fsync = TRUE, built with gcc 3.2.2. Database is entirely
on a SAN.

We got very lucky: the corrupted database was expendable
(essentially a log database). I was able to just move the
data/base/NNNN directory off to the side, restart, drop the
corrupted db, and recreate schema...

The SAN never lost power, only the system itself. I'd really
like to chase this to the root if possible. Ideas?

It sounds like somewhere between PostgreSQL and the SAN connection out
the back, something is lying about fsync. I'm not that familiar with
lots of different SAN setups, so you might want to describe how things
are set up and see if anyone else knows more about them than I do.
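
One crude smoke test is to time a burst of single-block synchronous
writes through the whole stack; a lone spindle that "completes" thousands
of fsyncs a second is caching somewhere. Sketch only, and it assumes a dd
that understands conv=fsync (GNU dd does; HP-UX's native dd may not):

$ # 100 8k writes, each one forced to disk; on a bare spindle this takes
$ # on the order of a second, while milliseconds means caching somewhere
$ # (a battery-backed cache can be fast AND safe, so interpret with care)
$ time sh -c 'i=0; while [ $i -lt 100 ]; do dd if=/dev/zero of=/san/fsync_test bs=8192 count=1 conv=notrunc,fsync 2>/dev/null; i=`expr $i + 1`; done'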

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed L. (#1)
Re: db corruption/recovery help

"Ed L." <pgsql@bluepolka.net> writes:

Someone flipped a breaker switch, and evidently triggered
corruption in one of our major clusters:

2005-06-06 14:31:11.984 [20124] PANIC: Invalid page header in block 22376 of 79189398

It's possible that you are one minor release short of having dodged this
problem, as I see in the 7.3.5 CVS log:

2003-12-01 11:53 tgl

* src/backend/storage/buffer/bufmgr.c (REL7_3_STABLE): Force
zero_damaged_pages to be effectively ON during recovery from WAL,
since there is no need to worry about damaged pages when we are
going to overwrite them anyway from the WAL. Per recent
discussion.

This doesn't really address the question of how the page header got
clobbered in the first place, though. Did you by any chance make a dump
to see what data was in there?
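
If you still have the unmodified files and want to poke at it, you could
presumably get the same effect as that patch by flipping the knob by
hand; untested sketch, assuming your minor release has the
zero_damaged_pages GUC at all and that the startup process honors it:

$ echo "zero_damaged_pages = true" >> $PGDATA/postgresql.conf
$ pg_ctl -D $PGDATA start
$ # remove the setting again once recovery completes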

regards, tom lane

#8 Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#7)
Re: db corruption/recovery help

On Monday June 6 2005 11:15 pm, Tom Lane wrote:

It's possible that you are one minor release short of having
dodged this problem, as I see in the 7.3.5 CVS log

* src/backend/storage/buffer/bufmgr.c (REL7_3_STABLE): Force
zero_damaged_pages to be effectively ON during recovery from
WAL, since there is no need to worry about damaged pages when
we are going to overwrite them anyway from the WAL. Per
recent discussion.

I remember all too well.

This doesn't really address the question of how the page
header got clobbered in the first place, though. Did you by
any chance make a dump to see what data was in there?

Well, I do have a copy of the corrupted database/file, if that's
what you mean. I rebuilt 7.3.4 source (which I don't normally
delete) so I could build pg_filedump 2.0, but then pg_filedump
wouldn't build. Not sure why; I didn't have the luxury of
digging deeper. Maybe I can find a pg_filedump lying around
somewhere here, or sort out the build issue...

Ed

#9 Ed L.
pgsql@bluepolka.net
In reply to: Tom Lane (#7)
Re: db corruption/recovery help

On Monday June 6 2005 11:15 pm, Tom Lane wrote:

This doesn't really address the question of how the page
header got clobbered in the first place, though.  Did you by
any chance make a dump to see what data was in there?

I couldn't start the postmaster at all with that data in the
cluster, so no dump. But I do have the corrupted file. Haven't
gotten free yet to set up pg_filedump on it.
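
When I do get to it, the plan is roughly this (flag spellings from
memory, so treat it as a sketch; -R takes a block range, -i and -f ask
for interpreted and hex output):

$ pg_filedump -i -f -R 22376 22376 /scratch/base/NNNN/79189398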

The hardware setup is an HP RP5470 4-way running HP-UX 11.00, with dual
HP Tachyon XL2 Fibre Channel Mass Storage Adapters going into
dual Cisco 9509 SAN switches to dual EVA 5000 disk arrays,
CPQswsp A.3.0B.01F.00F Patch upgrade for Sanworks Secure Path
Device Driver and utilities. Not sure that reveals much of the
problem...

Ed