Any way to bring up a PG instance with corrupted data in it?

Started by Keaton Adams almost 17 years ago | 4 messages | general
#1 Keaton Adams
kadams@mxlogic.com

This is a QA system and unfortunately there is no recent backup.... So as a last resort I am looking for any way to bring up Postgres when it has corrupt data in it:

FATAL: could not remove old lock file "postmaster.pid": Read-only file system
HINT: The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again.
Jun 8 06:43:16 mxlqa401 postgres[21401]: [1-1] FATAL: could not remove old lock file "postmaster.pid": Read-only file system
Jun 8 06:43:16 mxlqa401 postgres[21401]: [1-2] HINT: The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again.
FATAL: could not remove old lock file "postmaster.pid": Read-only file system
HINT: The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again.
Jun 8 06:43:29 mxlqa401 postgres[21476]: [1-1] FATAL: could not remove old lock file "postmaster.pid": Read-only file system
Jun 8 06:43:29 mxlqa401 postgres[21476]: [1-2] HINT: The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again.
Jun 8 06:44:23 mxlqa401 postgres[21520]: [1-1] LOG: database system was interrupted at 2009-06-05 21:52:54 MDT
Jun 8 06:44:24 mxlqa401 postgres[21520]: [2-1] LOG: checkpoint record is at 134/682530F0
Jun 8 06:44:24 mxlqa401 postgres[21520]: [3-1] LOG: redo record is at 134/682530F0; undo record is at 0/0; shutdown FALSE
Jun 8 06:44:24 mxlqa401 postgres[21520]: [4-1] LOG: next transaction ID: 3005778382; next OID: 103111004
Jun 8 06:44:24 mxlqa401 postgres[21520]: [5-1] LOG: next MultiXactId: 93647; next MultiXactOffset: 190825
Jun 8 06:44:24 mxlqa401 postgres[21520]: [6-1] LOG: database system was not properly shut down; automatic recovery in progress
Jun 8 06:44:24 mxlqa401 postgres[21520]: [7-1] LOG: redo starts at 134/68253134
Jun 8 06:44:24 mxlqa401 postgres[21520]: [8-1] PANIC: could not access status of transaction 3005778383
Jun 8 06:44:24 mxlqa401 postgres[21520]: [8-2] DETAIL: could not read from file "pg_clog/0B32" at offset 139264: Success
Jun 8 06:44:29 mxlqa401 postgres[21518]: [1-1] LOG: startup process (PID 21520) was terminated by signal 6
Jun 8 06:44:29 mxlqa401 postgres[21518]: [2-1] LOG: aborting startup due to startup process failure
Jun 8 06:44:36 mxlqa401 postgres[21574]: [1-1] LOG: database system was interrupted while in recovery at 2009-06-08 06:44:24 MDT
Jun 8 06:44:36 mxlqa401 postgres[21574]: [1-2] HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
Jun 8 06:44:36 mxlqa401 postgres[21574]: [2-1] LOG: checkpoint record is at 134/682530F0
Jun 8 06:44:36 mxlqa401 postgres[21574]: [3-1] LOG: redo record is at 134/682530F0; undo record is at 0/0; shutdown FALSE
Jun 8 06:44:36 mxlqa401 postgres[21574]: [4-1] LOG: next transaction ID: 3005778382; next OID: 103111004
Jun 8 06:44:36 mxlqa401 postgres[21574]: [5-1] LOG: next MultiXactId: 93647; next MultiXactOffset: 190825
Jun 8 06:44:36 mxlqa401 postgres[21574]: [6-1] LOG: database system was not properly shut down; automatic recovery in progress

I tried to bring up a postgres backend process to get into the database in single-user mode and that won't work either:

bash-3.2$ postgres -D /mxl/var/pgsql/data
PANIC: could not access status of transaction 3005778382
DETAIL: could not read from file "pg_clog/0B32" at offset 139264: Success
Aborted

bash-3.2$ postgres -D /mxl/var/pgsql/data -d 5 postgres
PANIC: could not access status of transaction 3005778382
DETAIL: could not read from file "pg_clog/0B32" at offset 139264: Success
Aborted
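
For anyone reading the PANIC above and wondering why it names "pg_clog/0B32" at offset 139264: the following is a minimal sketch of how a transaction ID maps to a clog segment file and page offset. It assumes a default build (8192-byte blocks, 2 status bits per transaction, 32 pages per segment file); the function name `clog_location` is made up for illustration and is not part of PostgreSQL. The trailing "Success" in the DETAIL line most likely means the read returned fewer bytes than a full page without setting an error code, i.e. the file is shorter than the offset being read.

```python
# Sketch only: constants assume a default PostgreSQL build.
BLCKSZ = 8192                 # bytes per clog page
XACTS_PER_BYTE = 4            # 2 status bits per transaction
PAGES_PER_SEGMENT = 32        # pages per pg_clog segment file

XACTS_PER_PAGE = BLCKSZ * XACTS_PER_BYTE                 # 32768
XACTS_PER_SEGMENT = XACTS_PER_PAGE * PAGES_PER_SEGMENT   # 1048576


def clog_location(xid):
    """Return (segment file name, byte offset of the page) holding xid's status."""
    segment = xid // XACTS_PER_SEGMENT
    page_in_segment = (xid % XACTS_PER_SEGMENT) // XACTS_PER_PAGE
    return "%04X" % segment, page_in_segment * BLCKSZ


# Transaction ID taken from the PANIC message in this thread.
name, offset = clog_location(3005778382)
print(name, offset)  # 0B32 139264
```

The result matches the file name and offset in the log, so the server is failing on the clog page that should hold the status of the next transaction ID.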

Any suggestions other than the obvious (restore from backup) would be appreciated.

Thanks,

Keaton

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Keaton Adams (#1)
Re: Any way to bring up a PG instance with corrupted data in it?

Keaton Adams <kadams@mxlogic.com> writes:

This is a QA system and unfortunately there is no recent backup.... So as a last resort I am looking for any way to bring up Postgres when it has corrupt data in it:

pg_resetxlog?

regards, tom lane

#3 Boszormenyi Zoltan
zb@cybertec.at
In reply to: Keaton Adams (#1)
Re: Any way to bring up a PG instance with corrupted data in it?

Hi,

Keaton Adams wrote:

This is a QA system and unfortunately there is no recent backup.... So
as a last resort I am looking for any way to bring up Postgres when it
has corrupt data in it:

FATAL: could not remove old lock file "postmaster.pid": Read-only file
system
HINT: The file seems accidentally left over, but it could not be
removed. Please remove the file by hand and try again.

The message above should give you a clue.
Repair the file system first and remount read-write.
Then try again to bring up the postmaster.
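
Before remounting, it can help to confirm which mount the data directory lives on and whether it is currently mounted read-only. A rough sketch, assuming a Linux host where /proc/mounts is available; the helper name `mount_options_for` and the sample device names are hypothetical, and only the data directory path comes from this thread:

```python
def mount_options_for(path, mounts_text):
    """Return (mount_point, options) for the longest mount point containing path."""
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, options)
    return best


# Hypothetical /proc/mounts content, for illustration only.
sample = (
    "/dev/sda1 / ext3 rw 0 0\n"
    "/dev/sdb1 /mxl ext3 ro,errors=remount-ro 0 0\n"
)
mount_point, options = mount_options_for("/mxl/var/pgsql/data", sample)
print(mount_point, "ro" in options)  # /mxl True

# On a real system the text would come from the kernel instead:
#   with open("/proc/mounts") as f:
#       mounts_text = f.read()
```

If the mount shows "ro", the file system needs to be checked and remounted read-write before the postmaster can remove its lock file.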

I tried to bring up a postgres backend process to get into the
database in single-user mode and that won't work either:

bash-3.2$ postgres -D /mxl/var/pgsql/data
PANIC: could not access status of transaction 3005778382
DETAIL: could not read from file "pg_clog/0B32" at offset 139264: Success
Aborted

bash-3.2$ postgres -D /mxl/var/pgsql/data -d 5 postgres
PANIC: could not access status of transaction 3005778382
DETAIL: could not read from file "pg_clog/0B32" at offset 139264: Success
Aborted

Any suggestions other than the obvious (restore from backup) would be
appreciated.

Thanks,

Keaton

--
Bible has answers for everything. Proof:
"But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil." (Matthew 5:37) - basics of digital technology.
"May your kingdom come" - superficial description of plate tectonics

----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/

#4 Keaton Adams
kadams@mxlogic.com
In reply to: Tom Lane (#2)
Re: Any way to bring up a PG instance with corrupted data in it?

I had to calculate out the next transaction ID and -f (force) the change, but once I did this the DB came back up. So, thanks for the info, and now I know how this works.
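
The message does not say how the next transaction ID was calculated. One approach, described in the pg_resetxlog documentation, is to take the numerically largest file name in pg_clog (the names are hexadecimal), add one, and multiply by 1048576. A minimal sketch of that rule follows; the function name `safe_next_xid` and the directory listing are hypothetical, and only the file name 0B32 actually appears in this thread:

```python
XACTS_PER_CLOG_SEGMENT = 1048576  # transactions covered by one pg_clog file


def safe_next_xid(clog_file_names):
    """Apply the documented rule: (largest hex file name + 1) * 1048576."""
    largest = max(int(name, 16) for name in clog_file_names)
    return (largest + 1) * XACTS_PER_CLOG_SEGMENT


# Hypothetical listing of pg_clog; assumes 0B32 is the largest file present.
names = ["0B30", "0B31", "0B32"]
xid = safe_next_xid(names)
print(xid)  # 3006267392
```

Under that assumption the value would be passed along the lines of `pg_resetxlog -f -x 3006267392 /mxl/var/pgsql/data`. This is an illustration of the option syntax, not the command that was actually run here.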

The plan now is to dump the databases and reload them to ensure overall database integrity.

Thanks for the reply,

Keaton
