after using pg_resetxlog, db lost

Started by zhicheng wang almost 22 years ago | 20 messages | general
#1 zhicheng wang
wang_zc@yahoo.co.uk

Dear all,
After we shut down the rh_postgres-server 7.3.6, rhdb
could not start. We tried

pg_resetxlog -f PGDATA

After that the server could be started, but only the template0 and
template1 databases are available.

Our own database is not listed.

Any help would be appreciated.

thanks

cheng

#2 Richard Huxton
dev@archonet.com
In reply to: zhicheng wang (#1)
Re: after using pg_resetxlog, db lost

zhicheng wang wrote:

dear all
after we shutdown the rh_postgres-server 7.3.6, rhdb
could not start. we tried

pg_resetxlog -f PGDATA

then the server can be started, but only template0 and
template1 db available.

our database not listed.

Was there a crash?
What do your logs say?
How much disk space does your /data/base directory use and is that
enough for your data?
Do you have your backups available?

--
Richard Huxton
Archonet Ltd

#3 zhicheng wang
wang_zc@yahoo.co.uk
In reply to: Richard Huxton (#2)
Re: after using pg_resetxlog, db lost

Dear Richard,
It was not a crash. We issued the poweroff command, then
used a DOS floppy to upgrade the BIOS on the fibre card.
When we rebooted into Red Hat AS3, rhdb
could not start.

The log is attached.

After using pg_resetxlog we cannot see our database; only
template0 and template1 are listed by psql -l.

please help

cheng

=====
Best wishes
Z C Wang

Attachments:

rhdb.log (application/octet-stream)

#4 zhicheng wang
wang_zc@yahoo.co.uk
In reply to: Richard Huxton (#2)
Re: after using pg_resetxlog, db lost

following my last email:

disk only 50% used

cheng

#5 Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: zhicheng wang (#3)
Re: after using pg_resetxlog, db lost

On Tue, Jun 01, 2004 at 01:49:36PM +0100, zhicheng wang wrote:

it was not a crash. we issued poweroff command, then
we used a dos floppy to upgrade bios on the fibrecard.
then when we reboot into the redhat AS3, the rhdb
could not start.

the log is attached.

after using pg_resetxlog, we cannot see our db, only
template0/1 listed by psql -l

Why did you issue the pg_resetxlog command at all? Did the database
refuse to start?

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Java is clearly an example of a money oriented programming" (A. Stepanov)

#6 zhicheng wang
wang_zc@yahoo.co.uk
In reply to: Alvaro Herrera (#5)
Re: after using pg_resetxlog, db lost

hi, thanks

rhdb is the postgresql for redhat ELAS3

it could not start and the error is attached

thanks

cheng

Attachments:

rhdb.log (application/octet-stream)

#7 Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: zhicheng wang (#6)
Re: after using pg_resetxlog, db lost

On Tue, Jun 01, 2004 at 02:43:03PM +0100, zhicheng wang wrote:

hi, thanks

rhdb is the postgresql for redhat ELAS3

it could not start and the error is attached

Your database seems completely busted. Are you running with fsync off?
IDE drives with write caching enabled? NFS or some other weirdness?
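
The fsync question can be answered from the configuration file. The sketch below is only an illustration: the function name fsync_setting is hypothetical, and it assumes the usual "name = value" form in postgresql.conf, with commented-out lines treated as not set.

```shell
# Report the fsync setting found in a postgresql.conf file.
# Prints the configured value, or "not set" when the line is absent
# or commented out (the server default then applies).
fsync_setting() {
    conf=$1
    # Match an uncommented "fsync = value" line, with optional quotes.
    # tail keeps the last match, since a later line overrides earlier ones.
    value=$(sed -n "s/^[[:space:]]*fsync[[:space:]]*=[[:space:]]*'\{0,1\}\([A-Za-z0-9]*\).*/\1/p" "$conf" | tail -n 1)
    echo "${value:-not set}"
}

# Example, using the data directory mentioned in this thread:
# fsync_setting /var/lib/pgsql/data/postgresql.conf
```

This only reads the file; it says nothing about write caching in the drive or in the SAN controller, which has to be checked separately.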

What was your procedure to shut the server down anyway? Any normal
procedure should have terminated the Postgres processes before closing
shop, although failing to do so does not normally corrupt databases.

I assume Redhat did not produce an unreliable Postgres version!

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Postgres is bloatware by design: it was built to house
PhD theses." (Joey Hellerstein, SIGMOD annual conference 2002)

#8 Richard Huxton
dev@archonet.com
In reply to: zhicheng wang (#3)
Re: after using pg_resetxlog, db lost

zhicheng wang wrote:

Dear Richard
it was not a crash. we issued poweroff command, then
we used a dos floppy to upgrade bios on the fibrecard.
then when we reboot into the redhat AS3, the rhdb
could not start.

the log is attached.

Thanks. The first line was:

Jun 1 10:43:55 linux708 postgres[5537]: [30] LOG: database system
shutdown was interrupted at 2004-05-28 16:32:08 BST

This suggests the poweroff closed down your server before PG had
finished shutting down. You probably want to inspect /var/log/messages
at around this time and see if there is anything else of value.

This shouldn't happen, especially since you are using RedHat's version
of the database on their enterprise server - probably worth logging a
bug (unless there was a good reason why PG couldn't shut down in a
reasonable time).

First thing we should do though is halt the database and backup the
/var/lib/pgsql/data/base directory (or wherever PGDATA is). Once we have
a backup we can restart the database and see what is going on.
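
A minimal sketch of that backup step, assuming the database server has already been stopped. The function name backup_pgdata and the archive path in the comment are hypothetical. It archives the whole data directory rather than only base/, so the other cluster files are kept as well.

```shell
# Copy a stopped cluster's data directory into a compressed tar archive.
# Run this only while the server is stopped, so the files are not changing.
backup_pgdata() {
    pgdata=$1       # e.g. /var/lib/pgsql/data
    archive=$2      # e.g. /backup/pgdata-before-repair.tar.gz (hypothetical)
    # -C switches to the parent directory so the archive holds relative paths.
    tar -czf "$archive" -C "$(dirname "$pgdata")" "$(basename "$pgdata")"
}

# Example:
# backup_pgdata /var/lib/pgsql/data /backup/pgdata-before-repair.tar.gz
```

Keeping the archive on a different disk from the data directory is the safer choice, since the disk subsystem itself is under suspicion here.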

after using pg_resetxlog, we cannot see our db, only
template0/1 listed by psql -l

I'm puzzled why this should affect what databases you can see. AFAIK the
pg_resetxlog utility should just affect transactions that were in
progress.

Look in your /var/lib/pgsql/data/base directory (or wherever PGDATA is)
and you should see one directory for each database, the name is the OID
of that database. As the "postgres" user you should be able to run the
"oid2name" utility to display the names of each. Of course, there might
be problems.
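
If oid2name is not usable, the directory names alone already show how many databases exist on disk. A small sketch, assuming the layout described above (one purely numeric directory per database under base/); the function name list_db_oids is hypothetical.

```shell
# List the per-database directories under $PGDATA/base.
# Each directory name is the OID of one database.
list_db_oids() {
    base_dir=$1
    for d in "$base_dir"/*/; do
        [ -d "$d" ] || continue         # the glob matched nothing
        name=$(basename "$d")
        case $name in
            *[!0-9]*) ;;                # skip names that are not purely numeric
            *) echo "$name" ;;
        esac
    done | sort -n
}

# Example:
# list_db_oids /var/lib/pgsql/data/base
```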

Finally, connect to template1 as user postgres and run:
SELECT oid,datname FROM pg_database;
Which will probably list the same databases as oid2name/psql -l.
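
The comparison between what is on disk and what the catalog knows can be sketched as follows. This is an illustration only: the function name missing_from_catalog and the two input files are hypothetical, each file holding one OID per line.

```shell
# Print OIDs that have a directory under base/ but no row in pg_database.
# $1: file with one OID per line, taken from the directory listing
# $2: file with one OID per line, taken from "SELECT oid FROM pg_database"
missing_from_catalog() {
    # -v invert the match, -x whole lines only, -F fixed strings,
    # -f read the patterns from a file
    grep -v -x -F -f "$2" "$1"
}

# Example with made-up file names:
# ls /var/lib/pgsql/data/base > on_disk.txt
# psql -At -c 'SELECT oid FROM pg_database' template1 > in_catalog.txt
# missing_from_catalog on_disk.txt in_catalog.txt
```

Any OID printed belongs to a database whose files still exist but which the server no longer lists.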

If the directories are there, but the databases aren't listed then there
might be a damaged system-table index. To fix this:
1. Make sure your backups are still there.
2. Halt the database server
3. Start a single backend (connected to template0/1) and reindex the
database as described in the REINDEX command reference.

The docs are online and describe the required settings quite well. Once
reindexed, exit the single backend and restart the database. Any better?
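
The steps above can be sketched as a dry run that only prints the commands. This is an assumption-laden illustration, not the documented procedure: reindex_plan is a hypothetical helper, the -O and -P switches and the FORCE keyword are recalled from the REINDEX reference page of the 7.3 series, and on this Red Hat system the server may need to be stopped through the rhdb service script rather than pg_ctl. Check everything against the documentation for the installed version before running any of it.

```shell
# Print (do not run) the commands for a stand-alone-backend reindex.
# Assumed switches: -O allows system table changes, -P ignores system indexes.
reindex_plan() {
    pgdata=$1
    dbname=$2
    echo "pg_ctl -D $pgdata stop"
    echo "postgres -D $pgdata -O -P $dbname"
    echo "REINDEX DATABASE $dbname FORCE;"
    echo "# press Ctrl+D to leave the backend, then:"
    echo "pg_ctl -D $pgdata start"
}

reindex_plan /var/lib/pgsql/data template1
```

Printing the plan first gives a chance to confirm the data directory and database name, and to take the backup, before anything is changed.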

Good luck
--
Richard Huxton
Archonet Ltd

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: zhicheng wang (#3)
Re: after using pg_resetxlog, db lost

zhicheng wang <wang_zc@yahoo.co.uk> writes:

Jun 1 10:43:55 linux708 postgres[5537]: [30] LOG: database system shutdown was interrupted at 2004-05-28 16:32:08 BST
Jun 1 10:43:55 linux708 postgres[5537]: [31] LOG: open of /var/lib/pgsql/data/pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory
Jun 1 10:43:55 linux708 postgres[5537]: [32] LOG: invalid primary checkpoint record
Jun 1 10:43:55 linux708 postgres[5537]: [33] LOG: open of /var/lib/pgsql/data/pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory
Jun 1 10:43:55 linux708 postgres[5537]: [34] LOG: invalid secondary checkpoint record
Jun 1 10:43:55 linux708 postgres[5537]: [35] PANIC: unable to locate a valid checkpoint record

Hm, was this a very new Postgres installation? The links to log file
0/0 suggest that it was so new as to not yet have accumulated 16Mb worth
of WAL traffic ... which is not a lot of traffic.
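
The relation between the file name in the log and "log file 0, segment 0" can be sketched as below. It assumes the naming visible in the log lines above (read here as 8 hex digits of log file id followed by 8 hex digits of segment number) and the 16 MB segment size mentioned in this message; decode_xlog_name is a hypothetical helper.

```shell
# Split a 16-hex-digit WAL segment file name into its two halves and
# report how far into that log file the segment starts.
decode_xlog_name() {
    name=$1
    log_hex=$(printf '%s' "$name" | cut -c1-8)
    seg_hex=$(printf '%s' "$name" | cut -c9-16)
    log_id=$(printf '%d' "0x$log_hex")
    seg_no=$(printf '%d' "0x$seg_hex")
    # Each segment is assumed to be 16 MB.
    echo "log=$log_id seg=$seg_no offset_mb=$((seg_no * 16))"
}

decode_xlog_name 0000000000000000   # prints: log=0 seg=0 offset_mb=0
```

Under these assumptions, a cluster whose control file points at segment 0 has written less than 16 MB of WAL since it was initialised.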

If the links are accurate then what must have happened is that your disk
subsystem lost the physical xlog file.

If the links are not accurate then this suggests corruption of the
pg_control file (i.e., overwriting those fields with zeroes). I find
this idea a bit improbable, though, because the pg_control file has
a CRC64 checksum. It seems very unlikely that corruption of the
pg_control file wouldn't have been noticed and complained of.

In any case, it seems that your upgrade to new disk hardware did not go
as smoothly as you thought. I'd be pretty surprised if the Postgres
files are the only ones that got corrupted --- you'd better look around
and find out what else is broken :-(

regards, tom lane

#10 zhicheng wang
wang_zc@yahoo.co.uk
In reply to: Richard Huxton (#8)
Re: after using pg_resetxlog, db lost

Dear Richard,

You have pointed me in a very good direction.
Under /var/lib/pgsql/data/base there are three directories:

1
16975
4205811

I think that the first two are template0/1 and the
third one is our database.

SELECT oid,datname FROM pg_database;

listed only template0/1, as you predicted.

Can you please help me with more details:

how do I start a single backend (connected to
template0/1) and reindex the database?

thanks

cheng

#11 Richard Huxton
dev@archonet.com
In reply to: zhicheng wang (#10)
Re: after using pg_resetxlog, db lost

zhicheng wang wrote:

Dear Richard

You have pointed me in a very good direction.
Under /var/lib/pgsql/data/base there are three directories:

1
16975
4205811

I think that the first two are template0/1 and the
third one is our database.

SELECT oid,datname FROM pg_database;

listed only template0/1, as you predicted.

Can you please help me with more details:

how do I start a single backend (connected to
template0/1) and reindex the database?

thanks

cheng

Follow the step-by-step instructions in the REINDEX section of the docs.
The manuals are online at http://www.postgresql.org/docs/ and you want
to look in the "SQL Command reference" section.

No guarantee your data is OK though, I can't think why the system index
should be damaged unless you were e.g. creating a new database as you
were shutting down the machine.

--
Richard Huxton
Archonet Ltd

#12 zhicheng wang
wang_zc@yahoo.co.uk
In reply to: Richard Huxton (#11)
Re: after using pg_resetxlog, db lost

Hi Richard Huxton,

Sorry to have bothered you with trivial things.

The reindex gives these errors:

backend> REINDEX DATABASE miamevice;
ERROR:  XLogFlush: request 0/BB4C3560 is not satisfied --- flushed only to 0/20001D8
WARNING:  write error may be permanent: cannot write block 29 for 4205811/1249
backend> \q;
ERROR:  parser: parse error at or near "\" at character 1
backend> q\;
ERROR:  parser: parse error at or near "q" at character 1
backend> LOG:  shutting down
PANIC:  XLogFlush: request 0/BB4C3560 is not satisfied --- flushed only to 0/20001D8
Aborted

Does this mean that we cannot recover our data?

cheng

#13 Richard Huxton
dev@archonet.com
In reply to: zhicheng wang (#12)
Re: after using pg_resetxlog, db lost

zhicheng wang wrote:

Hi Richard Huxton

sorry to have bothered you with trivial things.

the reindex give these error:

backend> REINDEX DATABASE miamevice;
ERROR:  XLogFlush: request 0/BB4C3560 is not satisfied
--- flushed only to 0/20001D8
WARNING:  write error may be permanent: cannot write
block 29 for 4205811/1249
backend> \q;
ERROR:  parser: parse error at or near "\" at
character 1
backend> q\;
ERROR:  parser: parse error at or near "q" at
character 1
backend> LOG:  shutting down
PANIC:  XLogFlush: request 0/BB4C3560 is not satisfied
--- flushed only to 0/20001D8
Aborted

does this mean that we cannot recover our data?

Well the problems with "\q" are because you need to press CTRL+D to end
the session. The inability to write is something I've not seen before,
so I've cc'd Tom Lane on this.

It doesn't look good though.
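
The two positions in the XLogFlush message can be compared directly. The sketch below is one reading of the numbers, not a diagnosis: it assumes each position is "log file id/byte offset" in hexadecimal and that segments are 16 MB, and the helper names xlog_cmp and xlog_segment are hypothetical.

```shell
# Compare two WAL positions written as "hi/lo" in hexadecimal.
# Prints "behind", "equal" or "ahead" for the first relative to the second.
xlog_cmp() {
    a_hi=$(printf '%d' "0x${1%/*}"); a_lo=$(printf '%d' "0x${1#*/}")
    b_hi=$(printf '%d' "0x${2%/*}"); b_lo=$(printf '%d' "0x${2#*/}")
    if [ "$a_hi" -lt "$b_hi" ] || { [ "$a_hi" -eq "$b_hi" ] && [ "$a_lo" -lt "$b_lo" ]; }; then
        echo behind
    elif [ "$a_hi" -eq "$b_hi" ] && [ "$a_lo" -eq "$b_lo" ]; then
        echo equal
    else
        echo ahead
    fi
}

# Segment number (assumed 16 MB each) that contains a position's offset.
xlog_segment() {
    lo=$(printf '%d' "0x${1#*/}")
    echo $((lo / (16 * 1024 * 1024)))
}

xlog_cmp 0/20001D8 0/BB4C3560    # prints: behind
xlog_segment 0/BB4C3560          # prints: 187
```

If this reading is right, the flushed position is far behind the position being requested, and the requested position lies in segment 187 of log file 0, which may bear on the question of whether the installation could still have been using segment 0.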

--
Richard Huxton
Archonet Ltd

#14 zhicheng wang
wang_zc@yahoo.co.uk
In reply to: Richard Huxton (#13)
Re: after using pg_resetxlog, db lost

thanks

I can now connect to my database (miamevice),

but nothing can be listed. The error is:

bash-2.05b$ psql template1
ERROR: Index pg_statistic_relid_att_index is not a btree
Welcome to psql 7.3.6-RH, the PostgreSQL interactive terminal.

bash-2.05b$ psql miamevice
Welcome to psql 7.3.6-RH, the PostgreSQL interactive terminal.

Any indications of what is wrong?

#15 zhicheng wang
wang_zc@yahoo.co.uk
In reply to: Richard Huxton (#13)
Re: after using pg_resetxlog, db lost

Hi Richard Huxton

below is the rhdb part of the shutdown log

any indications for the failed restart?

thanks
cheng

May 28 15:43:37 sanlinux rhdb: Stopping PostgreSQL - Red Hat Edition service:
May 28 15:43:37 sanlinux su(pam_unix)[12400]: session opened for user postgres by (uid=0)
May 28 15:43:40 sanlinux su(pam_unix)[12400]: session closed for user postgres
May 28 15:43:40 sanlinux rhdb: ^[[60G[
May 28 15:43:40 sanlinux rhdb:
May 28 15:43:40 sanlinux rc: Stopping rhdb: succeeded

#16 Richard Huxton
dev@archonet.com
In reply to: zhicheng wang (#15)
Re: after using pg_resetxlog, db lost

zhicheng wang wrote:

Hi Richard Huxton

below is the rhdb part of the shutdown log

any indications for the failed restart?

thanks
cheng

May 28 15:43:37 sanlinux rhdb: Stopping PostgreSQL -
Red Hat Edition service:
May 28 15:43:37 sanlinux su(pam_unix)[12400]: session
opened for user postgres by (uid=0)
May 28 15:43:40 sanlinux su(pam_unix)[12400]: session
closed for user postgres
May 28 15:43:40 sanlinux rhdb: ^[[60G[
May 28 15:43:40 sanlinux rhdb:
May 28 15:43:40 sanlinux rc: Stopping rhdb: succeeded

Not here - what do the postgresql logs show?

--
Richard Huxton
Archonet Ltd

#17 zhicheng wang
wang_zc@yahoo.co.uk
In reply to: Richard Huxton (#16)
Re: after using pg_resetxlog, db lost

Hi,

/var/log/pgsql is empty.
But in this message log, what is "rhdb: ^[[60G["?
I have not seen this before.

thanks
cheng

#18 zhicheng wang
wang_zc@yahoo.co.uk
In reply to: Tom Lane (#9)
Re: after using pg_resetxlog, db lost

Sorry for the late reply.

In case it is useful to anyone: the db server uses
a SAN to store the data. The update was only to the BIOS
of the fibre card. If that had gone wrong, many other files
should also have gone wrong, which is not the case.

cheng

#19 Tom Lane
tgl@sss.pgh.pa.us
In reply to: zhicheng wang (#18)
Re: after using pg_resetxlog, db lost

zhicheng wang <wang_zc@yahoo.co.uk> writes:

in case it is useful to any one. the db server uses
san to store the data. the update is only to the bios
of the fibre card. if this is wrong, many other files
should also go wrong, which is not the case.

What you should be looking at is files that were written just before
the shutdown. AFAICS the symptoms you've reported can only be explained
by assuming that the disk failed to record quite a number of writes that
were issued by Postgres just before shutdown, and it did not respect the
write/fsync order in deciding which writes it did record. This is
unfortunately fairly common behavior in IDE disks with write caching
enabled...

BTW, you never answered my question about how much data the installation
had (ie, whether it could still really be using xlog segment 0).

regards, tom lane

#20 zhicheng wang
wang_zc@yahoo.co.uk
In reply to: Tom Lane (#19)
Re: after using pg_resetxlog, db lost

It may not be that bad: thanks to Martijn van
Oosterhout's tool, we could recover all the tables apart
from three whose type could not be recognised.

This was a test db, but I have learnt a lot and now
have a feel for how serious it would be if the real thing
broke.

Thanks to everyone, and I hope that no one else is caught in
this situation.

cheng

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:

what is the indication?

I think you're out of luck :-(. Judging from your messages over the
past few days, your disk drive committed multiple major corruptions
of the data it was entrusted with. I hope you have a reasonably
recent backup to go back to.

regards, tom lane
