7.1 startup recovery failure

Started by Hiroshi Inoue over 24 years ago, 8 messages
#1Hiroshi Inoue
Inoue@tpf.co.jp

Hi,
There's a report of startup recovery failure in Japan.
Redo done but ...
Unfortunately I have no time today.

regards,
Hiroshi Inoue

KAMI wrote:

DEBUG: database system shutdown was interrupted at 2001-04-26 22:15:00 JST
DEBUG: CheckPoint record at (1, 3923829232)
DEBUG: Redo record at (1, 3923829232); Undo record at (0, 0); Shutdown TRUE
DEBUG: NextTransactionId: 7473265; NextOid: 2550911
DEBUG: database system was not properly shut down; automatic recovery in progress...
DEBUG: redo starts at (1, 3923829296)
DEBUG: ReadRecord: record with zero len at (1, 3923880136)
DEBUG: redo done at (1, 3923880100)
FATAL 2: XLogFlush: request is not satisfied
postmaster: Startup proc 4228 exited with status 512 - abort

#2Vadim Mikheev
vmikheev@sectorbase.com
In reply to: Hiroshi Inoue (#1)
Re: 7.1 startup recovery failure

There's a report of startup recovery failure in Japan.
Redo done but ...
Unfortunately I have no time today.

Please ask to start up with wal_debug = 1...

Vadim

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hiroshi Inoue (#1)
Re: 7.1 startup recovery failure

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

There's a report of startup recovery failure in Japan.

DEBUG: redo done at (1, 3923880100)
FATAL 2: XLogFlush: request is not satisfied
postmaster: Startup proc 4228 exited with status 512 - abort

Is this person using 7.1 release, or a beta/RC version? That looks
just like the last WAL bug Vadim fixed before final ...

regards, tom lane

#4Mikheev, Vadim
vmikheev@SECTORBASE.COM
In reply to: Tom Lane (#3)
RE: 7.1 startup recovery failure

There's a report of startup recovery failure in Japan.

DEBUG: redo done at (1, 3923880100)
FATAL 2: XLogFlush: request is not satisfied
postmaster: Startup proc 4228 exited with status 512 - abort

Is this person using 7.1 release, or a beta/RC version? That looks
just like the last WAL bug Vadim fixed before final ...

No, it doesn't. That bug was related to cases where there is no room
on the last log page for the startup checkpoint. ~5k is free in this case.

Vadim
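
[Editor's note] The "~5k is free" figure can be checked against the positions in the posted log. The following is a minimal sketch, not something from the thread: it assumes the second number in each "(logid, offset)" pair is a byte offset and that log pages are 8192 bytes (the default block size; a custom build could differ). The names XLOG_PAGE_SIZE and page_free_bytes are made up for illustration.

```python
# Assumption: the second number in "(logid, offset)" is a byte offset,
# and log pages are 8192 bytes (default block size; may differ per build).
XLOG_PAGE_SIZE = 8192


def page_free_bytes(offset: int, page_size: int = XLOG_PAGE_SIZE) -> int:
    """Bytes from `offset` to the end of the page that contains it."""
    return page_size - (offset % page_size)


end_of_log = 3923880136  # "ReadRecord: record with zero len at (1, 3923880136)"
last_redo = 3923880100   # "redo done at (1, 3923880100)"

used = end_of_log % XLOG_PAGE_SIZE
free = page_free_bytes(end_of_log)

print(f"offset within page: {used}")
print(f"free on last page:  {free} bytes (~{free / 1024:.1f} KiB)")
print(f"last redo record to end of log: {end_of_log - last_redo} bytes")
```

Under those assumptions the end of the log sits 2248 bytes into its page, leaving 5944 bytes free, which is in line with the "~5k" above.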

#5Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Mikheev, Vadim (#4)
Re: 7.1 startup recovery failure

"Mikheev, Vadim" wrote:

There's a report of startup recovery failure in Japan.

DEBUG: redo done at (1, 3923880100)
FATAL 2: XLogFlush: request is not satisfied
postmaster: Startup proc 4228 exited with status 512 - abort

Is this person using 7.1 release, or a beta/RC version? That looks
just like the last WAL bug Vadim fixed before final ...

No, it doesn't. That bug was related to cases where there is no room
on the last log page for the startup checkpoint. ~5k is free in this case.

I haven't gotten any reply from him yet.
Many people are on vacation now in Japan.
We probably can't expect much from him for now.

regards,
Hiroshi Inoue

#6Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Hiroshi Inoue (#1)
Re: 7.1 startup recovery failure

Vadim Mikheev wrote:

There's a report of startup recovery failure in Japan.
Redo done but ...
Unfortunately I have no time today.

Please ask to start up with wal_debug = 1...

Isn't it very difficult for DBAs to leave the
corrupted database as it is?
ISTM we could hardly expect to get the log with
wal_debug = 1 unless we automatically force the
log in case of recovery failures.

regards,
Hiroshi Inoue

#7Rod Taylor
rbt@barchord.com
In reply to: Hiroshi Inoue (#1)
Re: 7.1 startup recovery failure

Corrupted or not, after a crash take a snapshot of the data tree
before firing it back up again. Doesn't take that much time
(especially with a netapp filer) and it allows for a virtually
unlimited number of attempts to solve the trouble or debug.

--
Rod Taylor
BarChord Entertainment Inc.
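
[Editor's note] The advice above (copy the data tree before restarting) could be scripted roughly as follows. This is only a sketch under stated assumptions: the postmaster is stopped while the copy runs, there is enough disk space, and a plain file-level copy is acceptable. The function name snapshot_data_dir and the paths in the usage line are hypothetical.

```python
import shutil
import time
from pathlib import Path


def snapshot_data_dir(data_dir: str, backup_root: str) -> Path:
    """Copy the whole data tree into a new timestamped directory.

    Assumption: the postmaster is not running, so no file changes
    underneath the copy. Returns the path of the snapshot.
    """
    src = Path(data_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"data directory not found: {src}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-crash-{stamp}"
    # copytree raises if dest already exists, so an earlier snapshot
    # is never silently overwritten.
    shutil.copytree(src, dest, symlinks=True)
    return dest
```

For example, snapshot_data_dir("/usr/local/pgsql/data", "/backup") would leave a copy such as /backup/data-crash-<timestamp> to experiment on, while the original tree stays untouched for further debugging.
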
----- Original Message -----
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Vadim Mikheev" <vmikheev@sectorbase.com>
Cc: "pgsql-hackers" <pgsql-hackers@postgresql.org>
Sent: Monday, April 30, 2001 11:02 PM
Subject: Re: [HACKERS] 7.1 startup recovery failure

Vadim Mikheev wrote:

There's a report of startup recovery failure in Japan.
Redo done but ...
Unfortunately I have no time today.

Please ask to start up with wal_debug = 1...

Isn't it very difficult for DBAs to leave the
corrupted database as it is?
ISTM we could hardly expect to get the log with
wal_debug = 1 unless we automatically force the
log in case of recovery failures.

regards,
Hiroshi Inoue

#8Alfred Perlstein
bright@wintelcom.net
In reply to: Rod Taylor (#7)
Re: 7.1 startup recovery failure

* Rod Taylor <rbt@barchord.com> [010430 22:10] wrote:

Corrupted or not, after a crash take a snapshot of the data tree
before firing it back up again. Doesn't take that much time
(especially with a netapp filer) and it allows for a virtually
unlimited number of attempts to solve the trouble or debug.

You run your database over NFS? They must be made of steel. :)

--
-Alfred Perlstein - [alfred@freebsd.org]
Daemon News Magazine in your snail-mail! http://magazine.daemonnews.org/