7.1 startup recovery failure

Started by Hiroshi Inoue, about 25 years ago, 8 messages, hackers

Hiroshi Inoue

Inoue@tpf.co.jp

about 25 years ago

Hi,
There's a report of startup recovery failure in Japan.
Redo done but ...
Unfortunately I have no time today.

regards,
Hiroshi Inoue

KAMI wrote:

DEBUG: database system shutdown was interrupted at 2001-04-26 22:15:00 JST
DEBUG: CheckPoint record at (1, 3923829232)
DEBUG: Redo record at (1, 3923829232); Undo record at (0, 0); Shutdown TRUE
DEBUG: NextTransactionId: 7473265; NextOid: 2550911
DEBUG: database system was not properly shut down; automatic recovery in progress...
DEBUG: redo starts at (1, 3923829296)
DEBUG: ReadRecord: record with zero len at (1, 3923880136)
DEBUG: redo done at (1, 3923880100)
FATAL 2: XLogFlush: request is not satisfied
postmaster: Startup proc 4228 exited with status 512 - abort

Mikheev, Vadim

vmikheev@SECTORBASE.COM

about 25 years ago

In reply to: Hiroshi Inoue (#1)

Re: 7.1 startup recovery failure

There's a report of startup recovery failure in Japan.
Redo done but ...
Unfortunately I have no time today.

Please ask to start up with wal_debug = 1...

Vadim

Tom Lane

tgl@sss.pgh.pa.us

about 25 years ago

In reply to: Hiroshi Inoue (#1)

Re: 7.1 startup recovery failure

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

There's a report of startup recovery failure in Japan.

DEBUG: redo done at (1, 3923880100)
FATAL 2: XLogFlush: request is not satisfied
postmaster: Startup proc 4228 exited with status 512 - abort

Is this person using 7.1 release, or a beta/RC version? That looks
just like the last WAL bug Vadim fixed before final ...

regards, tom lane

Mikheev, Vadim

vmikheev@SECTORBASE.COM

about 25 years ago

In reply to: Tom Lane (#3)

RE: 7.1 startup recovery failure

There's a report of startup recovery failure in Japan.

DEBUG: redo done at (1, 3923880100)
FATAL 2: XLogFlush: request is not satisfied
postmaster: Startup proc 4228 exited with status 512 - abort

Is this person using 7.1 release, or a beta/RC version? That looks
just like the last WAL bug Vadim fixed before final ...

No, it doesn't. That bug was related to cases when there is no room
on last log page for startup checkpoint. ~5k is free in this case.

Vadim
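
Vadim's "~5k is free" figure can be checked against the positions in the original log. The sketch below is only an illustration: it assumes WAL pages are 8192 bytes (the default BLCKSZ; the thread never states the page size) and that the second number in each (log id, offset) pair is a byte offset. The function name is made up for this example.

```python
# Assumption: WAL pages are 8192 bytes (default BLCKSZ); not stated in the thread.
XLOG_PAGE_SIZE = 8192


def free_bytes_on_page(offset, page_size=XLOG_PAGE_SIZE):
    """Return how many bytes remain on the WAL page that contains `offset`."""
    used = offset % page_size  # bytes already consumed on this page
    return page_size - used


# Offset of the zero-length record from the log, i.e. where valid WAL ends.
end_of_log = 3923880136

print("offset within page:", end_of_log % XLOG_PAGE_SIZE)
print("free bytes on page:", free_bytes_on_page(end_of_log))
```

Under that assumption the log ends 2248 bytes into its page, leaving 5944 bytes, which matches the "~5k" estimate.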

Hiroshi Inoue

Inoue@tpf.co.jp

about 25 years ago

In reply to: Mikheev, Vadim (#4)

Re: 7.1 startup recovery failure

"Mikheev, Vadim" wrote:

There's a report of startup recovery failure in Japan.

DEBUG: redo done at (1, 3923880100)
FATAL 2: XLogFlush: request is not satisfied
postmaster: Startup proc 4228 exited with status 512 - abort

Is this person using 7.1 release, or a beta/RC version? That looks
just like the last WAL bug Vadim fixed before final ...

No, it doesn't. That bug was related to cases when there is no room
on last log page for startup checkpoint. ~5k is free in this case.

I haven't gotten any reply from him yet.
Many people are on vacation now in Japan,
so we probably can't expect much from him for now.

regards,
Hiroshi Inoue

Hiroshi Inoue

Inoue@tpf.co.jp

about 25 years ago

In reply to: Hiroshi Inoue (#1)

Re: 7.1 startup recovery failure

Vadim Mikheev wrote:

There's a report of startup recovery failure in Japan.
Redo done but ...
Unfortunately I have no time today.

Please ask to start up with wal_debug = 1...

Isn't it very difficult for DBAs to leave a
corrupted database as it is?
ISTM we can hardly expect to get a log with
wal_debug = 1 unless we automatically force that
logging when recovery fails.

regards,
Hiroshi Inoue

Rod Taylor

rbt@rbt.ca

about 25 years ago

In reply to: Hiroshi Inoue (#1)

Re: 7.1 startup recovery failure

Corrupted or not, after a crash take a snapshot of the data tree
before firing it back up again. Doesn't take that much time
(especially with a netapp filer) and it allows for a virtually
unlimited number of attempts to solve the trouble or debug.

--
Rod Taylor
BarChord Entertainment Inc.
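
The snapshot step described above could be scripted roughly as follows. This is a minimal sketch, not a tested recovery procedure: it assumes the postmaster is stopped, that a plain file-level copy is acceptable, and that there is enough disk space. The function name and the timestamped directory layout are invented for illustration.

```python
import shutil
import time
from pathlib import Path


def snapshot_data_dir(data_dir, snapshot_root):
    """Copy data_dir into a new timestamped directory under snapshot_root.

    Assumption: the postmaster is NOT running while the copy is made.
    """
    src = Path(data_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"no such data directory: {src}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(snapshot_root) / f"{src.name}-{stamp}"
    # copytree refuses to overwrite an existing destination, so an earlier
    # snapshot is never clobbered; symlinks are copied as symlinks.
    shutil.copytree(src, dest, symlinks=True)
    return dest
```

Each recovery or debugging attempt can then be run against a fresh copy of the snapshot, leaving the crashed state itself untouched.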
----- Original Message -----
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Vadim Mikheev" <vmikheev@sectorbase.com>
Cc: "pgsql-hackers" <pgsql-hackers@postgresql.org>
Sent: Monday, April 30, 2001 11:02 PM
Subject: Re: [HACKERS] 7.1 startup recovery failure

Vadim Mikheev wrote:

There's a report of startup recovery failure in Japan.
Redo done but ...
Unfortunately I have no time today.

Please ask to start up with wal_debug = 1...

Isn't it very difficult for DBAs to leave a
corrupted database as it is?
ISTM we can hardly expect to get a log with
wal_debug = 1 unless we automatically force that
logging when recovery fails.

regards,
Hiroshi Inoue

Alfred Perlstein

bright@wintelcom.net

about 25 years ago

In reply to: Rod Taylor (#7)

Re: 7.1 startup recovery failure

* Rod Taylor <rbt@barchord.com> [010430 22:10] wrote:

Corrupted or not, after a crash take a snapshot of the data tree
before firing it back up again. Doesn't take that much time
(especially with a netapp filer) and it allows for a virtually
unlimited number of attempts to solve the trouble or debug.

You run your database over NFS? They must be made of steel. :)

--
-Alfred Perlstein - [alfred@freebsd.org]
Daemon News Magazine in your snail-mail! http://magazine.daemonnews.org/