9.1.1 crash

Started by Mike Blackwell over 14 years ago, 2 messages, general

mike.blackwell@rrd.com

over 14 years ago

The following are the relevant log entries from a recent crash of v9.1.1
running on an older RHEL Linux box. This is the first crash we've
experienced in a lot of years of running Pg. Any assistance in how to
determine what might have caused this is welcome.

2012-02-10 13:55:59 CST [15949]: [37-1] @ LOG: 00000: server process (PID
32670) was terminated by signal 11: Segmentation fault
2012-02-10 13:55:59 CST [15949]: [38-1] @ LOCATION: LogChildExit,
postmaster.c:2881
2012-02-10 13:55:59 CST [15949]: [39-1] @ LOG: 00000: terminating any
other active server processes
2012-02-10 13:55:59 CST [15949]: [40-1] @ LOCATION: HandleChildCrash,
postmaster.c:2695
2012-02-10 13:55:59 CST [15949]: [41-1] @ LOG: 00000: all server processes
terminated; reinitializing
2012-02-10 13:55:59 CST [15949]: [42-1] @ LOCATION:
PostmasterStateMachine, postmaster.c:3116
2012-02-10 13:56:00 CST [3303]: [1-1] @ LOG: 00000: database system was
interrupted; last known up at 2012-02-10 13:54:18 CST
2012-02-10 13:56:00 CST [3303]: [2-1] @ LOCATION: StartupXLOG, xlog.c:6046
2012-02-10 13:56:00 CST [3303]: [3-1] @ LOG: 00000: database system was
not properly shut down; automatic recovery in progress
2012-02-10 13:56:00 CST [3303]: [4-1] @ LOCATION: StartupXLOG, xlog.c:6299
2012-02-10 13:56:00 CST [3303]: [5-1] @ LOG: 00000: consistent recovery
state reached at F/FC9C7588
2012-02-10 13:56:00 CST [3303]: [6-1] @ LOCATION:
CheckRecoveryConsistency, xlog.c:6958
2012-02-10 13:56:00 CST [3303]: [7-1] @ LOG: 00000: redo starts at
F/FC9A5BA8
2012-02-10 13:56:00 CST [3303]: [8-1] @ LOCATION: StartupXLOG, xlog.c:6506
2012-02-10 13:56:00 CST [3303]: [9-1] @ LOG: 00000: record with zero
length at F/FCC716F0
2012-02-10 13:56:00 CST [3303]: [10-1] @ LOCATION: ReadRecord, xlog.c:3829
2012-02-10 13:56:00 CST [3303]: [11-1] @ LOG: 00000: redo done at
F/FCC716B4
2012-02-10 13:56:00 CST [3303]: [12-1] @ LOCATION: StartupXLOG, xlog.c:6621
2012-02-10 13:56:00 CST [3303]: [13-1] @ LOG: 00000: last completed
transaction was at log time 2012-02-10 13:55:59.452228-06
2012-02-10 13:56:00 CST [3303]: [14-1] @ LOCATION: StartupXLOG, xlog.c:6626
2012-02-10 13:56:02 CST [3319]: [1-1] @ LOG: 00000: autovacuum launcher
started
2012-02-10 13:56:02 CST [3319]: [2-1] @ LOCATION: AutoVacLauncherMain,
autovacuum.c:404
2012-02-10 13:56:02 CST [15949]: [43-1] @ LOG: 00000: database system is
ready to accept connections
2012-02-10 13:56:02 CST [15949]: [44-1] @ LOCATION: reaper,
postmaster.c:2435

Laurenz Albe

laurenz.albe@cybertec.at

over 14 years ago

In reply to: Mike Blackwell (#1)

Re: 9.1.1 crash

Mike Blackwell wrote:

The following are the relevant log entries from a recent crash of v9.1.1 running on an older RHEL
Linux box. This is the first crash we've experienced in a lot of years of running Pg. Any assistance
in how to determine what might have caused this is welcome.

2012-02-10 13:55:59 CST [15949]: [37-1] @ LOG: 00000: server process (PID 32670) was terminated by
signal 11: Segmentation fault

[...]

It is difficult to find out anything after the crash if the problem
cannot be reproduced.

If you happen to have changed the core file ulimit setting away from the
default zero, you should have a core file in the data directory which
can be used to create a backtrace which shows you where the server
crashed. And even that only really helps with a debug build.
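As a rough illustration of checking that limit, here is a minimal Python sketch that reads the "Max core file size" row from the text of /proc/<pid>/limits on Linux. The function names are made up for this example, and you would have to supply the postmaster's PID yourself, since it is the server's limit that matters, not the limit of your login shell.

```python
import os


def parse_core_limit(limits_text):
    """Return (soft, hard) from the 'Max core file size' row, or None.

    limits_text is the content of /proc/<pid>/limits (Linux only).
    Values are returned as strings, e.g. "0" or "unlimited".
    """
    label = "Max core file size"
    for line in limits_text.splitlines():
        if line.startswith(label):
            fields = line[len(label):].split()
            return fields[0], fields[1]
    return None


def core_dumps_enabled(soft_limit):
    """A soft limit of "0" means no core file will be written."""
    return soft_limit != "0"


if __name__ == "__main__":
    # /proc/self/limits describes this Python process; for the server,
    # read /proc/<postmaster pid>/limits instead (PID not shown here).
    path = "/proc/self/limits"
    if os.path.exists(path):
        with open(path) as f:
            limit = parse_core_limit(f.read())
        if limit is not None:
            print("soft:", limit[0], "hard:", limit[1])
```

If the soft limit is 0, no core file was written for this crash; raising the limit only helps with the next one.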

Other than that, you could make sure that hard disk and memory have
no problem (you write that it is an older box). You can try to find
out what the server was doing at the time and if you can reproduce it.
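One way to start on that is to pull the crashed backend's PID out of the postmaster's message and then look for earlier log lines written by that backend. The sketch below assumes the line prefix shown in the quoted log (the writing process's PID in square brackets followed by a colon); the function names are hypothetical. Whether the backend logged anything useful depends on your logging settings, such as log_statement or log_min_duration_statement.

```python
import re

# Matches the postmaster's crash report. \s+ is used between words so
# that lines wrapped by a mail client still match.
CRASH_RE = re.compile(
    r"server\s+process\s+\(PID\s+(\d+)\)\s+was\s+terminated\s+by\s+signal\s+(\d+)"
)


def find_crashes(log_text):
    """Return a list of (pid, signal) tuples for each reported crash."""
    return [(int(pid), int(sig)) for pid, sig in CRASH_RE.findall(log_text)]


def lines_from_backend(log_text, pid):
    """Return log lines whose prefix names the given PID, e.g. '[32670]:'."""
    marker = f"[{pid}]:"
    return [line for line in log_text.splitlines() if marker in line]
```

Running find_crashes over the log file and then lines_from_backend with each returned PID shows whatever that backend logged before it died, if anything.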

Crashes are also often caused by nonstandard C functions that have
been loaded into the database.

Yours,
Laurenz Albe