Backend died abnormally - postgresql 7.2.1-5

Started by Rick Eicher II over 23 years ago | 5 messages | general
#1 Rick Eicher II
rick@pbol.net

Hello all,

I am happy to report that this is the first time I have had a moment of
trouble with postgresql.

I have upgraded to 7.2.1-5 from version 7.1.3-2 (Red Hat RPMs).

I did a pg_dumpall on the old version, installed the new version and
then restored the databases.

The problem seems to be that the backend is dying, which produces this
error in the Apache logs.

###############################################################
NOTICE: Message from PostgreSQL backend:
The Postmaster has informed me that some other backend
died abnormally and possibly corrupted shared memory.
I have rolled back the current transaction and am
going to terminate your database system connection and exit.
Please reconnect to the database system and repeat your query.
DBD::Pg::db disconnect failed: rollback failed at
/usr/local/lib/perl5/site_perl/5.6.1//FS/UID.pm line 68.
NOTICE: Message from PostgreSQL backend:
The Postmaster has informed me that some other backend
died abnormally and possibly corrupted shared memory.
I have rolled back the current transaction and am
going to terminate your database system connection and exit.
Please reconnect to the database system and repeat your query.
DBD::Pg::db commit failed: begin failed at
/usr/local/lib/perl5/site_perl/5.6.1//FS/Record.pm line 270.

#################################################################

I have searched the archives and docs and see some talk that a lack of
memory and/or swap space might be the cause. I see no indication of this
in the output of the 'free' command, taken right after the errors occurred.

################################################################
[root@nemisis httpd]# free
             total       used       free     shared    buffers     cached
Mem:        512440     504104       8336       1240     130028     303896
-/+ buffers/cache:      70180     442260
Swap:        80284        312      79972

################################################################

1. Does this system need more memory?

2. What should be my next step in finding this problem?

Thank you for your time,
Rick Eicher II

#2 Neil Conway
neilc@samurai.com
In reply to: Rick Eicher II (#1)
Re: Backend died abnormally - postgresql 7.2.1-5

On Tue, Jul 16, 2002 at 09:22:47AM -0500, Rick Eicher II wrote:

NOTICE: Message from PostgreSQL backend:
The Postmaster has informed me that some other backend
died abnormally and possibly corrupted shared memory.
I have rolled back the current transaction and am
going to terminate your database system connection and exit.
Please reconnect to the database system and repeat your query.

[root@nemisis httpd]# free
             total       used       free     shared    buffers     cached
Mem:        512440     504104       8336       1240     130028     303896
-/+ buffers/cache:      70180     442260
Swap:        80284        312      79972

1. Does this system need more memory?

Doesn't look like it. In general, it might be wise to use a bit more
swap, but that doesn't appear to be causing the problem.
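
As a rough sketch of the arithmetic behind that reading of the `free`
output, using only the numbers quoted above (values in kB; the variable
names are illustrative): most of the "used" memory is buffers and page
cache, which the kernel can reclaim when applications need it.

```python
# Values (in kB) copied from the `free` output quoted above.
total, used, free = 512440, 504104, 8336
buffers, cached = 130028, 303896

# Memory held by applications once reclaimable buffers/cache are excluded.
used_by_apps = used - buffers - cached

# Memory the kernel could hand out: truly free plus reclaimable buffers/cache.
effectively_free = free + buffers + cached

print(used_by_apps)      # 70180, matches the "-/+ buffers/cache" used column
print(effectively_free)  # 442260, matches the "-/+ buffers/cache" free column
print(round(100 * effectively_free / total, 1), "% of RAM effectively free")
```

Both results match the "-/+ buffers/cache" row, and swap is almost
untouched (312 of 80284 kB used), which is consistent with memory
pressure not being the cause.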

2. What should be my next step in finding this problem?

Is the crash reproducible, and if so, can you post the query or
situation that causes the crash to occur? (you can enable query
logging with debug_print_query in postgresql.conf)

Is there a core file in one of your database directories -- and if
so, can you get a backtrace from it using gdb? It might also be
useful to get a backtrace from a debugging build (--enable-debug).

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC

#3 Neil Conway
neilc@samurai.com
In reply to: Neil Conway (#2)
Re: Backend died abnormally - postgresql 7.2.1-5

[Cc'ed to -general so that others can help]

On Tue, Jul 16, 2002 at 09:51:48AM -0500, Rick Eicher II wrote:

Is the crash reproducible, and if so, can you post the query or
situation that causes the crash to occur? (you can enable query
logging with debug_print_query in postgresql.conf)

The crash is reproducible. Some examples of a query would be:

Select * from cust_main where last='smith';
Select * from cust_main;

Are there any additional errors in the logs?

With an error that fundamental, I'd suspect hardware problems, namely
bad RAM. Would it be possible to run memtest86 on the machine?

I do have some join queries, but I seem to get this error with any
query. If I issue the same query four times, I will get the error one
time. I have uncommented this line (and others) in postgresql.conf but
do not get any log entries after restart.

Is there a core file in one of your database directories -- and if
so, can you get a backtrace from it using gdb? It might also be
useful to get a backtrace from a debugging build (--enable-debug).

No core file.

Are you sure your system is set up to allow core dumps -- i.e.
does "ulimit -c" produce "unlimited"?

Also, make sure you're looking in the right place for
core files ($PGDATA/base/$oid_of_db/core)
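
A minimal sketch of those two checks in Python, assuming a Unix-like
system. The helper names are hypothetical, and the core-file name
patterns ("core" or "core.<something>") are an assumption about how the
kernel names dumps. Note that `resource.getrlimit` reports the limit in
bytes for the process running the script, whereas the limit that matters
is the one inherited by the postmaster when it was started.

```python
import os
import resource


def core_limit_description():
    """Describe the soft core-file size limit of the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    if soft == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{soft} bytes"


def find_core_files(pgdata):
    """Return files under <pgdata>/base named 'core' or 'core.*'."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(os.path.join(pgdata, "base")):
        for name in filenames:
            # Assumed naming: plain "core" or "core.<suffix>".
            if name == "core" or name.startswith("core."):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    print("core limit for this process:", core_limit_description())
    pgdata = os.environ.get("PGDATA")
    if pgdata:
        for path in find_core_files(pgdata):
            print(path)
```

Run as the same user and from the same environment that starts the
postmaster, so that the reported limit and $PGDATA are the relevant ones.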

Should I get the source and build it instead of using rpms?

Might be a good bet -- at the least, it should produce a more
helpful backtrace.

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC

#4 Rick Eicher II
rick@pbol.net
In reply to: Neil Conway (#3)
Re: Backend died abnormally - postgresql 7.2.1-5

On Tue, Jul 16, 2002 at 09:51:48AM -0500, Rick Eicher II wrote:

Is the crash reproducible, and if so, can you post the query or
situation that causes the crash to occur? (you can enable query
logging with debug_print_query in postgresql.conf)

The crash is reproducible. Some examples of a query would be:

Select * from cust_main where last='smith';
Select * from cust_main;

Are there any additional errors in the logs?

With an error that fundamental, I'd suspect hardware problems, namely
bad RAM. Would it be possible to run memtest86 on the machine?

I do have some join queries, but I seem to get this error with any
query. If I issue the same query four times, I will get the error one
time. I have uncommented this line (and others) in postgresql.conf but
do not get any log entries after restart.

Is there a core file in one of your database directories -- and if
so, can you get a backtrace from it using gdb? It might also be
useful to get a backtrace from a debugging build (--enable-debug).

No core file.

Are you sure your system is set up to allow core dumps -- i.e.
does "ulimit -c" produce "unlimited"?

[root@nemisis root]# ulimit -c
1000000

Also, make sure you're looking in the right place for
core files ($PGDATA/base/$oid_of_db/core)

I was not looking in the right place before. But still no core files
found.

Should I get the source and build it instead of using rpms?

Trying to decide what is the best plan of attack since this is a
production machine. But I think I will run the memtest86 on it first.

Might be a good bet -- at the least, it should produce a more
helpful backtrace.

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rick Eicher II (#4)
Re: Backend died abnormally - postgresql 7.2.1-5

There should be more information in the postmaster log than you've shown
us, too. The messages you reported are all from backends *other* than
the one that actually crashed. At the very least the log should have
the postmaster's report of an unexpected child death, with a signal
code.

Also, if you can reproducibly provoke the error, try attaching to the
backend process with gdb before you do so. gdb should be able to give a
backtrace from the crash point even if no core file results.

regards, tom lane
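
To illustrate the point about the postmaster reporting a child death
with a signal code, here is a small sketch of how a parent process on a
Unix-like system learns which signal killed its child. It is not
PostgreSQL code: the crashing child is a hypothetical stand-in for a
backend, and the report wording is invented for the example, not the
postmaster's actual log text.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable report line."""
    if returncode < 0:
        # subprocess reports death-by-signal as the negated signal number.
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"child was terminated by signal {signum} ({name})"
    return f"child exited with status {returncode}"


if __name__ == "__main__":
    # Hypothetical stand-in for a crashing backend: the child sends
    # SIGSEGV to itself, and the parent inspects how it died.
    crash = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    result = subprocess.run([sys.executable, "-c", crash])
    print(describe_exit(result.returncode))
```

The signal number is the useful part: a segmentation fault, an abort,
and a kill from outside each point toward a different kind of cause,
which is why the postmaster's own log entry is worth finding.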