Big Delete Consistently Causes a Crash

Started by Tony Webb almost 17 years ago · 2 messages · general
#1 Tony Webb
amw@sanger.ac.uk

Hi all,

I have an 8.3 cluster running under VM. It seems fine for most activities
(a bit slow but error free) but if the developer issues a delete
statement it consistently kills the database.

If the database is running in archive mode then it kills the archiver,
else it kills the client session. Sample info is below.

The delete is a nasty one, and will potentially delete almost half of
the data in the database(!) but it crashes consistently after about 1
minute. Small deletes work fine.

I can understand it taking ages and giving me various warnings as it is
an unfriendly statement but it shouldn't die.

Any ideas? Is this a bug perhaps? I've tried restarting the cluster with
various memory parameters set, from default to fairly greedy settings
and the behaviour is consistent.

All suggestions gratefully received.

Log entry below:

2009-06-25 03:33:34 BST LOCATION: exec_execute_message, postgres.c:1947
2009-06-25 03:34:26 BST LOG: 00000: server process (PID 8379) was
terminated by signal 9: Killed
2009-06-25 03:34:26 BST LOCATION: LogChildExit, postmaster.c:2529
2009-06-25 03:34:26 BST LOG: 00000: terminating any other active
server processes
2009-06-25 03:34:26 BST LOCATION: HandleChildCrash, postmaster.c:2374
2009-06-25 03:34:26 BST WARNING: 57P02: terminating connection
because of crash of another server process
2009-06-25 03:34:26 BST DETAIL: The postmaster has commanded this
server process to roll back the current transaction and exit,
because another server process exited abnormally and possibly
corrupted shared memory.
2009-06-25 03:34:26 BST HINT: In a moment you should be able to
reconnect to the database and repeat your command.
2009-06-25 03:34:26 BST LOCATION: quickdie, postgres.c:2454
2009-06-25 03:34:26 BST FATAL: 57P03: the database system is in
recovery mode
2009-06-25 03:34:26 BST LOCATION: ProcessStartupPacket,
postmaster.c:1648
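
A log like the one above can be scanned mechanically for backends that the postmaster reports as killed by a signal. The following is only a minimal sketch: the regular expression is an assumption modelled on the single 8.3 log line shown here, and `find_signal_kills` is a hypothetical helper name, not part of PostgreSQL.

```python
import re

# Assumed pattern, based on the "server process (PID ...) was terminated
# by signal ..." line in the log above. \s+ tolerates the mail line wrap.
KILLED = re.compile(
    r"server process \(PID (\d+)\) was\s+terminated by signal (\d+)"
)


def find_signal_kills(log_text):
    """Return (pid, signal) pairs for each backend reported killed by a signal."""
    return [(int(pid), int(sig)) for pid, sig in KILLED.findall(log_text)]


# Two lines copied from the log above, wrap included.
sample = (
    "2009-06-25 03:34:26 BST LOG: 00000: server process (PID 8379) was\n"
    "terminated by signal 9: Killed\n"
)

for pid, sig in find_signal_kills(sample):
    print(f"backend {pid} was terminated by signal {sig}")
```

Signal 9 is SIGKILL, which a process cannot catch or handle, so an entry like this means something outside the backend killed it.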

Cheers

Tony


#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tony Webb (#1)
Re: Big Delete Consistently Causes a Crash

Tony Webb <amw@sanger.ac.uk> writes:

> I have an 8.3 cluster running under VM. It seems fine for most activities
> (a bit slow but error free) but if the developer issues a delete
> statement it consistently kills the database.

> 2009-06-25 03:34:26 BST LOG: 00000: server process (PID 8379) was
> terminated by signal 9: Killed

Something is issuing kill -9 against random Postgres processes.
If you didn't do it yourself, the odds are about 100% that it was
the Linux kernel's "OOM kill" mechanism, which is best disabled
on any server box.
http://www.postgresql.org/docs/8.3/static/kernel-resources.html#AEN22235

regards, tom lane
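
To see whether a Linux box is in the state described above, one can read the kernel's memory overcommit setting. This is a minimal sketch rather than an official procedure: the helper names are hypothetical, the mode descriptions are paraphrased from the Linux kernel documentation, and the `/proc` path exists only on Linux.

```python
from pathlib import Path

# Meanings of the Linux vm.overcommit_memory values (paraphrased).
MODES = {
    0: "heuristic overcommit (kernel default; the OOM killer can fire)",
    1: "always overcommit (the OOM killer can fire)",
    2: "strict accounting (allocations fail instead of overcommitting)",
}


def describe_overcommit(raw):
    """Turn the raw file contents (e.g. '0\\n') into (mode, description)."""
    mode = int(raw.strip())
    return mode, MODES.get(mode, "unknown mode")


def read_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Return (mode, description), or None if the file is absent (non-Linux)."""
    p = Path(path)
    if not p.exists():
        return None
    return describe_overcommit(p.read_text())


if __name__ == "__main__":
    print(read_overcommit())
```

The linked documentation page suggests selecting strict mode with `sysctl -w vm.overcommit_memory=2`. In that mode a backend that asks for too much memory should get an ordinary out-of-memory error rather than being killed.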