Total crash of my db-server

Started by Henrik Steffen, over 23 years ago, 30 messages, general
#1Henrik Steffen
steffen@city-map.de

Hello all,

Sometimes I experience a total crash of my
db-server, e.g. while doing automated maintenance tasks:

At 2:30 am every night the webserver is shut
down, so there won't be any concurrent access to the
db-server. Then the script runs a
VACUUM FULL

This is what happened tonight while fully vacuuming:

server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

Then, the script selects all user tables and starts
reindexing them. Tonight, reindexing of the first table
started and seconds later the whole server crashed.

No ping, nothing else possible....

This is the list of recent crashes:
Tonight 02:42 am
Yesterday night 02:39 am
Tuesday at 10:34 am
Last saturday at 10:44 am
Last Tuesday at 02:19 am
The saturday before at 04:01 am
The thursday before at 04:02 am
the tuesday before at 02:25 am

Always complete crashes... only reset helped.

Most crashes occur during maintenance tasks.
However, there are some other crashes, too.

There are never any hints in /var/log/messages
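
Since nothing shows up in /var/log/messages, one option is to have the maintenance script keep its own progress log and force every line to disk before the next statement starts, so that after a hard reset the last START line shows what was running. The following is only a sketch: `run_maintenance`, `quote_ident`, the `execute` callable (for example a database driver's statement-execution function, assumed to run each statement outside a transaction block) and the log path are all hypothetical names, not part of the script described above.

```python
import os
import time


def quote_ident(name):
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def run_maintenance(tables, execute, log_path):
    """Run VACUUM FULL, then REINDEX each table, logging every step durably.

    `execute` is any callable that takes one SQL string and runs it.
    """
    statements = ["VACUUM FULL"] + [
        f"REINDEX TABLE {quote_ident(t)}" for t in tables
    ]
    with open(log_path, "a") as log:

        def note(event, sql):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{stamp} {event} {sql}\n")
            log.flush()
            # Ask the OS to push the line to disk so it has a chance of
            # surviving a hard reset.
            os.fsync(log.fileno())

        for sql in statements:
            note("START", sql)
            execute(sql)
            note("DONE", sql)
    return statements
```

A START line without a matching DONE line then marks the statement that was in progress when the machine went down.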

I upgraded to postgresql 7.3 recently, but it doesn't
seem to help either.

I am almost desperate.

We are running some mysql-servers here, too, and more
and more often I find myself thinking about moving my whole
system to a mysql-server... my colleagues have NEVER
had such trouble with their mysql-servers....

Do you have any hints for me? What can I do? My last
choice would be to move to mysql, but I am almost
desperate....

thanks for your help

--

Kind regards,

Henrik Steffen
Managing Director

top concepts Internetmarketing GmbH
Am Steinkamp 7 - D-21684 Stade - Germany
--------------------------------------------------------
http://www.topconcepts.com Tel. +49 4141 991230
mail: steffen@topconcepts.com Fax. +49 4141 991233
--------------------------------------------------------
24h-Support Hotline: +49 1908 34697 (EUR 1.86/Min,topc)
--------------------------------------------------------
Your SMS gateway: NEW at: http://sms.city-map.de
System partners wanted: http://www.franchise.city-map.de
--------------------------------------------------------
Commercial register: AG Stade HRB 5811 - VAT ID: DE 213645563
--------------------------------------------------------

#2Justin Clift
justin@postgresql.org
In reply to: Henrik Steffen (#1)
Re: Total crash of my db-server

Hi Henrik,

This *really* sounds like you have a system wide problem, not just a PostgreSQL problem.

Can't imagine how moving to MySQL will help with that. ;-)

What Operating System are you using, and when was the last time you patched/updated it with the vendor recommended patches?

Regards and best wishes,

Justin Clift


--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi

#3Ian Lawrence Barwick
barwick@gmail.com
In reply to: Justin Clift (#2)
Re: Total crash of my db-server

On Sunday 15 December 2002 15:16, Justin Clift wrote:

Hi Henrik,

This *really* sounds like you have a system wide problem, not just a
PostgreSQL problem.

Can't imagine how moving to MySQL will help with that. ;-)

What Operating System are you using, and when was the last time you
patched/updated it with the vendor recommended patches?

Additionally, have you considered the possibility of a hardware
problem? I had a fileserver once which worked perfectly in "normal"
service, but died regularly and inexplicably whenever large amounts
of data were transferred over the network to the backup machine.
Turned out to be a motherboard problem, possibly in combination
with some of the other components, because we were never able
to reproduce the problem outside of that particular machine...

Ian Barwick
barwick@gmx.net

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ian Lawrence Barwick (#3)
Re: Total crash of my db-server

This *really* sounds like you have a system wide problem, not just a
PostgreSQL problem.

Can't imagine how moving to MySQL will help with that. ;-)

Actually, moving to MySQL will make it worse. We can say with
confidence that a system lockup is not Postgres' fault because Postgres
does not (and will not) run as root. I'm not sure whether MySQL *must*
be root, but that seems to be a pretty common way of setting it up ...
and when you do that, you can't entirely exclude it from consideration
when you're looking at problems that would require root privileges to
cause.

Additionally, have you considered the possibility of a hardware
problem?

I tend to agree with Ian on that --- it sounds more like flaky hardware
than anything else. Time for memtest86 and some disk testing too.

regards, tom lane

#5Lee Harr
missive@frontiernet.net
In reply to: Henrik Steffen (#1)
Re: Total crash of my db-server

In article <00d601c2a443$7b7b7dd0$7100a8c0@henrik>, "Henrik Steffen" wrote:

sometimes I experience a total crash of my
db-server while e.g. doing automated maintenance tasks:

The computer crashes or just the database?
It is not clear from your description.

Always complete crashes... only reset helped.

reset postgres? or are you resetting the computer?

Most crashes occur during maintenance tasks.
However, there are some other crashes, too.

Is there any commonality between crashes? Are the
others maybe during daily/ weekly OS reporting?
(Generally, heavy disk activity)
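
One quick way to look for a pattern is to bucket the crash times by hour of day. The sketch below uses the eight times listed in the original post; the function name `crashes_per_hour` and the list literal are only for illustration.

```python
from collections import Counter

# Crash times copied from the list in the original post (24-hour clock).
CRASH_TIMES = [
    "02:42", "02:39", "10:34", "10:44",
    "02:19", "04:01", "04:02", "02:25",
]


def crashes_per_hour(times):
    """Count how many crashes fall into each hour of the day."""
    return Counter(int(t.split(":")[0]) for t in times)


if __name__ == "__main__":
    for hour, count in sorted(crashes_per_hour(CRASH_TIMES).items()):
        print(f"{hour:02d}:00  {count} crash(es)")
```

With these eight entries, six fall in the 02:00 and 04:00 hours and two around 10:30 in the morning, so the night-time cluster is clear but not the whole story.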

We are running some mysql-servers here, too, and more
and more often I find myself thinking about moving my whole
system to a mysql-server... my colleagues have NEVER
had such trouble with their mysql-servers....

Do you have any hints for me? What can I do? My last
choice would be to move to mysql, but I am almost
desperate....

You are running mysql on the same machine? Or are these
separate systems running mysql?

My first reaction is "hardware trouble" but without
more specifics it is tough to make a diagnosis. If
you have a spare box, that might be a quick way to
see if the problem is hardware related.

#6Kevin Brown
kevin@sysexperts.com
In reply to: Henrik Steffen (#1)
Re: Total crash of my db-server

Henrik Steffen wrote:

Hello all,

sometimes I experience a total crash of my
db-server while e.g. doing automated maintenance tasks:

[...]

Then, the script selects all user tables and starts
reindexing them. Tonight, reindexing of the first table
started and seconds later the whole server crashed.

No ping, nothing else possible....

If you can't ping the system then it means that the operating system
itself has stopped working properly (the networking stack is managed
solely by the operating system).

That means that you've either managed to tickle a bug in the operating
system itself or you have a hardware problem.
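
The distinction drawn here (host gone versus only the database gone) can be written down as a small check. This is a rough sketch, not a monitoring tool: `tcp_port_open` and `diagnose` are hypothetical helper names, the ping result is assumed to come from elsewhere, and probing the database over TCP assumes the server accepts TCP/IP connections at all (5432 is the usual PostgreSQL port).

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(host_answers_ping, db_port_open):
    """Turn the two observations into a rough pointer for where to look."""
    if not host_answers_ping:
        return "host unreachable: look at the OS, hardware, or network"
    if not db_port_open:
        return "host is up but the database port is closed: look at the database"
    return "host and database port both respond"
```

For the crashes described in this thread the first branch applies: the machine no longer answers ping, so the database server is not the first suspect.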

You didn't mention what OS you're running under but it's more likely
that you have a hardware problem than an OS bug.

Moving to MySQL won't help you here, I'm afraid. Only fixing your
hardware will.

If this is a system that you depend on for production, I recommend
that you use ECC memory if at all possible. At least then you won't
have to worry nearly as much about the possibility of bad RAM silently
causing errors...

--
Kevin Brown kevin@sysexperts.com

#7Henrik Steffen
steffen@city-map.de
In reply to: Henrik Steffen (#1)
Re: Total crash of my db-server

Dear Justin,

I am not sure whether it's really a hardware problem,
because I have had similar problems with different machines
and different os- and pgsql-versions before... If you
browse the archive you will find postings from me about
crashes and problems the last 2-3 years...

I can only say that the mysql-servers we are running
have never had similar trouble, and they run on identical
hardware and OS types under almost identical load.

Currently, I am running postgres 7.3 on Red Hat Linux
(kernel 2.4.19). The most important software packages are
always up to date.


#8Henrik Steffen
steffen@city-map.de
In reply to: Henrik Steffen (#1)
Re: Total crash of my db-server

yes, I have thought about it...

I am not sure if it's a hardware problem.

We upgraded to ECC-RAM recently and hoped it would
help, but it didn't.

It's a hardware RAID 1 system (mirroring) on IDE
hard drives.


#9Henrik Steffen
steffen@city-map.de
In reply to: Henrik Steffen (#1)
Re: Total crash of my db-server

hi tom,

ok, I understand this.

But: There is ONLY postgres running on this particular
machine. And it's mostly when backup (dumpall) and/or
vacuuming/reindexing is going on.

In my opinion, postgresql does something on my machine
that leads to these complete system lockups.


#10Henrik Steffen
steffen@city-map.de
In reply to: Henrik Steffen (#1)
Re: Total crash of my db-server

The whole computer crashes.

It's mostly during dumpalls (backup) and/or vacuuming
or reindexing...


#11Justin Clift
justin@postgresql.org
In reply to: Henrik Steffen (#1)
Re: Total crash of my db-server

Henrik Steffen wrote:

hi tom,

ok, I understand this.

But: There is ONLY postgres running on this particular
machine. And it's mostly when backup (dumpall) and/or
vacuuming/reindexing is going on.

In my opinion, postgresql does something on my machine
that leads to these complete system lockups.

It sounds like the system lockups are perhaps occurring due to disk I/O, with PostgreSQL being the program pushing the
disk load past what the system can handle.

How much load does this system normally have when there aren't dumps/vacuums/reindexes going on? I'm trying to understand
how much load your system normally copes with before locking up.

As a thought, if this is really being caused by disk I/O load, then it might be possible to trigger it on demand with disk
benchmarking programs (just an idle thought). That could be useful to know about.
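
As a rough stand-in for a disk benchmarking program, a script can write a large file, force it to disk, read it back and compare checksums. This is only a sketch under assumptions: `write_and_verify` is a made-up name, the sizes are arbitrary, and the read-back may be served from the operating system's cache rather than the disk unless the file is larger than RAM.

```python
import hashlib
import os
import tempfile


def write_and_verify(directory, size_mb=64, chunk_mb=1):
    """Write random data to a temp file, read it back, compare SHA-256.

    Returns True if the data read back matches what was written.
    """
    chunk_size = chunk_mb * 1024 * 1024
    written = hashlib.sha256()
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            for _ in range(size_mb // chunk_mb):
                chunk = os.urandom(chunk_size)
                written.update(chunk)
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # push the data out of the page cache

        read_back = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                read_back.update(chunk)
        return written.hexdigest() == read_back.hexdigest()
    finally:
        os.remove(path)
```

Running it in a loop against the database volume, with a size well above the installed RAM, would show whether plain disk load without PostgreSQL is enough to lock the machine up. A False result would point at data corruption on the way to or from the disk.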

Regards and best wishes,

Justin Clift


#12Jan Weerts
j.weerts@i-views.de
In reply to: Henrik Steffen (#10)
Re: Total crash of my db-server

Hi Steffen!

the whole computer crashes.

it's mostly during dumpalls (backup) and/or vacuuming
or reindexing...

From my experience with two different machines: We had this behaviour
on two servers under linux. Both of them were running postgres, but
database load did not necessarily coincide with a dead system.

After some problems we found out, that both cases could be solved with
different RAM configurations. The first machine was two years in use
and suddenly started to reboot during the day (not after hours). We
suspected an attack or broken hard drives. In the end we changed the
RAM and since then it is happily humming in its rack.

The second machine was brand-new and we wanted to put one gig of ram
in two dimm sockets. The machine was set up and postgres installed.
When we started to test the database and load the system we got kernel
panics or a totally unresponsive machine. In the end after a lot of
testing we removed one of the RAM modules and since then it is running
with just half a gig (which suffices for the application we will be
using it for). Different software-based RAM tests showed varying
results on each run, not reproducible. We suspect the chipset to be
broken in this respect despite its claimed ability to use these
modules.

So my guess here is, that since postgres is not running as root, it
cannot really "destroy" the kernel or anything vital. For this kind of
breakdown I usually blame Windows, but since this is Linux, I really
do suspect the hardware. Even if you are not experiencing this the
first time as you said in another post. Are the other machines loaded
(cpu and ram) by other applications or only postgres? If only
postgres, try some other ram and cpu consuming app and load the
machine heavily.
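
A very simple "other RAM and CPU consuming app" could look like the sketch below: it fills a buffer with a fixed pattern and hashes it over and over, and any round whose digest differs from the first one would hint at memory or CPU trouble. The name `burn` and the default sizes are assumptions for illustration; this is single-threaded, only exercises the memory Python happens to be given, and is no substitute for a dedicated memory tester.

```python
import hashlib


def burn(buffer_mb=256, rounds=20):
    """Hash the same buffer repeatedly and count digest mismatches.

    On healthy hardware the return value should always be 0.
    """
    # 256 distinct byte values repeated 4096 times make exactly 1 MiB.
    buf = bytes(range(256)) * (buffer_mb * 4096)
    reference = hashlib.sha256(buf).hexdigest()
    mismatches = 0
    for _ in range(rounds):
        if hashlib.sha256(buf).hexdigest() != reference:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    print("mismatching rounds:", burn())
```

Starting several copies at once, with a combined buffer size close to the installed RAM, loads the machine more heavily. A lockup under this load, with no database involved, would support the hardware theory.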

HTH
Jan
p.s.: we once had a temp, who we suspect to have zapped two ram
modules and two mainboards in just one month. And since the
first case proves aging of ram, I am prepared to blame hardware
in some cases.

#13Henrik Steffen
steffen@city-map.de
In reply to: Henrik Steffen (#1)
Re: Total crash of my db-server

Hi Justin,

The average load is usually around 0.5;
under heavier load it sometimes reaches 3.0 or even 7.0.

It's a dedicated postgresql machine. All accesses are made
by a webserver in the same subnet. There are about 15,000
daily users. Each request to the webserver triggers one or
more accesses to the database (using persistent connections,
mod_perl, squid as a proxy, etc.)

The webserver is set to MaxClients == 40 ... as far as I can
tell, this limit has never been reached. So there should
never be more than 40 concurrent postgresql processes.

When dumpall or reindexing / vacuum full is run at night,
the webserver is shut down first.

Disk benchmarking programs would perhaps be interesting
(which one do you suggest?)... but note: it's a production
server, and I have already had too much downtime this
month...


#14Shridhar Daithankar
shridhar_daithankar@persistent.co.in
In reply to: Henrik Steffen (#13)
Re: Total crash of my db-server

On Monday 16 December 2002 07:18 pm, you wrote:

disk benchmarking programs would perhaps be interesting
(which one do you suggest?)... but note: it's a production
server, and I have already had too much downtime this
month...

I suggest you run pgbench with 10M records/100,000 transactions/100 users. If
it is hardware error, it should go belly up for that.

I guess it should roughly take 2GB space for this test. Just FYI..
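
The numbers above translate into pgbench options roughly as follows. pgbench creates 100,000 account rows per unit of scaling factor, so 10M records means `-s 100`, and `-t` counts transactions per client. The sketch assumes the 100,000 transactions are meant as a total across all 100 clients (if they were meant per client, `-t` would be 100000 instead); the function name `pgbench_commands` and the database name are made up.

```python
# pgbench creates this many account rows per unit of scaling factor.
ROWS_PER_SCALE_UNIT = 100_000


def pgbench_commands(total_rows, total_transactions, clients, dbname):
    """Build the pgbench initialise and run command lines as argv lists."""
    if total_rows % ROWS_PER_SCALE_UNIT:
        raise ValueError("total_rows must be a multiple of 100,000")
    if total_transactions % clients:
        raise ValueError("total_transactions must divide evenly among clients")
    scale = total_rows // ROWS_PER_SCALE_UNIT
    per_client = total_transactions // clients  # -t is per client
    init = ["pgbench", "-i", "-s", str(scale), dbname]
    run = ["pgbench", "-c", str(clients), "-t", str(per_client), dbname]
    return init, run


if __name__ == "__main__":
    for argv in pgbench_commands(10_000_000, 100_000, 100, "bench"):
        print(" ".join(argv))
```

The two lists can be passed to `subprocess.run` as they are, preferably against a scratch database rather than the production one.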

HTH

Shridhar

#15Thomas Beutin
tyrone@laokoon.IN-Berlin.DE
In reply to: Henrik Steffen (#9)
Re: Total crash of my db-server

Hi,

On Mon, Dec 16, 2002 at 01:45:07PM +0100, Henrik Steffen wrote:

But: There is ONLY postgres running on this particular
machine. And it's mostly when backup (dumpall) and/or
vacuuming/reindexing is going on.

In my opinion, postgresql does something on my machine
that leads to these complete system lockups.

Maybe the problem is related to the old sig11 problem:
http://www.bitwizard.nl/sig11/

Greetings,
-tb


--
Thomas Beutin tb@laokoon.IN-Berlin.DE
Beam me up, Scotty. There is no intelligent live down in Redmond.

#16Tino Wildenhain
tino@wildenhain.de
In reply to: Henrik Steffen (#9)
Re: Total crash of my db-server

Hi Henrik,

--On Montag, 16. Dezember 2002 13:45 +0100 Henrik Steffen
<steffen@city-map.de> wrote:

hi tom,

ok, I understand this.

But: There is ONLY postgres running on this particular
machine. And it's mostly when backup (dumpall) and/or
vacuuming/reindexing is going on.

In my opinion, postgresql does something on my machine
that leads to these complete system lockups.

When you drive on a road and fall into a big hole, is
it your car's fault?

SCNR ;)

Regards
Tino

#17James Thompson
jamest@math.ksu.edu
In reply to: Justin Clift (#11)
Re: Total crash of my db-server

As a thought, if this is really being caused by disk I/O loads, then
it might be able to trigger it on demand with disk benchmarking
programs (just an idle thought). That could be useful to know about.

Sorry if this was mentioned previously, I didn't catch the start of this
thread.

I had a server that locked up at about the same time every day. It turned
out that a weak CPU cooling fan was causing gradual overheating. No clue why
the thermal protection wasn't kicking in. With a replacement fan, all my
issues went away. Drove me nuts for about a month. :)

Take Care

->->->->->->->->->->->->->->->->->->---<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<
James Thompson 138 Cardwell Hall Manhattan, Ks 66506 785-532-0561
Kansas State University Department of Mathematics
->->->->->->->->->->->->->->->->->->---<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<

#18Nigel J. Andrews
nandrews@investsystems.co.uk
In reply to: Justin Clift (#11)
Re: Total crash of my db-server

On Mon, 16 Dec 2002, Justin Clift wrote:

Henrik Steffen wrote:

hi tom,

ok, I understand this.

But: There is ONLY postgres running on this particular
machine. And it's mostly when backup (dumpall) and/or
vacuuming/reindexing is going on.

In my opinion, postgresql does something on my machine
that leads to these complete system lockups.

It sounds like the system lockups are perhaps occurring due to disk I/O, with PostgreSQL being the program pushing the
disk load past what the system can handle.

[edited]
As a thought, if this is really being caused by disk I/O load, then it might be possible to trigger it on demand with disk
benchmarking programs (just an idle thought). That could be useful to know about.

I'm coming into this late, don't know what's been said before in this thread
and considering the above mention of dumping I'm probably completely off the
charts on the uselessness of this question/suggestion but...

Are you using a 'lazy' memory allocation setup? Suddenly finding that
requested memory isn't really there, after being told it was when you
requested it, can have nasty effects.
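
On Linux, the "lazy" allocation policy can be inspected through /proc/sys/vm/overcommit_memory. The sketch below interprets the value using the meanings documented for current kernels (0, 1 and 2); whether a 2.4-series kernel supports all three modes is an assumption that should be checked against that kernel's own documentation. The function names are made up for this example.

```python
# Meanings as documented for current Linux kernels; older kernels may differ.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def describe_overcommit(raw):
    """Translate the raw file content into a human-readable description."""
    text = raw.strip()
    try:
        mode = int(text)
    except ValueError:
        return f"unrecognised value {text!r}"
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")


def read_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Read and describe the overcommit setting of the running kernel."""
    with open(path) as f:
        return describe_overcommit(f.read())
```

In modes 0 and 1 an allocation can succeed even when the memory is not actually available, and the shortage only shows up later when the pages are touched.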

I presume the normal talk of core dumps etc has happened.

--
Nigel Andrews

#19Tino Wildenhain
tino@wildenhain.de
In reply to: Henrik Steffen (#7)
Re: Total crash of my db-server

Hi Henrik,

--On Montag, 16. Dezember 2002 13:40 +0100 Henrik Steffen
<steffen@city-map.de> wrote:

Dear Justin,

I am not sure whether it's really a hardware problem,
because I have had similar problems with different machines
and different os- and pgsql-versions before... If you
browse the archive you will find postings from me about
crashes and problems the last 2-3 years...

I can only say that the mysql-servers we are running
have never had similar trouble, and they run on identical
hardware and OS types under almost identical load.

Currently, I am running postgres 7.3 on Red Hat Linux
(kernel 2.4.19). The most important software packages are
always up to date.

The situation is, there are many, many people out there who use
this RDBMS with big or even very large databases. In our case we
are at about 18 GB.

If the DB crashed (which it does not in our case), I'd
possibly blame the DB software. If the OS crashes, I'd
certainly blame the OS or the hardware. Whatever the software
does, it cannot crash the system unless it's running in
kernel space. PostgreSQL is not a driver that accesses hardware.

It might be that postgresql triggers problematic details in
your setup (it uses large memory areas and depends on task switching
and signal handling), but even then the setup is problematic,
not postgresql.

Regards
Tino

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Henrik Steffen (#9)
Re: Total crash of my db-server

"Henrik Steffen" <steffen@city-map.de> writes:

In my opinion, postgresql does something on my machine
that leads to these complete system lockups.

Once again: postgres is an unprivileged application. It can *not* lock
up the machine that way. You're dealing with either a hardware fault
or a kernel bug --- evidently one that only appears under heavy load,
but that doesn't make it postgres' fault.

I'd suggest asking some kernel hackers for debugging help.

regards, tom lane

#21scott.marlowe
scott.marlowe@ihs.com
In reply to: Henrik Steffen (#1)
#22Brian Hirt
bhirt@mobygames.com
In reply to: scott.marlowe (#21)
#23Kevin Brown
kevin@sysexperts.com
In reply to: Nigel J. Andrews (#18)
#24Kevin Brown
kevin@sysexperts.com
In reply to: Henrik Steffen (#9)
#25Terry Fielder
terry@ashtonwoodshomes.com
In reply to: Kevin Brown (#24)
#26scott.marlowe
scott.marlowe@ihs.com
In reply to: Kevin Brown (#24)
#27Ken Godee
ken@perfect-image.com
In reply to: scott.marlowe (#26)
#28Lincoln Yeoh
lyeoh@pop.jaring.my
In reply to: scott.marlowe (#26)
#29scott.marlowe
scott.marlowe@ihs.com
In reply to: Lincoln Yeoh (#28)
#30Shridhar Daithankar
shridhar_daithankar@persistent.co.in
In reply to: scott.marlowe (#29)