Postmaster won't -HUP

Started by Jerry Lyndealmost 26 years ago14 messagesgeneral
Jump to latest
#1Jerry Lynde
jlynde@diligence.com

Hello out there,

I'm having a problem with a production server. Actually, there are two
problems. The semi-trivial problem is that Postgres won't die using the
service mechanism. As root, I "service postgres stop" and then "service
postgres start" after a reasonable wait. The restart will earn me a
"StreamServerPort: cannot bind to port" which indicates tht the process
never died. A ps ax confirms the persistance of postmaster. When I kill -9
the processes (postmaster, the /bin/sh -c postgres, and logger) they
process will claim to start with "service postgres start" but it reports no
PID and doesn't show up in ps ax. It is clearly not running at this point.

The real problem, which caused all this debugging, is that twice so far,
for no apparent reason, I have pegged the processors on the server. The
machine has two 500mHz processors with 256 MB ram. I have a hard time
believing that one small query can bring that machine to its knees, but it
has, twice. The queries were run through a hard coded php front end (for
testing purposes). Any insight on these two problems would be appreciated
greatly.

thanks,

Jerry Lynde

Jerry Lynde
IT - Invocation/Evocation, Banishing, et al.
"In E-commerce, the E stands for Evil."
Due Diligence Inc.
http://www.diligence.com
Phone: (406) 728-0001 x232
Fax: (406) 728-0006

#2Joseph Shraibman
jks@selectacast.net
In reply to: Jerry Lynde (#1)
Re: Postmaster won't -HUP

In version 7.0 postgres waits for all clients to close their connections
before exiting. Before it just quit.

Jerry Lynde wrote:

Show quoted text

Hello out there,

I'm having a problem with a production server. Actually, there are two
problems. The semi-trivial problem is that Postgres won't die using the
service mechanism. As root, I "service postgres stop" and then "service
postgres start" after a reasonable wait. The restart will earn me a
"StreamServerPort: cannot bind to port" which indicates tht the process
never died. A ps ax confirms the persistance of postmaster. When I kill -9
the processes (postmaster, the /bin/sh -c postgres, and logger) they
process will claim to start with "service postgres start" but it reports no
PID and doesn't show up in ps ax. It is clearly not running at this point.

The real problem, which caused all this debugging, is that twice so far,
for no apparent reason, I have pegged the processors on the server. The
machine has two 500mHz processors with 256 MB ram. I have a hard time
believing that one small query can bring that machine to its knees, but it
has, twice. The queries were run through a hard coded php front end (for
testing purposes). Any insight on these two problems would be appreciated
greatly.

thanks,

Jerry Lynde

Jerry Lynde
IT - Invocation/Evocation, Banishing, et al.
"In E-commerce, the E stands for Evil."
Due Diligence Inc.
http://www.diligence.com
Phone: (406) 728-0001 x232
Fax: (406) 728-0006

#3Ed Loehr
eloehr@austin.rr.com
In reply to: Jerry Lynde (#1)
Re: Postmaster won't -HUP

Jerry Lynde wrote:

I'm having a problem with a production server. Actually, there are two
problems. The semi-trivial problem is that Postgres won't die using the
service mechanism. As root, I "service postgres stop" and then "service
postgres start" after a reasonable wait. The restart will earn me a
"StreamServerPort: cannot bind to port" which indicates tht the process
never died. A ps ax confirms the persistance of postmaster. When I kill -9
the processes (postmaster, the /bin/sh -c postgres, and logger) they
process will claim to start with "service postgres start" but it reports no
PID and doesn't show up in ps ax. It is clearly not running at this point.
The real problem, which caused all this debugging, is that twice so far,
for no apparent reason, I have pegged the processors on the server. The
machine has two 500mHz processors with 256 MB ram. I have a hard time
believing that one small query can bring that machine to its knees, but it
has, twice. The queries were run through a hard coded php front end (for
testing purposes). Any insight on these two problems would be appreciated
greatly.

Hello Jerry,

What version are you running?

What does your server log show? Any other logs?

Can you show us the small query, # of rows, etc?

Regards,
Ed Loehr

#4Martijn van Oosterhout
kleptog@svana.org
In reply to: Jerry Lynde (#1)
Re: Postmaster won't -HUP

Jerry Lynde wrote:

Hello out there,

I'm having a problem with a production server. Actually, there are two
problems. The semi-trivial problem is that Postgres won't die using the
service mechanism. As root, I "service postgres stop" and then "service
postgres start" after a reasonable wait. The restart will earn me a
"StreamServerPort: cannot bind to port" which indicates tht the process
never died. A ps ax confirms the persistance of postmaster. When I kill -9
the processes (postmaster, the /bin/sh -c postgres, and logger) they
process will claim to start with "service postgres start" but it reports no
PID and doesn't show up in ps ax. It is clearly not running at this point.

Hmm, on debian it always dies, though it does takes a while sometimes.
Also, try to avoid kill -9 because then it can't clean up shared memory,
locks, temporary files, etc. Try kill -INT or -TERM.

The real problem, which caused all this debugging, is that twice so far,
for no apparent reason, I have pegged the processors on the server. The
machine has two 500mHz processors with 256 MB ram. I have a hard time
believing that one small query can bring that machine to its knees, but it
has, twice. The queries were run through a hard coded php front end (for
testing purposes). Any insight on these two problems would be appreciated
greatly.

Heh. Yes, I also wonder why "select * from bigtable" where bigtable is
a 400,000 row table crashes the machine also :). (This is pg_dump btw).
Maybe you should post the query.

Actually, there is one other situation I've killed the machine nearly.
I have two big tables, "bigtable" and "largetable". Then do this
query:

select * from bigtable where largetable.a = 1;

(ofcourse the actual query was much longer but this one demonstrates the
problem). It actually does a join between those two tables even though
only one is mentioned in the from part. Running explain over this told
that this would take a *very* long time to complete.

This is probably not a problem in newer versions though. What version
are you using?
--
Martijn van Oosterhout <kleptog@cupid.suninternet.com>
http://cupid.suninternet.com/~kleptog/

#5Jerry Lynde
jlynde@diligence.com
In reply to: Martijn van Oosterhout (#4)
Re: Postmaster won't -HUP

Wow! Thanks for all the replies!

The version I'm running is 7.0.0

It could certainly be that the connections are remaining open and the
server isn't dying due to that. I'll play with it a little when I get the
chance, probably this weekend (when clients aren't hitting the server.)
Thanks for the tip on -INT and -TERM.

As for the query I'm running, it was simply select * from bigtable (about
2-300k lines) where
firstname=<obscure fname> and
lastname=<obscure lname> and
DOB=<the exact DOB for the above named individual>

Essentially, the query was hard-coded to pull a specific record from the
data and disply it on the page. While the dataset is medium to large
(IMHO) it shouldn't peg two 500's. The problem happened the first time
reliably when I ran the query. I was using phpdb for the php interface (I
have stopped using that for the time being, because it maxes the system
every time.)

Jerry Lynde
IT - Invocation/Evocation, Banishing, et al.
"In E-commerce, the E stands for Evil."
Due Diligence Inc.
http://www.diligence.com
Phone: (406) 728-0001 x232
Fax: (406) 728-0006

#6Ed Loehr
eloehr@austin.rr.com
In reply to: Jerry Lynde (#1)
Re: Postmaster won't -HUP

Jerry Lynde wrote:

As for the query I'm running, it was simply select * from bigtable (about
2-300k lines) where
firstname=<obscure fname> and
lastname=<obscure lname> and
DOB=<the exact DOB for the above named individual>

What indices do you have on those fields?

Regards,
Ed Loehr

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ed Loehr (#6)
Re: Postmaster won't -HUP

Ed Loehr <eloehr@austin.rr.com> writes:

Jerry Lynde wrote:

As for the query I'm running, it was simply select * from bigtable (about
2-300k lines) where
firstname=<obscure fname> and
lastname=<obscure lname> and
DOB=<the exact DOB for the above named individual>

What indices do you have on those fields?

And is it using them --- ie, what does EXPLAIN say about the query?

regards, tom lane

#8Jerry Lynde
jlynde@diligence.com
In reply to: Ed Loehr (#6)
Re: Postmaster won't -HUP

At 12:11 PM 6/1/00 -0500, Ed Loehr wrote:

Jerry Lynde wrote:

As for the query I'm running, it was simply select * from

bigtable (about

2-300k lines) where
firstname=<obscure

fname> and

lastname=<obscure

lname> and

DOB=<the exact DOB for

the above named individual>

What indices do you have on those fields?

Regards,
Ed Loehr

They are all indexed, the DOB index is actually DOBYear DOBDay and
DOBMonth and all 5 fields are indexed

#9Jerry Lynde
jlynde@diligence.com
In reply to: Tom Lane (#7)
Re: Postmaster won't -HUP

At 01:21 PM 6/1/00 -0400, Tom Lane wrote:

Ed Loehr <eloehr@austin.rr.com> writes:

Jerry Lynde wrote:

As for the query I'm running, it was simply select * from bigtable (about
2-300k lines) where
firstname=<obscure fname> and
lastname=<obscure lname> and
DOB=<the exact DOB for the above named individual>

What indices do you have on those fields?

And is it using them --- ie, what does EXPLAIN say about the query?

regards, tom lane

It started indexing on DOBMonth until I did a VACUUM ANALYZE, after which
it uses the lastname index

#10Joseph Shraibman
jks@selectacast.net
In reply to: Jerry Lynde (#1)
Re: Postmaster won't -HUP

Jerry Lynde wrote:

At 12:11 PM 6/1/00 -0500, Ed Loehr wrote:

Jerry Lynde wrote:

As for the query I'm running, it was simply select * from

bigtable (about

2-300k lines) where
firstname=<obscure

fname> and

lastname=<obscure

lname> and

DOB=<the exact DOB for

the above named individual>

What indices do you have on those fields?

Regards,
Ed Loehr

They are all indexed, the DOB index is actually DOBYear DOBDay and
DOBMonth and all 5 fields are indexed

Do you have 5 indexes or do you have an index that spans more than one
field?

#11Jerry Lynde
jlynde@diligence.com
In reply to: Joseph Shraibman (#10)
Re: Postmaster won't -HUP

At 05:19 PM 6/1/00 -0400, Joseph Shraibman wrote:

Jerry Lynde wrote:

At 12:11 PM 6/1/00 -0500, Ed Loehr wrote:

Jerry Lynde wrote:

As for the query I'm running, it was simply select * from

bigtable (about

2-300k lines) where
firstname=<obscure

fname> and

lastname=<obscure

lname> and

DOB=<the exact DOB for

the above named individual>

What indices do you have on those fields?

Regards,
Ed Loehr

They are all indexed, the DOB index is actually DOBYear DOBDay and
DOBMonth and all 5 fields are indexed

Do you have 5 indexes or do you have an index that spans more than one
field?

Sorry for being less than explicit. There are 5 separate indices, one per
field.

Jerry Lynde
IT - Invocation/Evocation, Banishing, et al.
"In E-commerce, the E stands for Evil."
Due Diligence Inc.
http://www.diligence.com
Phone: (406) 728-0001 x232
Fax: (406) 728-0006

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jerry Lynde (#11)
Re: Postmaster won't -HUP

Jerry Lynde <jlynde@diligence.com> writes:

They are all indexed, the DOB index is actually DOBYear DOBDay and
DOBMonth and all 5 fields are indexed

Do you have 5 indexes or do you have an index that spans more than one
field?

Sorry for being less than explicit. There are 5 separate indices, one per
field.

So your query is really something more like

... WHERE firstname = 'joe' AND lastname = 'blow' AND
DOByear = 1999 AND DOBmonth = 1 AND DOBday = 1

?

The problem here is that only one index can be used in any individual
scan. If I were the optimizer I'd probably figure that lastname is
going to be the most selective of the five available choices, too.

I'd suggest storing the DOB as *one* field of type 'date'. You can
pull out the subparts for display with date_part() when you need to,
but for searches you'll be a lot better off with

WHERE DOB = '1999-01-01'

regards, tom lane

#13Jerry Lynde
jlynde@diligence.com
In reply to: Tom Lane (#12)
Re: Postmaster won't -HUP

At 05:58 PM 6/1/00 -0400, you wrote:
Jerry Lynde <jlynde@diligence.com> writes:

They are all indexed, the DOB index is actually DOBYear DOBDay and
DOBMonth and all 5 fields are indexed

Do you have 5 indexes or do you have an index that spans more than one
field?

Sorry for being less than explicit. There are 5 separate indices, one per
field.

So your query is really something more like

... WHERE firstname = 'joe' AND lastname = 'blow' AND
DOByear = 1999 AND DOBmonth = 1 AND DOBday = 1

?

yes

The problem here is that only one index can be used in any individual
scan. If I were the optimizer I'd probably figure that lastname is
going to be the most selective of the five available choices, too.

and it did, and that's ok

I'd suggest storing the DOB as *one* field of type 'date'. You can
pull out the subparts for display with date_part() when you need to,
but for searches you'll be a lot better off with

WHERE DOB = '1999-01-01'

regards, tom lane

Thanks for the tip. I might indeed take that approach in the future,
however that's not really the problem I'm trying to tackle right now.
Indexing by Last Name is fine with me, currently. What's not working for me
is the part where the dual pentium 500 machine with 256MB RAM goes into
deep thought indefinitely for one simple hard-coded query.
I used to think that the problem was due to the phpdb module that I was
invoking, since the behavior exhibited itself consistently doing the
aforementioned query with the phpdb module. Using nothing but straight php
I have been able to make the query run smoothly.
The reason I no longer believe the problem was tied to phpdb is that the
behavior with the processors (all processor time devoted to user processes)
happened when I was not making use of phpdb anymore. In fact I wasn't even
making queries at the time, so it may not be tied to postgres at all, but I
suspect it might, since the problem happens at random currently, but was
consistent and predictable with the phpdb-driven postgres query.

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jerry Lynde (#13)
Re: Postmaster won't -HUP

Jerry Lynde <jlynde@diligence.com> writes:

Thanks for the tip. I might indeed take that approach in the future,
however that's not really the problem I'm trying to tackle right now.
Indexing by Last Name is fine with me, currently. What's not working for me
is the part where the dual pentium 500 machine with 256MB RAM goes into
deep thought indefinitely for one simple hard-coded query.

Ah, sorry ... I've been seeing so many optimizer questions lately that
I tend to zero right in on anything that looks like a misoptimization
issue.

I'm not aware of any reason that a query such as you describe would
tend to hang up the machine. It would be useful to know what you see
in "top" or some other monitoring program when the problem happens.
Is there just one backend process sucking all the CPU time? More than
one? Is the process(es) memory usage stable, or climbing?

An even more useful bit of info is a stack trace from a backend that's
suffering the problem: if you do a "kill -ABORT" on it you should get
a coredump and be able to backtrace with gdb. (Note this will cause
a database system restart, ie all the other backends will commit
harakiri too, so I wouldn't advise doing it during normal usage of the
system.)

regards, tom lane