Re: killing process question

Started by Johnson, Shaunn | over 23 years ago | 12 messages | hackers, general
#1 Johnson, Shaunn
SJohnson6@bcbsm.com
hackers, general

--howdy:

--not that the process is doing a lot or taking up
--a lot of resources, it's just something
--that i allow the users to kill and then
--it gets passed to me for correction if the
--simple 'kill <pid>' thing doesn't work.

--what i'm trying to understand is if there
--is a way to do this without having to restart
--the database (remember, it's still production)
--every time there is a runaway process AND not
--kill -9 <pid>.

--how can i do this?

-X

-----Original Message-----
From: Shridhar Daithankar [mailto:shridhar_daithankar@persistent.co.in]
Sent: Thursday, September 19, 2002 10:45 AM
To: 'pgsql-general@postgresql.org'
Subject: Re: [GENERAL] killing process question

On 19 Sep 2002 at 10:39, Johnson, Shaunn wrote:

--thanks for the reply:

--no, I don't see anything like that. this is what I have:

[snip]

postgres 3488 5.6 0.0 11412 4 pts/4 T Sep18 88:53 postgres: joe testdb 16.xx.xx.xx SELECT
[/snip]

--this tells me that this proc had been running once upon a time (since the
--18th) and has stopped (the 'T'). the user has said that he had since killed
--the tool that connected to the database and booted his machine.

--so ... when I do a 'kill pid' or 'kill -TERM pid' ... *poof* ... nothing
--happens ...

Does restarting the database help? It may just make the thing go away..

Or stop the database, kill the pid with -9 and start it again.. Nothing lost..

Bye
Shridhar

--
Shedenhelm's Law: All trails have more uphill sections than they have
downhill sections.

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

#2 Shridhar Daithankar
shridhar_daithankar@persistent.co.in
In reply to: Johnson, Shaunn (#1)
hackers, general

On 19 Sep 2002 at 11:19, Johnson, Shaunn wrote:

> --howdy:
> --not that the process is doing a lot or taking up
> --a lot of resources, it's just something
> --that i allow the users to kill and then
> --it gets passed to me for correction if the
> --simple 'kill <pid>' thing doesn't work.
> --what i'm trying to understand is if there
> --is a way to do this without having to restart
> --the database (remember, it's still production)
> --every time there is a runaway process AND not
> --kill -9 <pid>.
> --how can i do this?

I did a quick 'grep -rin' on the postgresql source code I have (CVS, a week old).
Looks like the postgresql backend is ignoring the SIGPIPE which is delivered to
the backend process when the other end is closed. Obviously this is going to
cause hanging back-ends.

I guess a backend should terminate as if connection is closed. What say?

Bye
Shridhar

--
Guillotine, n.: A French chopping center.

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Shridhar Daithankar (#2)
hackers, general

"Shridhar Daithankar" <shridhar_daithankar@persistent.co.in> writes:

> I guess a backend should terminate as if connection is closed. What say?

No.

It will terminate when it tries to read the next query from the client.

regards, tom lane
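
To illustrate the point above, that a server process notices a closed connection only when it next reads from it, here is a minimal Python sketch. It is not PostgreSQL code: a local socket pair stands in for the client/backend connection, and the function name read_after_peer_close is made up for this example.

```python
import socket


def read_after_peer_close():
    """Show what a reader sees after the other end closes its socket."""
    # A connected pair of stream sockets; one plays the client, one the server.
    server_side, client_side = socket.socketpair()
    with server_side:
        client_side.sendall(b"SELECT 1;")
        client_side.close()               # the client goes away cleanly

        first = server_side.recv(1024)    # data that was sent before the close
        second = server_side.recv(1024)   # empty bytes object: end of stream
        return first, second
```

This only covers a client that closes its socket. If the client machine disappears without closing the connection, nothing arrives at the server end, and a blocking read can keep waiting until the operating system itself gives up on the connection.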

#4 Johnson, Shaunn
SJohnson6@bcbsm.com
In reply to: Tom Lane (#3)
hackers, general

--okay, but the client has since terminated
--its session (if i understand you correctly).

--is this just something that will just have to
--hang around until i shut down the database / boot
--the machine?

-X

#5 Shridhar Daithankar
shridhar_daithankar@persistent.co.in
In reply to: Tom Lane (#3)
hackers, general

On 19 Sep 2002 at 11:49, Tom Lane wrote:

> "Shridhar Daithankar" <shridhar_daithankar@persistent.co.in> writes:
> > I guess a backend should terminate as if connection is closed. What say?
>
> No.
>
> It will terminate when it tries to read the next query from the client.

OK. But what if it never reads anything? I mean, if the client dies after a
complete transaction, i.e. with no input pending for either the backend or the
client, will it just sit around waiting for select() to signal that fd? (AFAIU,
that's how things go in there..)

Clearly we have a case where a backend is presumably hung. Either it has to have
an explanation (OK, the client did abort abruptly) and/or a possible corrective
action..

Just some thoughts..

Bye
Shridhar

--
QOTD: "I won't say he's untruthful, but his wife has to call the dog for
dinner."

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Johnson, Shaunn (#4)
hackers, general

"Johnson, Shaunn" <SJohnson6@bcbsm.com> writes:

> --okay, but the client has since terminated
> --its session (if i understand you correctly).
> --is this just something that will just have to
> --hang around until i shut down the database / boot
> --the machine?

I dunno. Are you sure this is a backend process? What is it doing
(or not doing) ... is it chewing any CPU cycles? What status does it
show in ps?

regards, tom lane

#7 Johnson, Shaunn
SJohnson6@bcbsm.com
In reply to: Tom Lane (#6)
general

--howdy:

--that's just it: it's not doing a lot of anything.
--not munching on CPU cycles or anything like that.
--'ps auwx' shows that the process is stopped
--(displays a 'T' in the line, i.e.:)

[snip]
postgres 3488 5.3 0.0 11412 4 pts/4 T Sep18 88:53 postgres: joe testdb 16.xx.xx.xx SELECT
[/snip]

--to clarify, the pid 3488 is the backend connection to the
--database that i should be killing, right? that's what i've
--BEEN killing, so, if i am missing something, steer me in the
--right way, please.

--having said that, there are processes that do NOT have
--the 'T' (stopped) code (just 'S' - sleep ... i have yet
--to see any 'Z' zombie processes). some of these are
--also non-responsive to the user or to me trying to use
--kill <pid>.

--any suggestions?

-X
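
An aside on the 'T' state in the ps output above: on POSIX systems a stopped process does not act on a pending SIGTERM until it is continued, which would be consistent with 'kill <pid>' appearing to do nothing. Whether that is what happened to pid 3488 is not established anywhere in this thread. The Python sketch below only demonstrates the general behaviour on a throwaway child process; the function name term_while_stopped and the settle delay are made up for this example, and nothing here is a recommendation to send signals to a production backend.

```python
import os
import signal
import subprocess
import sys
import time


def term_while_stopped(settle=0.5):
    """Stop a throwaway child, send it SIGTERM, then SIGCONT.

    Returns (alive_while_stopped, returncode).
    """
    # A harmless child that just sleeps; it stands in for the stuck process.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    try:
        os.kill(child.pid, signal.SIGSTOP)
        # Block until the kernel reports the child as stopped (ps would show 'T').
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        if not os.WIFSTOPPED(status):
            raise RuntimeError("child did not stop as expected")

        os.kill(child.pid, signal.SIGTERM)   # stays pending while the child is stopped
        time.sleep(settle)
        alive_while_stopped = child.poll() is None

        os.kill(child.pid, signal.SIGCONT)   # resume; the pending SIGTERM now takes effect
        returncode = child.wait(timeout=5)
        return alive_while_stopped, returncode
    finally:
        if child.poll() is None:             # clean up if anything above failed
            child.kill()
            child.wait()
```

A negative return code from subprocess means the child was ended by that signal number, so the second value reports which signal finally terminated the child once it was continued.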

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Johnson, Shaunn (#7)
general

"Johnson, Shaunn" <SJohnson6@bcbsm.com> writes:

> --that's just it: it's not doing a lot of anything.
> --not munching on CPU cycles or anything like that.
> --'ps auwx' shows that the process is stopped
> --(displays a 'T' in the line, i.e.:)
>
> [snip]
> postgres 3488 5.3 0.0 11412 4 pts/4 T Sep18 88:53 postgres: joe testdb 16.xx.xx.xx SELECT
> [/snip]

Interesting. Maybe it's waiting for a lock that someone else has?

Can you attach to the process with gdb and get a stack trace to show
where it is, exactly?

$ gdb /path/to/postgres-executable
gdb> attach 3488
gdb> bt
gdb> quit
okay to detach? y

regards, tom lane
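
The interactive gdb session above can also be run non-interactively. The following Python sketch wraps that idea with a timeout so the caller is not left waiting if the attach never completes. It assumes a gdb that supports the -batch and -ex options (current releases do; an older gdb may not), and the function names gdb_backtrace_command and run_backtrace are made up for this example.

```python
import subprocess


def gdb_backtrace_command(pid, executable):
    """Build a gdb command line that attaches to `pid`, prints a backtrace, and exits."""
    # "gdb PROGRAM PID" attaches to a running process; -batch makes gdb exit
    # after running the commands given with -ex.
    return ["gdb", "-batch", "-ex", "bt", executable, str(int(pid))]


def run_backtrace(pid, executable, timeout=30):
    """Run the command above and return gdb's output, giving up after `timeout` seconds."""
    result = subprocess.run(
        gdb_backtrace_command(pid, executable),
        capture_output=True,
        text=True,
        timeout=timeout,   # raises subprocess.TimeoutExpired if gdb does not finish
    )
    return result.stdout
```

For the process discussed here the call would look like run_backtrace(3488, "/path/to/postgres-executable"), run as a user allowed to attach to that process.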

#9 Johnson, Shaunn
SJohnson6@bcbsm.com
In reply to: Tom Lane (#8)
general

--sorry, i meant to send this to the group.

--generally speaking, how long should this run?

[snip]

(gdb) attach 3488
Attaching to program: /usr/bin/postgres, process 3488

[/snip]

--i know i should be patient, but i'm trying to figure
--out if this should take more than 20 minutes or if
--i've done something wrong. (OR, should that be postmaster
--and not postgres in the above line?)

-X

#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Johnson, Shaunn (#9)
general

"Johnson, Shaunn" <SJohnson6@bcbsm.com> writes:

> --generally speaking, how long should this run?
>
> (gdb) attach 3488
> Attaching to program: /usr/bin/postgres, process 3488

Not very long --- it takes a couple seconds, for me, on a machine that's
not fast by today's standards.

> --i know i should be patient, but i'm trying to figure
> --out if this should take more than 20 minutes or if
> --i've done something wrong. (OR, should that be postmaster
> --and not postgres in the above line?)

postgres is correct. But are you sure that you are pointing to the same
postgres executable that the process is running? Perhaps gdb could get
confused if you point to the wrong version.

regards, tom lane
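
One way to check which executable a process is actually running, assuming a Linux system with /proc mounted (the thread does not say which operating system is in use), is to read the /proc/<pid>/exe link. The Python sketch below does that; the function names executable_of and same_binary are made up for this example, and reading another user's process normally requires being that user or root.

```python
import os


def executable_of(pid):
    """Return the path of the binary that `pid` is running (Linux /proc only)."""
    return os.readlink(f"/proc/{pid}/exe")


def same_binary(pid, candidate_path):
    """True if `candidate_path` is the very file that `pid` is executing."""
    # samefile compares device and inode, so symlinks and alternate
    # spellings of the same path still match.
    return os.path.samefile(f"/proc/{pid}/exe", candidate_path)
```

For the case in this thread, same_binary(3488, "/usr/bin/postgres") would say whether the path handed to gdb matches the running backend.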

#11 Johnson, Shaunn
SJohnson6@bcbsm.com
In reply to: Tom Lane (#10)
general

--i knew this was going to happen.

--tried to kill a proc and the database went
--into recovery mode.

--i tried to put the rotatelogs portion in the
--startup script and got a message about
--putting the rotatelogs into the httpd.conf file.

--i can see that today is going to be a long day.

--thanks for all the help and good info.

-X

#12 Nigel J. Andrews
nandrews@investsystems.co.uk
In reply to: Tom Lane (#10)
general

On Thu, 19 Sep 2002, Tom Lane wrote:

> "Johnson, Shaunn" <SJohnson6@bcbsm.com> writes:
> > --generally speaking, how long should this run?
> >
> > (gdb) attach 3488
> > Attaching to program: /usr/bin/postgres, process 3488
>
> Not very long --- it takes a couple seconds, for me, on a machine that's
> not fast by today's standards.
>
> > --i know i should be patient, but i'm trying to figure
> > --out if this should take more than 20 minutes or if
> > --i've done something wrong. (OR, should that be postmaster
> > --and not postgres in the above line?)
>
> postgres is correct. But are you sure that you are pointing to the same
> postgres executable that the process is running? Perhaps gdb could get
> confused if you point to the wrong version.

Another shot in the dark - could the process have blocked the relevant signal
(SIGTRAP) sent by the debugger?

--
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants