outdated (bad) information in pg_stat_activity

Started by hubert depesz lubaczewskiover 20 years ago12 messagesgeneral
Jump to latest

hi
my pg_stat_activity is raporting process working on a query for 12 hours
straight! (query_start).
2 problems:
1. this particular query runs typically between 80 and 200 miliseconds!
2. procpid of this backend does not exists anymore in system

i found this:
http://www.issociate.de/board/post/264560/possibly_outdated_info_in_pg_stat_activity.html

is there any workaround? what can be done to fix the data - i am using
pg_stat_activity a lot, and this thing really ruined my day.

best regards

depesz

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: hubert depesz lubaczewski (#1)
Re: outdated (bad) information in pg_stat_activity

hubert depesz lubaczewski <depesz@gmail.com> writes:

my pg_stat_activity is raporting process working on a query for 12 hours
straight! (query_start).
2 problems:
1. this particular query runs typically between 80 and 200 miliseconds!
2. procpid of this backend does not exists anymore in system

It's possible that the "backend exited" message got dropped by the stats
mechanism --- the stats code is deliberately designed to lose messages
rather than delay the rest of the system, so it's not 100% reliable.

regards, tom lane

In reply to: Tom Lane (#2)
Re: outdated (bad) information in pg_stat_activity

On 12/14/05, Tom Lane <tgl@sss.pgh.pa.us> wrote:

hubert depesz lubaczewski <depesz@gmail.com> writes:

my pg_stat_activity is raporting process working on a query for 12 hours
straight! (query_start).
2 problems:
1. this particular query runs typically between 80 and 200 miliseconds!
2. procpid of this backend does not exists anymore in system

It's possible that the "backend exited" message got dropped by the stats
mechanism --- the stats code is deliberately designed to lose messages
rather than delay the rest of the system, so it's not 100% reliable.

is it possible to "by hand" remove bad entries from this stats?

depesz

#4Chris Browne
cbbrowne@acm.org
In reply to: hubert depesz lubaczewski (#1)
Re: outdated (bad) information in pg_stat_activity

depesz@gmail.com (hubert depesz lubaczewski) writes:

On 12/14/05, Tom Lane <tgl@sss.pgh.pa.us> wrote:

hubert depesz lubaczewski <depesz@gmail.com> writes:

my pg_stat_activity is raporting process working on a query for 12 hours
straight! (query_start).
2 problems:
1. this particular query runs typically between 80 and 200 miliseconds!
2. procpid of this backend does not exists anymore in system

It's possible that the "backend exited" message got dropped by the stats
mechanism --- the stats code is deliberately designed to lose messages
rather than delay the rest of the system, so it's not 100% reliable.

is it possible to "by hand" remove bad entries from this stats?
depesz

The only answer I have been able to come to is that restarting the
postmaster will clear this all up.

If there is some less intrusive way of accomplishing this, I'd be keen
on hearing about it...
--
"cbbrowne","@","cbbrowne.com"
http://cbbrowne.com/info/nonrdbms.html
"Funny, the only thing that makes me go Keanu about Microsoft is the
fact that they are constantly behind the times and yet claim to be
innovating." -- Steve Lamb <morpheus@despair.rpglink.com>

#5Gregory S. Williamson
gsw@globexplorer.com
In reply to: Chris Browne (#4)
Re: outdated (bad) information in pg_stat_activity

I too have had issues with "stuck" entries in the pg_stat_activity view -- the real pid is long gone but postgres still thinks the process is there.

It would be real sweet to have a way of cleaning this table with bringing postgres off-line.

My $0.02 worth ...

Greg Williamson
DBA
GlobeXplorer LLC

-----Original Message-----
From: pgsql-general-owner@postgresql.org on behalf of Chris Browne
Sent: Wed 12/14/2005 11:06 AM
To: pgsql-general@postgresql.org
Cc:
Subject: Re: [GENERAL] outdated (bad) information in pg_stat_activity
depesz@gmail.com (hubert depesz lubaczewski) writes:

On 12/14/05, Tom Lane <tgl@sss.pgh.pa.us> wrote:

hubert depesz lubaczewski <depesz@gmail.com> writes:

my pg_stat_activity is raporting process working on a query for 12 hours
straight! (query_start).
2 problems:
1. this particular query runs typically between 80 and 200 miliseconds!
2. procpid of this backend does not exists anymore in system

It's possible that the "backend exited" message got dropped by the stats
mechanism --- the stats code is deliberately designed to lose messages
rather than delay the rest of the system, so it's not 100% reliable.

is it possible to "by hand" remove bad entries from this stats?
depesz

The only answer I have been able to come to is that restarting the
postmaster will clear this all up.

If there is some less intrusive way of accomplishing this, I'd be keen
on hearing about it...
--
"cbbrowne","@","cbbrowne.com"
http://cbbrowne.com/info/nonrdbms.html
"Funny, the only thing that makes me go Keanu about Microsoft is the
fact that they are constantly behind the times and yet claim to be
innovating." -- Steve Lamb <morpheus@despair.rpglink.com>

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

!DSPAM:43a07aed148321697067737!

In reply to: Chris Browne (#4)
Re: outdated (bad) information in pg_stat_activity

On 12/14/05, Chris Browne <cbbrowne@acm.org> wrote:

The only answer I have been able to come to is that restarting the
postmaster will clear this all up.

this is actually not an option for me.
of course i can make myself a function which will check if pid exists, but
it would generally be better if there was some way to fix the problem in
postgresql itself.
a series of delete's from some specific place perhaps?

depesz

#7Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Gregory S. Williamson (#5)
Re: outdated (bad) information in pg_stat_activity

On Wed, Dec 14, 2005 at 02:02:02PM -0800, Gregory S. Williamson wrote:

I too have had issues with "stuck" entries in the pg_stat_activity view -- the real pid is long gone but postgres still thinks the process is there.

It would be real sweet to have a way of cleaning this table with bringing postgres off-line.

Rather than trying to clean things up by hand, could the stats system
instead periodically check to make sure that all the PIDs it knows about
actually still exist? I think that should be a pretty cheap check to
perform...

Greg Williamson
DBA
GlobeXplorer LLC

On a side note, is GlobeXplorer using PostgreSQL? Would they be willing
to let us publicize that fact? Better yet, would they be willing to do a
case study?
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#8Gregory S. Williamson
gsw@globexplorer.com
In reply to: Jim Nasby (#7)
Re: outdated (bad) information in pg_stat_activity

Jim Nasby wrote:

I too have had issues with "stuck" entries in the pg_stat_activity view -- > the real pid is long gone but postgres still thinks the process is there.

It would be real sweet to have a way of cleaning this table with bringing > postgres off-line.

Rather than trying to clean things up by hand, could the stats system
instead periodically check to make sure that all the PIDs it knows about
actually still exist? I think that should be a pretty cheap check to
perform...

This would certainly work for me, even as a function that a DBA might call manually.
<...>

On a side note, is GlobeXplorer using PostgreSQL? Would they be willing
to let us publicize that fact? Better yet, would they be willing to do a
case study?

We are indeed using postgres (and postGIS) for both runtime data access and billing and other misc. data processing requirements.

We're be delighted to be publicized as happy users -- the transition from Informix was fairly smooth and performance is solid.

Feel free to contact me off-list if you like for more substantive comments. I am gsw @ globexplorer.com

Greg W.

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: hubert depesz lubaczewski (#6)
Re: outdated (bad) information in pg_stat_activity

hubert depesz lubaczewski <depesz@gmail.com> writes:

of course i can make myself a function which will check if pid exists, but
it would generally be better if there was some way to fix the problem in
postgresql itself.

Try updating to PG 8.1 ...

2005-08-09 17:14 tgl

* src/backend/postmaster/pgstat.c: Make backends that are reading
the pgstats file verify each backend PID against the PGPROC array.
Anything in the file that isn't in PGPROC gets rejected as being a
stale entry. This should solve complaints about stale entries in
pg_stat_activity after a BETERM message has been dropped due to
overload.

regards, tom lane

In reply to: Tom Lane (#9)
Re: outdated (bad) information in pg_stat_activity

On 12/16/05, Tom Lane <tgl@sss.pgh.pa.us> wrote:

2005-08-09 17:14 tgl
* src/backend/postmaster/pgstat.c: Make backends that are reading
the pgstats file verify each backend PID against the PGPROC array.
Anything in the file that isn't in PGPROC gets rejected as being a
stale entry. This should solve complaints about stale entries in
pg_stat_activity after a BETERM message has been dropped due to
overload.

must have missed that one. thanks - we are moving to 8.1 anyway (50-150%
increase in performance for some specific tasks!).

depesz

#11Harry Jackson
harryjackson@gmail.com
In reply to: hubert depesz lubaczewski (#10)
Re: outdated (bad) information in pg_stat_activity

On 12/17/05, hubert depesz lubaczewski <depesz@gmail.com> wrote:

must have missed that one. thanks - we are moving to 8.1 anyway (50-150%
increase in performance for some specific tasks!).

For which tasks in particular are you seeing this king of gain?

Harry

--
http://www.hjackson.org
http://www.uklug.co.uk

In reply to: Harry Jackson (#11)
Re: outdated (bad) information in pg_stat_activity

On 12/17/05, Harry Jackson <harryjackson@gmail.com> wrote:

On 12/17/05, hubert depesz lubaczewski <depesz@gmail.com> wrote:

must have missed that one. thanks - we are moving to 8.1 anyway

(50-150%

increase in performance for some specific tasks!).

For which tasks in particular are you seeing this king of gain?

so called "imports".
we get data from external source, check it's id number, fetch the same
object from database (if it exists), compare, update changes. if it didn't
exist earlier - insert.
such a procedure "touched" directly 7 tables, and about 20 other using
triggers and other means of automation.

depesz