Backend process that won't die

Started by Susan Cassidy over 14 years ago, 3 messages, general
#1 Susan Cassidy
scassidy@edgewave.com

I have a couple of backend processes that are "stuck", and do not respond to a pg_cancel_backend. This is PostgreSQL 8.3.5. The pg_cancel_backend returns true, but the process keeps running. I have also done a "kill 12345" from the command-line, with no effect.

The processes are running a "select function_x" statement that normally takes a fraction of a second to run.

No locks are shown when I do:
select relname,pg_locks.* from pg_class,pg_locks where relfilenode=relation and not granted;

We had a database crash last week, and had to reindex a bunch of tables, but this function has been working for several days on the same tables that should be being used by the function_x function.

Any ideas on how to get the processes to go away?

They are eating CPU cycles for no good reason:
postgres 28396 85.0 1.4 4420768 242224 ? Ss Sep03 3193:40 postgres: userxx dbname1 172.27.43.9(1160) SELECT

Thanks,
Susan

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Susan Cassidy (#1)
Re: Backend process that won't die

Susan Cassidy <scassidy@edgewave.com> writes:

> I have a couple of backend processes that are "stuck", and do not respond to a pg_cancel_backend. This is PostgreSQL 8.3.5. The pg_cancel_backend returns true, but the process keeps running. I have also done a "kill 12345" from the command-line, with no effect.

> We had a database crash last week, and had to reindex a bunch of tables, but this function has been working for several days on the same tables that should be being used by the function_x function.

By "this function" you mean that the reindex is not finished, but
nonetheless you have got regular queries running with the corrupted
indexes?

> Any ideas on how to get the processes to go away?

It seems like a good bet that they're chasing circular links in the
corrupted indexes. "kill -9" would get rid of them, but it would force
a database-wide restart, which would also take out your reindex process,
so maybe that wouldn't be a good idea.

If they're significantly interfering with the progress of the reindex
then maybe you should bite the bullet and kill them anyway. Otherwise
I'd be inclined to let them go until you can afford a restart.

regards, tom lane
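
The reply above suggests the backends are following circular links in a corrupted index. One plausible reason neither pg_cancel_backend (which sends SIGINT) nor a plain "kill" (SIGTERM) has any effect is that a backend's handlers for those signals only record the request; the running code acts on it when it next reaches an interrupt check. A loop that never reaches such a check never notices. The following is a toy Python model of that behaviour, not PostgreSQL code: ToyBackend, follow_right_links, the page numbers, and the max_steps cap are all hypothetical names introduced for illustration.

```python
class ToyBackend:
    """Toy stand-in for a backend process; not PostgreSQL code."""

    def __init__(self):
        self.cancel_pending = False

    def handle_cancel_signal(self):
        # The handler only records the request; it does not stop the work.
        self.cancel_pending = True

    def check_for_interrupts(self):
        # The request is honored only when the running code reaches a check.
        if self.cancel_pending:
            raise InterruptedError("canceling statement due to user request")


def follow_right_links(backend, right_link, start, check_interrupts,
                       max_steps=10_000):
    """Walk page -> right_link[page] until a page has no successor.

    max_steps exists only so this demo terminates; the stuck loop it
    models has no such cap. Returns "finished" or "still looping".
    """
    page = start
    for _ in range(max_steps):
        if page is None:
            break
        if check_interrupts:
            backend.check_for_interrupts()
        page = right_link.get(page)
    return "finished" if page is None else "still looping"


# Hypothetical page links: the healthy chain ends, the corrupted one cycles.
healthy = {1: 2, 2: 3, 3: None}
corrupted = {1: 2, 2: 3, 3: 1}

if __name__ == "__main__":
    backend = ToyBackend()
    backend.handle_cancel_signal()  # the cancel request "returns true"

    # No interrupt check inside the loop: the pending request is never seen.
    print(follow_right_links(backend, corrupted, 1, check_interrupts=False))

    # With a check inside the loop, the same request stops the walk.
    try:
        follow_right_links(backend, corrupted, 1, check_interrupts=True)
    except InterruptedError as exc:
        print("cancelled:", exc)
```

In this model the first call reports "still looping" even though the cancel flag is already set, which mirrors a cancel request that succeeds in being delivered but is never acted on. SIGKILL ("kill -9") is different because a process cannot catch or defer it, which is also why the server treats it as a crash and restarts every backend.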

#3 Susan Cassidy
scassidy@edgewave.com
In reply to: Tom Lane (#2)
Re: Backend process that won't die

-----Original Message-----

From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, September 06, 2011 9:57 AM
To: Susan Cassidy
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Backend process that won't die

> Susan Cassidy <scassidy@edgewave.com> writes:
>
>> I have a couple of backend processes that are "stuck", and do not respond to a pg_cancel_backend. This is PostgreSQL 8.3.5. The pg_cancel_backend returns true, but the process keeps running. I have also done a "kill 12345" from the command-line, with no effect.
>
>> We had a database crash last week, and had to reindex a bunch of tables, but this function has been working for several days on the same tables that should be being used by the function_x function.
>
> By "this function" you mean that the reindex is not finished, but
> nonetheless you have got regular queries running with the corrupted
> indexes?

No, the reindexes that I knew were needed have already been done.

>> Any ideas on how to get the processes to go away?
>
> It seems like a good bet that they're chasing circular links in the
> corrupted indexes. "kill -9" would get rid of them, but it would force
> a database-wide restart, which would also take out your reindex process,
> so maybe that wouldn't be a good idea.
>
> If they're significantly interfering with the progress of the reindex
> then maybe you should bite the bullet and kill them anyway. Otherwise
> I'd be inclined to let them go until you can afford a restart.
>
> regards, tom lane

Without any error messages about indexes, which I have not seen lately, I have no idea which indexes still might need rebuilding.

So, you think I should go ahead and kill -9 the "stuck" processes, and let the database restart? It is a 2-system cluster, with failover, so I'll let the IT guy handle that, I guess.

Thanks,
Susan