Cannot terminate backend

Started by Arnaud L.about 3 years ago5 messagesgeneral
Jump to latest
#1Arnaud L.
arnaud.listes@codata.eu

Hi all,

I have a long running query that I seem unable to either cancel or terminate.
What could be the reason for this, and what is the bet way to terminate this kind of query ?

The query is a CTE using postgis ST_ClusterDBSCAN function. The CTE returns approximately 150k rows.
The SQL is as follows :

EXPLAIN ANALYZE

WITH subq AS (

SELECT id, geom, ST_ClusterDBSCAN(geom, eps := 1000, minpoints := 1) OVER() AS cluster_id
FROM mytable
)

SELECT cluster_id, count(id), ST_Collect(geom)

FROM subq

GROUP BY cluster_id;

pg_stat_activity show no wait event.
pg_cancel_backend(mypid) returns true but the state does not change in pg_stat_activity.
pg_terminate_backend(mypid) yields the same result (as superuser)
Pg_stat_activity show no wait_event.

SELECT version();

PostgreSQL 14.5 (Ubuntu 14.5-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit

Any idea ? Thanks a lot !

Regards
Arnaud

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Arnaud L. (#1)
Re: Cannot terminate backend

On Fri, 2023-03-31 at 13:46 +0000, Arnaud Lesauvage wrote:

I have a long running query that I seem unable to either cancel or terminate.
What could be the reason for this, and what is the bet way to terminate this kind of query ?
 
The query is a CTE using postgis ST_ClusterDBSCAN function. The CTE returns approximately 150k rows.
The SQL is as follows :

EXPLAIN ANALYZE
WITH subq AS (
        SELECT id, geom, ST_ClusterDBSCAN(geom, eps := 1000, minpoints := 1) OVER() AS cluster_id
        FROM mytable
)
SELECT cluster_id, count(id), ST_Collect(geom)
FROM subq
GROUP BY cluster_id;
 
pg_stat_activity show no wait event.
pg_cancel_backend(mypid) returns true but the state does not change in pg_stat_activity.
pg_terminate_backend(mypid) yields the same result (as superuser)
Pg_stat_activity show no wait_event.
 
SELECT version();
PostgreSQL 14.5 (Ubuntu 14.5-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit

The most likely explanation is that one of the PostGIS functions runs for
a long time without checking CHECK_FOR_INTERRUPTS().
That would be a PostGIS bug. Try to construct a reproducible test case
that you can share!

Perhaps this trick can help:
https://www.cybertec-postgresql.com/en/cancel-hanging-postgresql-query/

Yours,
Laurenz Albe

#3Arnaud L.
arnaud.listes@codata.eu
In reply to: Laurenz Albe (#2)
RE: Cannot terminate backend

On Fri, 2023-03-31 at 13:46 +0000, Arnaud Lesauvage wrote:

I have a long running query that I seem unable to either cancel or

terminate.

What could be the reason for this, and what is the bet way to terminate

this kind of query ?

The query is a CTE using postgis ST_ClusterDBSCAN function. The CTE

returns approximately 150k rows.

The SQL is as follows :

EXPLAIN ANALYZE
WITH subq AS (
        SELECT id, geom, ST_ClusterDBSCAN(geom, eps := 1000, minpoints
:= 1) OVER() AS cluster_id
        FROM mytable
)
SELECT cluster_id, count(id), ST_Collect(geom) FROM subq GROUP BY
cluster_id;

pg_stat_activity show no wait event.
pg_cancel_backend(mypid) returns true but the state does not change in

pg_stat_activity.

pg_terminate_backend(mypid) yields the same result (as superuser)
Pg_stat_activity show no wait_event.

SELECT version();
PostgreSQL 14.5 (Ubuntu 14.5-1.pgdg20.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit

The most likely explanation is that one of the PostGIS functions runs for a
long time without checking CHECK_FOR_INTERRUPTS().
That would be a PostGIS bug. Try to construct a reproducible test case that
you can share!

Perhaps this trick can help:
https://www.cybertec-postgresql.com/en/cancel-hanging-postgresql-
query/

Thanks Laurenz
Unfortunately, I don't have a shell access to the server, so I guess I'll have to ask to sysadmin to kill -9 ?

Regards
Arnaud

#4Paul Ramsey
pramsey@cleverelephant.ca
In reply to: Arnaud L. (#3)
Re: Cannot terminate backend

Afraid so. You can wait longer, I guess. You may have found two bugs… the lack of an interrupt in the dbscan loop, which I’m working on now. And maybe an infinite looping case? In which case if you want a fix on that, you’ll have to share your data and query.

P.

Show quoted text

On Mar 31, 2023, at 7:41 AM, Arnaud Lesauvage <arnaud.listes@codata.eu> wrote:

On Fri, 2023-03-31 at 13:46 +0000, Arnaud Lesauvage wrote:

I have a long running query that I seem unable to either cancel or

terminate.

What could be the reason for this, and what is the bet way to terminate

this kind of query ?

The query is a CTE using postgis ST_ClusterDBSCAN function. The CTE

returns approximately 150k rows.

The SQL is as follows :

EXPLAIN ANALYZE
WITH subq AS (
SELECT id, geom, ST_ClusterDBSCAN(geom, eps := 1000, minpoints
:= 1) OVER() AS cluster_id
FROM mytable
)
SELECT cluster_id, count(id), ST_Collect(geom) FROM subq GROUP BY
cluster_id;

pg_stat_activity show no wait event.
pg_cancel_backend(mypid) returns true but the state does not change in

pg_stat_activity.

pg_terminate_backend(mypid) yields the same result (as superuser)
Pg_stat_activity show no wait_event.

SELECT version();
PostgreSQL 14.5 (Ubuntu 14.5-1.pgdg20.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit

The most likely explanation is that one of the PostGIS functions runs for a
long time without checking CHECK_FOR_INTERRUPTS().
That would be a PostGIS bug. Try to construct a reproducible test case that
you can share!

Perhaps this trick can help:
https://www.cybertec-postgresql.com/en/cancel-hanging-postgresql-
query/

Thanks Laurenz
Unfortunately, I don't have a shell access to the server, so I guess I'll have to ask to sysadmin to kill -9 ?

Regards
Arnaud

#5Arnaud L.
arnaud.listes@codata.eu
In reply to: Paul Ramsey (#4)
RE: Cannot terminate backend

From: Paul Ramsey <pramsey@cleverelephant.ca>
Afraid so. You can wait longer, I guess. You may have found two bugs… the lack of an interrupt in the dbscan loop, which I’m working on now. And maybe an infinite looping case? In which case if you want a fix on that, you’ll have to share your data and query. 

I let the query run over the week-end and they are now gone, so I guess your dbscan fix is what I was missing. Thanks !

Regards
Arnaud