Thread: Cannot terminate backend

Cannot terminate backend

From
Arnaud Lesauvage
Date:

Hi all,

 

I have a long running query that I seem unable to either cancel or terminate.

What could be the reason for this, and what is the bet way to terminate this kind of query ?

 

The query is a CTE using postgis ST_ClusterDBSCAN function. The CTE returns approximately 150k rows.
The SQL is as follows :


EXPLAIN ANALYZE
WITH subq AS (
        SELECT id, geom, ST_ClusterDBSCAN(geom, eps := 1000, minpoints := 1) OVER() AS cluster_id

        FROM mytable

)

SELECT cluster_id, count(id), ST_Collect(geom) 
FROM subq
GROUP BY cluster_id;
 

pg_stat_activity show no wait event.

pg_cancel_backend(mypid) returns true but the state does not change in pg_stat_activity.

pg_terminate_backend(mypid) yields the same result (as superuser)

Pg_stat_activity show no wait_event.

 

SELECT version();
PostgreSQL 14.5 (Ubuntu 14.5-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit

 

 

Any idea ? Thanks a lot !

 

Regards

Arnaud

Re: Cannot terminate backend

From
Laurenz Albe
Date:
On Fri, 2023-03-31 at 13:46 +0000, Arnaud Lesauvage wrote:
> I have a long running query that I seem unable to either cancel or terminate.
> What could be the reason for this, and what is the bet way to terminate this kind of query ?
>  
> The query is a CTE using postgis ST_ClusterDBSCAN function. The CTE returns approximately 150k rows.
> The SQL is as follows :
>
> EXPLAIN ANALYZE
> WITH subq AS (
>         SELECT id, geom, ST_ClusterDBSCAN(geom, eps := 1000, minpoints := 1) OVER() AS cluster_id
>         FROM mytable
> )
> SELECT cluster_id, count(id), ST_Collect(geom)
> FROM subq
> GROUP BY cluster_id;
>  
> pg_stat_activity show no wait event.
> pg_cancel_backend(mypid) returns true but the state does not change in pg_stat_activity.
> pg_terminate_backend(mypid) yields the same result (as superuser)
> Pg_stat_activity show no wait_event.
>  
> SELECT version();
> PostgreSQL 14.5 (Ubuntu 14.5-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1)
9.4.0,64-bit 

The most likely explanation is that one of the PostGIS functions runs for
a long time without checking CHECK_FOR_INTERRUPTS().
That would be a PostGIS bug.  Try to construct a reproducible test case
that you can share!

Perhaps this trick can help:
https://www.cybertec-postgresql.com/en/cancel-hanging-postgresql-query/

Yours,
Laurenz Albe



RE: Cannot terminate backend

From
Arnaud Lesauvage
Date:
> On Fri, 2023-03-31 at 13:46 +0000, Arnaud Lesauvage wrote:
> > I have a long running query that I seem unable to either cancel or
> terminate.
> > What could be the reason for this, and what is the bet way to terminate
> this kind of query ?
> >
> > The query is a CTE using postgis ST_ClusterDBSCAN function. The CTE
> returns approximately 150k rows.
> > The SQL is as follows :
> >
> > EXPLAIN ANALYZE
> > WITH subq AS (
> >         SELECT id, geom, ST_ClusterDBSCAN(geom, eps := 1000, minpoints
> > := 1) OVER() AS cluster_id
> >         FROM mytable
> > )
> > SELECT cluster_id, count(id), ST_Collect(geom) FROM subq GROUP BY
> > cluster_id;
> >
> > pg_stat_activity show no wait event.
> > pg_cancel_backend(mypid) returns true but the state does not change in
> pg_stat_activity.
> > pg_terminate_backend(mypid) yields the same result (as superuser)
> > Pg_stat_activity show no wait_event.
> >
> > SELECT version();
> > PostgreSQL 14.5 (Ubuntu 14.5-1.pgdg20.04+1) on x86_64-pc-linux-gnu,
> > compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit
> 
> The most likely explanation is that one of the PostGIS functions runs for a
> long time without checking CHECK_FOR_INTERRUPTS().
> That would be a PostGIS bug.  Try to construct a reproducible test case that
> you can share!
> 
> Perhaps this trick can help:
> https://www.cybertec-postgresql.com/en/cancel-hanging-postgresql-
> query/


Thanks Laurenz
Unfortunately, I don't have a shell access to the server, so I guess I'll have to ask to sysadmin to kill -9 ?

Regards
Arnaud


Re: Cannot terminate backend

From
Paul Ramsey
Date:
Afraid so. You can wait longer, I guess. You may have found two bugs… the lack of an interrupt in the dbscan loop, which I’m working on now. And maybe an infinite looping case? In which case if you want a fix on that, you’ll have to share your data and query. 

P.

On Mar 31, 2023, at 7:41 AM, Arnaud Lesauvage <arnaud.listes@codata.eu> wrote:

On Fri, 2023-03-31 at 13:46 +0000, Arnaud Lesauvage wrote:
I have a long running query that I seem unable to either cancel or
terminate.
What could be the reason for this, and what is the bet way to terminate
this kind of query ?

The query is a CTE using postgis ST_ClusterDBSCAN function. The CTE
returns approximately 150k rows.
The SQL is as follows :

EXPLAIN ANALYZE
WITH subq AS (
        SELECT id, geom, ST_ClusterDBSCAN(geom, eps := 1000, minpoints
:= 1) OVER() AS cluster_id
        FROM mytable
)
SELECT cluster_id, count(id), ST_Collect(geom) FROM subq GROUP BY
cluster_id;

pg_stat_activity show no wait event.
pg_cancel_backend(mypid) returns true but the state does not change in
pg_stat_activity.
pg_terminate_backend(mypid) yields the same result (as superuser)
Pg_stat_activity show no wait_event.

SELECT version();
PostgreSQL 14.5 (Ubuntu 14.5-1.pgdg20.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit

The most likely explanation is that one of the PostGIS functions runs for a
long time without checking CHECK_FOR_INTERRUPTS().
That would be a PostGIS bug.  Try to construct a reproducible test case that
you can share!

Perhaps this trick can help:
https://www.cybertec-postgresql.com/en/cancel-hanging-postgresql-
query/


Thanks Laurenz
Unfortunately, I don't have a shell access to the server, so I guess I'll have to ask to sysadmin to kill -9 ?

Regards
Arnaud

RE: Cannot terminate backend

From
Arnaud Lesauvage
Date:
> From: Paul Ramsey <pramsey@cleverelephant.ca> 
> Afraid so. You can wait longer, I guess. You may have found two bugs… the lack of an interrupt in the dbscan loop,
whichI’m working on now. And maybe an infinite looping case? In which case if you want a fix on that, you’ll have to
shareyour data and query. 
 

I let the query run over the week-end and they are now gone, so I guess your dbscan fix is what I was missing. Thanks
!

Regards
Arnaud