BUG #15290: Stuck Parallel Index Scan query - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #15290: Stuck Parallel Index Scan query
Msg-id 153228422922.1395.1746424054206154747@wrigleys.postgresql.org
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      15290
Logged by:          Victor Yegorov
Email address:      vyegorov@gmail.com
PostgreSQL version: 10.4
Operating system:   Debian GNU/Linux 8.7 (jessie)
Description:

We've just encountered an issue on a client's streaming replica.

In short: a query has been active for 8 hours and we are unable to terminate it;
all processes (leader and workers) are ignoring all signals.
Symptoms are similar to the ones described in #15036. As a result, we had to
perform an immediate shutdown of the instance.
After the restart we got another such stuck query within minutes, so it is not
a single occurrence.


Query in question
-----------------
-[ RECORD 1 ]-----------------------------------------------------------------------------------------------------
ts_age      | 08:32:17.286343
state       | active
query_age   | 08:32:17.286345
change_age  | 08:32:17.286344
datname     | coub
pid         | 2877
usename     | app
waiting     | f
client_addr |
client_port | -1
query       | select count(*) as value from coubs where type='Coub::Simple' and is_done=false and in_process=false

                                                             QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=666001.67..666001.72 rows=1 width=8)
   ->  Gather  (cost=666001.20..666001.65 rows=8 width=8)
         Workers Planned: 8
         ->  Partial Aggregate  (cost=665901.20..665901.25 rows=1 width=8)
               ->  Parallel Index Scan using coubs_type_is_done_partial_simple on coubs  (cost=0.56..663016.01 rows=1154077 width=0)
                     Index Cond: (is_done = false)
                     Filter: ((NOT is_done) AND (NOT in_process))
(7 rows)

In reality, this query ran with just 4 workers:
 2877 ?        ts     0:08  \_ postgres: 10/main: app coub [local] SELECT
 3416 ?        Ss     0:00  \_ postgres: 10/main: bgworker: parallel worker for PID 2877
 3417 ?        Ss     0:00  \_ postgres: 10/main: bgworker: parallel worker for PID 2877
 3418 ?        Ss     0:00  \_ postgres: 10/main: bgworker: parallel worker for PID 2877
 3419 ?        Ss     0:00  \_ postgres: 10/main: bgworker: parallel worker for PID 2877

We have backtraces of the main process and all workers, as well as a
`pg_stat_activity` snapshot; I will attach them in a follow-up e-mail.

