On Mon, Dec 11, 2017 at 10:22 PM, <tlw@monsido.com> wrote:
> The following bug has been logged on the website:
>
> Bug reference: 14960
> Logged by: Tim Warberg
> Email address: tlw@monsido.com
> PostgreSQL version: 10.1
> Operating system: Ubuntu 16.04 LTS
> Description:
>
> Hi,
>
> Ran into an issue on our PostgreSQL 10.1 cluster that seems to be a parallel
> index scan bug. Our queue system scheduled 30 almost identical concurrent
> queries, some of which were executed as parallel queries, and all of them
> became stuck on the IPC BtreePage wait event.

Hi Tim,

Thanks for the report. This seems to be the same as the bug that we
just analysed over here:

https://www.postgresql.org/message-id/flat/CAEepm%3D2xZUcOGP9V0O_G0%3D2P2wwXwPrkF%3DupWTCJSisUxMnuSg%40mail.gmail.com

> We discovered this after they had been
> stuck like that for about 10 hours [1]. At the same time an autovacuum was
> processing one of the queried tables and it had become stuck on LWLock
> buffer_content while vacuuming indexes. Neither the query processes nor
> the autovacuum responded to pg_cancel_backend or pg_terminate_backend.

Hmm. This may be because we hold a BT_READ lock while waiting in
_bt_parallel_seize(). Here that wait is prolonged by the above-mentioned
bug, so the read lock stays held and prevents others from acquiring an
exclusive lock.
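
For illustration only, here is a toy Python sketch of that pattern. It is
not PostgreSQL code: SharedExclusiveLock, scan_advanced and demo are
made-up names, and the real locking in nbtree is more involved. A
"scanner" thread takes a shared lock and then waits for a signal while
still holding it; an attempt to take the exclusive lock times out until
the scanner is signalled and releases its lock.

```python
import threading


class SharedExclusiveLock:
    """Toy shared/exclusive lock (hypothetical stand-in for a page lock)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_shared(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_shared(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def acquire_exclusive(self, timeout):
        # Returns True if the lock was obtained within `timeout` seconds.
        with self._cond:
            free = self._cond.wait_for(
                lambda: not self._writer and self._readers == 0, timeout
            )
            if free:
                self._writer = True
            return free

    def release_exclusive(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


def demo():
    page_lock = SharedExclusiveLock()
    scan_advanced = threading.Event()  # assumed signal from another participant
    holding = threading.Event()        # set once the scanner holds the lock

    def scanner():
        page_lock.acquire_shared()
        holding.set()
        scan_advanced.wait()           # waiting with the shared lock still held
        page_lock.release_shared()

    t = threading.Thread(target=scanner)
    t.start()
    holding.wait()

    # While the scanner waits, nobody can get the exclusive lock.
    blocked = not page_lock.acquire_exclusive(timeout=0.2)

    # Once the awaited signal arrives, the scanner lets go and we succeed.
    scan_advanced.set()
    t.join()
    acquired_after = page_lock.acquire_exclusive(timeout=1.0)
    if acquired_after:
        page_lock.release_exclusive()
    return blocked, acquired_after
```

demo() returns (True, True): the exclusive request is blocked for as long
as the scanner waits with its shared lock held, and succeeds once the
scanner is woken. If the signal never arrived, the exclusive waiter would
wait just as long as the scanner does.
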
--
Thomas Munro
http://www.enterprisedb.com