Re: BUG #15036: Un-killable queries Hanging in BgWorkerShutdown - Mailing list pgsql-bugs

From David Kohn
Subject Re: BUG #15036: Un-killable queries Hanging in BgWorkerShutdown
Date
Msg-id CAJhMaBhW9x9ER4btTUXyubGhuF9F8ca162sU5exkaPnL+cCvOQ@mail.gmail.com
In response to Re: BUG #15036: Un-killable queries Hanging in BgWorkerShutdown  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: BUG #15036: Un-killable queries Hanging in BgWorkerShutdown
List pgsql-bugs
Responses interleaved with yours. 
On Mon, Jan 29, 2018 at 9:07 PM Thomas Munro <thomas.munro@enterprisedb.com> wrote:
 
Hi David,

Thanks for the report!  Based on the mention of BtreePage, this sounds
like the following bug:
https://www.postgresql.org/message-id/flat/CAEepm%3D2xZUcOGP9V0O_G0%3D2P2wwXwPrkF%3DupWTCJSisUxMnuSg%40mail.gmail.com
 
The fix for that will be in 10.2 (current target date: February 8th).
The workaround in the meantime would be to disable parallelism, at
least for the queries doing parallel index scans if you can identify
them.
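
For illustration, the workaround could be applied with statements like the ones the sketch below builds. It assumes that setting max_parallel_workers_per_gather to 0 is the intended way to disable parallel query; the helper name and the role name are hypothetical, and the strings would still need to be run through psql or a database driver.

```python
def disable_parallel_sql(role=None):
    """Build a statement that turns off parallel query (hypothetical helper).

    With no role, the statement affects only the current session.
    With a role, it changes the default for that role's future sessions.
    """
    setting = "max_parallel_workers_per_gather = 0"
    if role is None:
        return "SET " + setting + ";"
    # Double any embedded double quotes so the role name is a valid identifier.
    quoted_role = '"' + role.replace('"', '""') + '"'
    return "ALTER ROLE " + quoted_role + " SET " + setting + ";"


if __name__ == "__main__":
    print(disable_parallel_sql())
    print(disable_parallel_sql("report_user"))  # "report_user" is a made-up role
```

Scoping the setting to a session or a role keeps parallelism available for queries that are not affected.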
That sounds great. I hope that patch will fix it, though I'm not quite sure it will. Some of the hung queries have workers in the BtreePage state, but at least as many have only workers in the MessageQueuePutMessage state. Would you expect the patch to fix those as well, or could that be something different?
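
A rough sketch of how the split between the two wait states could be counted, assuming the pg_stat_activity view and its wait_event column as available in PostgreSQL 10. The helper name is hypothetical, and the query text would need to be run through psql or a driver.

```python
def hung_worker_query(wait_events=("BtreePage", "MessageQueuePutMessage")):
    """Build a query counting backends per wait event (hypothetical helper)."""
    # Quote each name as an SQL string literal, doubling embedded single quotes.
    quoted = ", ".join("'" + e.replace("'", "''") + "'" for e in wait_events)
    return (
        "SELECT wait_event, count(*) AS backends "
        "FROM pg_stat_activity "
        "WHERE wait_event IN (" + quoted + ") "
        "GROUP BY wait_event ORDER BY wait_event;"
    )


if __name__ == "__main__":
    print(hung_worker_query())
```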

However, I'm not entirely sure why you're not able to cancel these
backends politely with pg_cancel_backend().  For example, the
BtreePage waiter should be in ConditionVariableSleep() and should be
interrupted by such a signal and error out in CHECK_FOR_INTERRUPTS().
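
The cancellation path described above can be sketched in miniature. This is a simplified Python analogue, not PostgreSQL code: the class and method names are invented, a threading.Event stands in for the process latch, and a timer stands in for the signal sent by pg_cancel_backend().

```python
import threading


class QueryCanceled(Exception):
    """Stands in for the error a cancelled backend reports."""


class Backend:
    def __init__(self):
        self.cancel_pending = False     # analogue of a "cancel requested" flag
        self.latch = threading.Event()  # analogue of the process latch

    def handle_cancel_signal(self):
        # Signal-handler analogue: only record the request and wake the sleeper.
        self.cancel_pending = True
        self.latch.set()

    def check_for_interrupts(self):
        # Safe point: act on a pending request by raising an error.
        if self.cancel_pending:
            self.cancel_pending = False
            raise QueryCanceled("canceling statement due to user request")

    def wait_for_condition(self, condition_met):
        # Sleep-loop analogue: wake on the latch, check interrupts, re-test.
        while not condition_met():
            self.latch.wait(timeout=1.0)
            self.latch.clear()
            self.check_for_interrupts()


def demo():
    backend = Backend()
    # Deliver the "signal" shortly after the backend starts waiting.
    threading.Timer(0.05, backend.handle_cancel_signal).start()
    try:
        backend.wait_for_condition(lambda: False)  # condition never becomes true
    except QueryCanceled as exc:
        return str(exc)
    return "not cancelled"


if __name__ == "__main__":
    print(demo())
```

In this model a waiter stays stuck only if the wake-up never arrives or the loop never reaches the interrupt check, which is why un-cancellable backends are surprising.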
So far, I have found nothing other than kill -9 that can kill any of them. I have a feeling it has something to do with https://jobs.zalando.com/tech/blog/hack-to-terminate-tcp-conn-postgres/?gh_src=4n3gxh1 but I'm not 100% sure, as I didn't set the TCP settings low enough to make catching a packet all that reasonable. I'm happy to investigate further; I just don't quite know what that should entail. If there is anything you think would be helpful, please do let me know.

Thanks for the help! 
D
 
