Re: [PATCH] Fix for infinite signal loop in parallel scan - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: [PATCH] Fix for infinite signal loop in parallel scan
Date
Msg-id CAEepm=0t_FTEv1BoG_3MR1bG2fGKdTCj3UE8JT29YXH4S+urtg@mail.gmail.com
In response to Re: [PATCH] Fix for infinite signal loop in parallel scan  (Chris Travers <chris.travers@adjust.com>)
Responses Re: [PATCH] Fix for infinite signal loop in parallel scan  (Oleksii Kliukin <alexk@hintbits.com>)
List pgsql-hackers
On Tue, Sep 18, 2018 at 1:15 AM Chris Travers <chris.travers@adjust.com> wrote:
> On Mon, Sep 17, 2018 at 2:59 PM Oleksii Kliukin <alexk@hintbits.com> wrote:
>> With the patch applied, the posix_fallocate loop terminated right away (because
>> of QueryCancelPending flag set to true) and the backend went through the
>> cleanup, showing an ERROR of cancelling due to the conflict with recovery.
>> Without the patch, it looped indefinitely in the dsm_impl_posix_resize, while
>> the startup process was looping forever, trying to send SIGUSR1.

Thanks for testing!

>> One thing I’m wondering is whether we could do the same by just blocking SIGUSR1
>> for the duration of posix_fallocate?
>
> If we were to do that, I would say we should mask all signals we can mask during the call.
>
> I don't have a problem going down that road instead if people think it is better.

We discussed that when adding posix_fallocate() and decided that
retrying is better:

https://www.postgresql.org/message-id/20170628230458.n5ehizmvhoerr5yq%40alap3.anarazel.de
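For readers following along, here is a rough sketch of the "retry unless an interrupt is pending" idea, not the attached patch itself. The names cancel_pending, allocate_fn, resize_with_retry and demo_allocate are hypothetical; cancel_pending merely stands in for a backend flag such as QueryCancelPending, and the allocator is passed in as a function pointer so the loop can be exercised without a real file descriptor.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Hypothetical stand-in for a pending-interrupt flag like QueryCancelPending. */
volatile sig_atomic_t cancel_pending = 0;

/* Same shape as posix_fallocate(): returns 0 or an errno value directly. */
typedef int (*allocate_fn) (int fd, off_t offset, off_t len);

/*
 * Retry the allocation while it keeps failing with EINTR, but give up as
 * soon as an interrupt is pending so the caller can go and handle it
 * instead of looping against a stream of signals.
 */
int
resize_with_retry(allocate_fn allocate, int fd, off_t size)
{
    int rc;

    do
    {
        rc = allocate(fd, 0, size);
    } while (rc == EINTR && !cancel_pending);

    return rc;
}

/* Demo-only allocator: reports EINTR on the first two calls, then succeeds. */
int demo_calls = 0;

int
demo_allocate(int fd, off_t offset, off_t len)
{
    (void) fd;
    (void) offset;
    (void) len;
    return (++demo_calls <= 2) ? EINTR : 0;
}
```

In real code one would presumably pass posix_fallocate (declared in <fcntl.h>) as the allocator. With cancel_pending clear, the demo allocator is retried until it succeeds; with cancel_pending set, the loop returns EINTR after a single attempt and leaves the cleanup to the caller.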

Here is a patch that I propose to commit and back-patch to 9.4.  I
just wrote a suitable commit message, edited the comments lightly and
fixed some whitespace.

--
Thomas Munro
http://www.enterprisedb.com

