Re: [PATCH] Fix for infinite signal loop in parallel scan - Mailing list pgsql-hackers

From Chris Travers
Subject Re: [PATCH] Fix for infinite signal loop in parallel scan
Date
Msg-id CAN-RpxBu8f5USrfCP0hvgJKmH+yLdB8PkrGTh213UpdOpwrkPw@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Fix for infinite signal loop in parallel scan  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
First, I have attached a cleaned-up revision (pg_indent, removing a dangling comment etc)

On Fri, Sep 14, 2018 at 1:16 PM Thomas Munro <thomas.munro@enterprisedb.com> wrote:
On Sat, Sep 8, 2018 at 3:57 AM Chris Travers <chris.travers@adjust.com> wrote:
> Attached is the patch we are fully testing at Adjust.

Thanks!

> I have run make check on Linux and MacOS, and make check-world on Linux (check-world fails on MacOS on all versions and all branches due to ecpg failures).

FWIW it's entirely possible to get make check-world passing on a Mac.
Maybe post the problem you're seeing to a new thread?

Will do. 

> ...

> In the past it had been suggested we do PG_TRY(); and PG_CATCH(), but given that it is not consistent whether we can raise an error or whether we MUST raise an error, I don't see how this approach can work.  As far as I can see, we MUST raise an error in the appropriate spot if and only if elevel is set to a sufficient level.

Yeah, your way looks a bit nicer than something involving PG_TRY().

> Is there any feedback on this approach before I add it to the next commitfest?

Please go ahead and add it.  Being a bug fix, we'll commit it sooner
than the open commitfest anyway, but it's useful to have it in there.

+ if (errno == EINTR && elevel >= ERROR)
+ CHECK_FOR_INTERRUPTS();

I think we might as well just CHECK_FOR_INTERRUPTS() unconditionally.
In this branch elevel is always ERROR as you noted, and the code
around there is confusing enough already.

The reason I didn't do that is future safety and for the off chance that someone could do something strange with extensions today that might use this in an unsafe way.    While I cannot think of any use case for calling this in an unsafe way, I know for a fact that one might write extensions, background workers, etc. to do things with this API that are not in our current code right now.  For something that could be back ported I really want to work as much as possible in a way that does not possibly brake even exotic extensions.

Longer-run I would like to see if I can help refactor this code so that responsibility for error handling is clearer and such problems cannot exist.  But that's not something to back port.

+ } while (rc == EINTR && !(ProcDiePending || QueryCancelPending));

There seems to be a precedent for checking QueryCancelPending directly
to break out of a loop in regcomp.c and syncrep.c.  So this seems OK.

Yeah, I checked.
 
Hmm, I wonder if there should be an INTERRUPT_PENDING() macro that
hides those details, but allows you to break out of a loop and then do
some cleanup before CHECK_FOR_INTERRUPT(). 

That is a good idea. 

--
Thomas Munro
http://www.enterprisedb.com


--
Best Regards,
Chris Travers
Head of Database

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com 
Saarbrücker Straße 37a, 10405 Berlin

Attachment

pgsql-hackers by date:

Previous
From: Oleksii Kliukin
Date:
Subject: Re: [PATCH] Fix for infinite signal loop in parallel scan
Next
From: Michael Paquier
Date:
Subject: SSL tests failing with "ee key too small" error on Debian SID