Re: EINTR in ftruncate() - Mailing list pgsql-hackers

From Andres Freund
Subject Re: EINTR in ftruncate()
Date
Msg-id 20220701202944.xkuroe3kdc5izykd@alap3.anarazel.de
Whole thread Raw
In response to Re: EINTR in ftruncate()  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: EINTR in ftruncate()
Re: EINTR in ftruncate()
List pgsql-hackers
Hi,

On 2022-07-01 19:55:16 +0200, Alvaro Herrera wrote:
> On 2022-Jul-01, Andres Freund wrote:
> 
> > On 2022-07-01 17:41:05 +0200, Alvaro Herrera wrote:
> > > Nicola Contu reported two years ago to pgsql-general[1] that they were
> > > having sporadic query failures, because EINTR is reported on some system
> > > call.  I have been told that the problem persists, though it is very
> > > infrequent.  I propose the attached patch.  Kyotaro proposed a slightly
> > > different patch which also protects write(), but I think that's not
> > > necessary.
> > 
> > What is the reason for the || ProcDiePending || QueryCancelPending bit? What
> > if there's dsm operations intentionally done while QueryCancelPending?
> 
> That mirrors the test for the other block in that function, which was
> added by 63efab4ca139, whose commit message explains:
> 
>     Allow DSM allocation to be interrupted.
>     
>     Chris Travers reported that the startup process can repeatedly try to
>     cancel a backend that is in a posix_fallocate()/EINTR loop and cause it
>     to loop forever.  Teach the retry loop to give up if an interrupt is
>     pending.  Don't actually check for interrupts in that loop though,
>     because a non-local exit would skip some clean-up code in the caller.

That whole approach seems quite wrong to me. At the absolute very least the
code needs to check if interrupts are being processed in the current context
before just giving up due to ProcDiePending || QueryCancelPending.

I'm very unconvinced this ought to be fixed in dsm_impl_posix_resize(), rather
than the startup process signalling.

There is an argument for allowing more things to be cancelled, but we'd need a
retry loop for the !INTERRUPTS_CAN_BE_PROCESSED() case.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: "Finnerty, Jim"
Date:
Subject: Re: Making Vars outer-join aware
Next
From: Tom Lane
Date:
Subject: Re: Making Vars outer-join aware