Re: Bug in walsender when calling out to do_pg_stop_backup (and others?) - Mailing list pgsql-hackers

From Florian Pflug
Subject Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)
Date
Msg-id 956A880E-A75E-42DE-9C2E-21FB542E04EE@phlo.org
In response to Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)  (Magnus Hagander <magnus@hagander.net>)
Responses Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)
List pgsql-hackers
On Oct11, 2011, at 09:21 , Magnus Hagander wrote:
> On Tue, Oct 11, 2011 at 03:29, Florian Pflug <fgp@phlo.org> wrote:
>> On Oct10, 2011, at 21:25 , Magnus Hagander wrote:
>>> On Thu, Oct 6, 2011 at 23:46, Florian Pflug <fgp@phlo.org> wrote:
>>>> It'd be nice to generally terminate a backend if the client vanishes, but so
>>>> far I haven't had any bright ideas. Using FASYNC and F_SETOWN unfortunately
>>>> sends a signal *every time* the fd becomes readable or writable, not only on
>>>> EOF. Doing select() in CHECK_FOR_INTERRUPTS seems far too expensive. We could
>>>> make the postmaster keep the fds around even after forking a backend, and
>>>> make it watch for broken connections using select(). But with a large max_backends
>>>> setting, we'd risk running out of fds in the postmaster...
>>>
>>> Ugh. Yeah. But at least catching it and terminating it when we *do*
>>> notice it's down would certainly make sense...
>>
>> I'll try to put together a patch that sets a flag if we discover a broken
>> connection in pq_flush, and tests that flag in CHECK_FOR_INTERRUPTS. Unless you
>> wanna, of course.
>
> Please do, I won't have time to even think about it until after
> pgconf.eu anyway ;)

Ok, here's a first cut.

I've based this on how query cancellation due to recovery conflicts works:
when internal_flush() discovers a broken connection, it sets both
QueryCancelPending and ClientConnectionLostPending.

If QueryCancelPending is set, CHECK_FOR_INTERRUPTS checks
ClientConnectionLostPending, and if that is set too it does ereport(FATAL).
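
To make the control flow concrete, here is a rough standalone sketch. It is
not the patch itself: the sketch_* functions are hypothetical stand-ins for
internal_flush() and CHECK_FOR_INTERRUPTS, the flags are simplified local
re-declarations, and the ereport(FATAL) is modelled by a message plus a
return value so that the snippet compiles outside the backend.

```c
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Simplified local stand-ins for the backend's interrupt flags.
 * These are assumptions for illustration, not the real definitions.
 */
static volatile sig_atomic_t QueryCancelPending = 0;
static volatile sig_atomic_t ClientConnectionLostPending = 0;

/*
 * Hypothetical stand-in for internal_flush().  send_ok models whether
 * writing to the client socket succeeded.  On failure we only record
 * the fact in the two flags and report EOF to the caller.
 */
int
sketch_internal_flush(bool send_ok)
{
    if (!send_ok)
    {
        ClientConnectionLostPending = 1;
        QueryCancelPending = 1;
        return -1;              /* EOF, as seen by the caller */
    }
    return 0;
}

/*
 * Hypothetical stand-in for CHECK_FOR_INTERRUPTS.  Returns true at the
 * point where the real code would do ereport(FATAL) and terminate the
 * backend.
 */
bool
sketch_check_for_interrupts(void)
{
    if (QueryCancelPending)
    {
        if (ClientConnectionLostPending)
        {
            fprintf(stderr, "FATAL: connection to client lost\n");
            return true;
        }
        /* otherwise: ordinary query-cancel handling would go here */
    }
    return false;
}
```

The point of splitting it this way is that the flush path never raises the
error itself; it only sets flags, and the backend exits at the next place
where it is safe to process interrupts.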

I've only done light testing so far - basically the only case I've tested is
killing pg_basebackup while it's waiting for all required WAL to be archived.

best regards,
Florian Pflug


Attachment
