Re: global / super barriers (for checksums) - Mailing list pgsql-hackers

From Daniel Gustafsson
Subject Re: global / super barriers (for checksums)
Date
Msg-id AEB3CE37-D9F0-48AC-B47C-C5421A40DDF4@yesql.se
Whole thread Raw
In response to Re: global / super barriers (for checksums)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: global / super barriers (for checksums)
List pgsql-hackers
> On 9 Dec 2019, at 16:42, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Dec 6, 2019 at 12:17 PM Andres Freund <andres@anarazel.de> wrote:

I've read through the patchset and played around with it to try and break it
and understand it (not in that order).  Being a bit out of my comfort zone, I
can't offer the deep insights that Andres has done; but in reading the code it
all makes sense, and it works as I expect it to.

>> Hm. Why "Postgres*"? I think that's confusing. So far only backend like
>> things are named like that, and I'd e.g. not call checkpointer that.
>
> Well, that's the existing function name, and 0003 changes it anyway.

Reading the patches in sequence I was irked by Postgres* as well, and was going
to suggest PostgresProcess*, but since 0003 changes it immediately it's less of
an issue.

>>> 0003 takes this a step
>>> further by trying to unify a bunch more error-handling across
>>> different background process types. Together, 0002 and 0003 save >300
>>> lines of code and as far as I can see there's basically no downside.

Agreed, I cannot see any negative impacts.  The cleanup of common code makes
the remaining process code more readable so +1.

The patch has a tiny typo though, s/procesess/processes/:

+ * Typically, this handler would be used for SIGTERM, but some procesess use

> This patch probably needs some more changes before commit. One likely
> candidate: the use of a sample barrier type here probably needs to be
> replaced with something else. But the way this is set up doesn't
> really lend itself to starting with 0 types of barrier and then adding
> them later, so I am not sure exactly what to do to make this patch
> independent of what follows.

Something which can be observed in order to hook up a test for it, but which
has no sideeffects?  A NOOP barrier which only does a debug elog?

> The callback proposal I made before
> could've accommodated that (as well as, perhaps, some automated
> testing by dynamically loading up a barrier type) but Andres didn't
> like that. Perhaps he'll relent, or have a counter-proposal, or
> someone else will feel differently.


I sort of like the callback idea conceptually, but Andres is making a good
point about the extensibility actually making it harder to reason about.

> It seems sufficient to me for now to come up with something that can
> be used for enabling checksums online

FWIW I've rebased the online checksums patch on top of this patchset instead of
Andres' previous patch and everything seems to work fine.  All tests pass, and
I've been unable to trigger anything unexpected.  I'll post the new version of
said patch shortly.

> and for the thing that got me
> motivated to work on this, which is ALTER SYSTEM READ ONLY.

Aha, thats a pretty neat usecase.

In order to play around with and try to understand the patchset I wrote a PoC
PGC_PROCSIGNALBARRIER class for GUCs and converted a few PGC_SIGHUP GUCs to use
a barrier rather than a HUP.  The idea was to test interacting with the API a
bit more than the online checksum patch does, as its needs are pretty basic.
In hacking this up I didn't really come across anything in particular that I
lacked, and the result worked fine (insofar as a PoC can be considered
working).

cheers ./daniel


pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: xact_start for walsender & logical decoding not updated
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: xact_start for walsender & logical decoding not updated