Re: global / super barriers (for checksums) - Mailing list pgsql-hackers

From Andres Freund
Subject Re: global / super barriers (for checksums)
Date
Msg-id 20191206171732.4q3kd6rqqwvk7d7p@alap3.anarazel.de
Whole thread Raw
In response to Re: global / super barriers (for checksums)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: global / super barriers (for checksums)
List pgsql-hackers
Hi,

Thanks for the updated version!

On 2019-12-02 13:06:24 -0500, Robert Haas wrote:
> On Sun, Nov 17, 2019 at 8:38 AM Magnus Hagander <magnus@hagander.net> wrote:
> > I'm definitely happy to work with it, but I did not and do not feel I have the skills for doing the "proper review"
neededfor it. So I am also very happy for you to pick it up and run with it.
 
> 
> OK, here's what I came up with.
> 
> 0001 is just a code movement patch. It puts the interrupt handling for
> each type of background process into a subroutine, instead of having
> it all inline in the main loop. I feel this makes things more clear.
> Hopefully it's uncontroversial.

I'd suggest marking the shutdown functions as noreturn. Might prevent
some compiler warnings in the future.


> 0002 and 0003 are optional and slightly off-topic as far as the matter
> at hand, but they are not entirely unrelated and I believe that they
> are good cleanups.  0002 changes assorted background processes to use
> PostgresSigHupHandler and ConfigReloadPending rather than having a
> bunch of separate global variables and a bunch of separate signal
> handlers that basically do the same stuff.

Hm. Why "Postgres*"? I think that's confusing. So far only backend like
things are named like that, and I'd e.g. not call checkpointer that.

It's probably fine, but it's worthwhile to consider that that checking an
external variable like ConfigReloadPending is more expensive than a
static var like got_SIGHUP. The compiler will most of the time
understand the code flow to understand when something like got_SIGHUP
can change, but it can't assume anything like that with
ConfigReloadPending. But as we're forcing the compilers hand, by marking
the variable as volatile, so it's not likely to be a meaningful difference.


> 0003 takes this a step
> further by trying to unify a bunch more error-handling across
> different background process types. Together, 0002 and 0003 save >300
> lines of code and as far as I can see there's basically no downside. I
> think in general it would be good to strive towards a future where
> everybody used the same signal handling and interrupt handling, but
> these patches have the much less ambitious goal of just removing
> overtly duplicated code. I think they might act as an inducement to
> future patch authors to try to make things look more alike, though.

I do agree that this is a good goal. The amount of redundant code we
have across different types of processes is utterly ridiculous. While
obviously not a significant problem, there's also


> 0004 is the main patch. It is substantially revised from Andres's
> version, but the basic principle is along similar lines. Changes:
> 
> - I felt that it was imprudent to put some of the machinery into
> PGPROC and the rest into the ProcSignal mechanism, because there is
> nothing that says that everything with a PGPROC must also be a
> ProcSignal participant. So I moved everything over into the ProcSignal
> mechanism.  This seems a lot safer to me.

I don't fully remember how I ended up with that split. But I think there
is an argument that putting the generation in PGPROC is good: That way
there are fewer changes to the shared cachelines in the ProcSignal. As I
think there'll be more modifications to barrierGen/barrierFlags (as I'd
named them) than the number of times they're read, that could be
beneficial in keeping the overhead low.

It's probably still worthwhile to go the way you have.


> - There details of the synchronization are different: there's no
> ProcSignalKind for this new kind of barrier any more

This I do not get. What's the point in making this unconditional?


> , and there's now
> only one pg_memory_barrier() and it's in a different place than Andres
> had it. It is quite possible that I have screwed things up here, but I
> couldn't make heads or tails of the way Andres had it. I don't know
> whether that's because it was a POC or because I'm dumb. Also, moving
> everything over into ProcSignal seemed to me to suggest some
> rejiggering here that might've made less sense in the old scheme.

I don't remember enough to comment usefully on why things were the way
they are.


> - I changed the code that waits for a barrier to use a latch wait,
> because I think that's our usual convention. So wait-for-barrier gets
> woken up early if the latch is set, and also itself manipulates the
> latch.  I also gave it a variable-length timeout, because when testing
> it irked me that the same test randomly took either 0s or 1s. Now it
> takes either 0s or 125ms, which is a lot less noticeable.

Sounds good.


> From e9bfcbc3a1b1aa4a01a5a281bc71c6f019f5d46b Mon Sep 17 00:00:00 2001
> From: Robert Haas <rhaas@postgresql.org>
> Date: Wed, 27 Nov 2019 11:29:25 -0500
> Subject: [PATCH v2 4/5] Extend the ProcSignal mechanism to support barriers.



> +uint64
> +EmitProcSignalBarrier(ProcSignalBarrierType type)
> +{
> +    uint64    flagbit = UINT64CONST(1) << (uint64) type;
> +    uint64    generation;
> +
> +    /*
> +     * Set all the flags.
> +     *
> +     * Note that pg_atomic_fetch_or_u32 has full barrier semantics, so this
> +     * is totally ordered with respect to anything the caller did before, and
> +     * anything that we do afterwards. (This is also true of the later call
> +     * to pg_atomic_add_fetch_u64.)
> +     */
> +    for (int i = NumProcSignalSlots - 1; i >= 0; i--)
> +    {
> +        volatile ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
> +
> +        pg_atomic_fetch_or_u32(&slot->pss_barrierCheckMask, flagbit);
> +    }

What's the reason for doing this in reverse? SendProcSignal() does it
because short circuits the search once a match is found, and it
theorizes they're at the end, but that doesn't apply here?

It could potentially be worthwhile to check whether all backends already
have the flag set, and in that case not increment the barrier
generation? Probably not worth it for now.


> +/*
> + * WaitForProcSignalBarrier - wait until it is guaranteed that all changes
> + * requested by a specific call to EmitProcSignalBarrier() have taken effect.
> + *
> + * We expect that the barrier will normally be absorbed very quickly by other
> + * backends, so we start by waiting just 1/8 of a second and then back off
> + * by a factor of two every time we time out, to a maximum wait time of
> + * 1 second.
> + */
> +void
> +WaitForProcSignalBarrier(uint64 generation)
> +{
> +    long    timeout = 125L;
> +
> +    for (int i = NumProcSignalSlots - 1; i >= 0; i--)
> +    {
> +        volatile ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
> +        uint64    oldval;
> +
> +        oldval = pg_atomic_read_u64(&slot->pss_barrierGeneration);
> +        while (oldval < generation)

Hm. without a barrier here, is it guaranteed that we can rely on the
state we need? IOW, could it be that we see an update to
pss_barrierGeneration to our desired generation, but still see
"published" state by some other backend that isn't yet in accordance to
the the generation? Seems we ought to have at least a read barrier
somewhere?



> +/*
> + * Perform global barrier related interrupt checking.
> + *
> + * Any backend that participates in ProcSignal signalling must arrange to
> + * call this function periodically. It is called from CHECK_FOR_INTERRUPTS(),
> + * which is enough for normal backends, but not necessarily for all types of
> + * background processes.
> + */
> +void
> +ProcessProcSignalBarrier(void)
> +{
> +    uint64 generation;
> +    uint32 flags;
> +
> +    /* Exit quickly if there's no work to do. */
> +    if (!ProcSignalBarrierPending)
> +        return;
> +    ProcSignalBarrierPending = false;
> +
> +    /*
> +     * Read the current barrier generation, and then get the flags that
> +     * are set for this backend. Note that pg_atomic_exchange_u32 is a full
> +     * barrier, so we're guaranteed that the read of the barrier generation
> +     * happens before we atomically extract the flags, and that any subsequent
> +     * state changes happen afterward.
> +     */
> +    generation = pg_atomic_read_u64(&ProcSignal->psh_barrierGeneration);
> +    flags = pg_atomic_exchange_u32(&MyProcSignalSlot->pss_barrierCheckMask, 0);
> +
> +    /*
> +     * Process each type of barrier. It's important that nothing we call from
> +     * here throws an error, because pss_barrierCheckMask has already been
> +     * cleared. If we jumped out of here before processing all barrier types,
> +     * then we'd forget about the need to do so later.
> +     *
> +     * NB: It ought to be OK to call the barrier-processing functions
> +     * unconditionally, but it's more efficient to call only the ones that
> +     * might need us to do something based on the flags.
> +     */
> +    if (BARRIER_SHOULD_CHECK(flags, PROCSIGNAL_BARRIER_SAMPLE))
> +        ProcessBarrierSample();

Hm. In my mental model it would be useful for barrier "processors" to
not acknowledge the state change at certain points. Imagine e.g. needing
to efficiently wait till all backends have processed a config file
reload - since we don't reload while a query is being processed, we
should be able to not ack the barrier at that point.



Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Konstantin Knizhnik
Date:
Subject: Re: Why JIT speed improvement is so modest?
Next
From: Konstantin Knizhnik
Date:
Subject: Re: Why JIT speed improvement is so modest?