Re: global / super barriers (for checksums) - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: global / super barriers (for checksums)
Date
Msg-id CABUevEx9LGJLDiUifjXDYkjeJe9ATZ4LdCBRLjuMJT3TeDmVfw@mail.gmail.com
In response to global / super barriers (for checksums)  (Andres Freund <andres@anarazel.de>)
Responses Re: global / super barriers (for checksums)  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers


On Tue, Oct 30, 2018 at 6:16 AM Andres Freund <andres@anarazel.de> wrote:
Hi,

Magnus cornered me at pgconf.eu and asked me whether I could prototype
the "barriers" I'd been talking about in the online checksumming thread.

The problem there was to make sure that all processes, backends and
auxiliary processes have seen the new state of checksums being enabled,
and aren't currently in the process of writing a new page out.

The current prototype solves that by requiring a restart, but that
strikes me as a far too large hammer.

The attached patch introduces "global barriers" (name was invented in an
overcrowded hotel lounge, so ...), which allow waiting for such a change
to be absorbed by all backends.

I've only tested the code with gdb, but that seems to work:

p WaitForGlobalBarrier(EmitGlobalBarrier(GLOBBAR_CHECKSUM))

waits until all backends (including bgwriter, checkpointers, walwriters,
bgworkers, ...) have accepted interrupts at least once.  Multiple such
requests are coalesced.

I decided to wait until interrupts are actually processed, rather than
just until the signal is received, because that means the system is in a
well-defined state. E.g. there are no pages currently being written out.

For the checksum enablement patch you'd do something like:

EnableChecksumsInShmemWithLock();
WaitForGlobalBarrier(EmitGlobalBarrier(GLOBBAR_CHECKSUM));

and after that you should be able to set it to a persistent mode.


I chose to use procsignals to send the signals, a global uint64
globalBarrierGen, and per-backend barrierGen, barrierFlags, with the
latter keeping track of which barriers have been requested. There are
likely other use cases.
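
To make the generation scheme above concrete, here is a minimal single-process sketch. It is a toy model, not the code from the patch: the fixed slot array, the value of GLOBBAR_CHECKSUM, and the helper names emit_global_barrier, absorb_global_barrier and global_barrier_reached are all assumptions for illustration. The real implementation would keep this state in shared memory, use atomics, and send a procsignal to each process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_BACKENDS     4      /* hypothetical fixed number of processes */
#define GLOBBAR_CHECKSUM 0x01u  /* name from the mail; the value is assumed */

typedef struct BackendSlot
{
    uint64_t barrierGen;    /* last generation this backend has absorbed */
    uint32_t barrierFlags;  /* barrier types requested but not yet absorbed */
} BackendSlot;

static uint64_t    globalBarrierGen = 0;
static BackendSlot slots[NUM_BACKENDS];

/* Request a barrier: bump the global generation and flag every backend. */
static uint64_t
emit_global_barrier(uint32_t flags)
{
    uint64_t gen = ++globalBarrierGen;

    for (int i = 0; i < NUM_BACKENDS; i++)
        slots[i].barrierFlags |= flags;  /* a real system would also signal the process */
    return gen;
}

/* What a backend would do the next time it processes interrupts. */
static uint32_t
absorb_global_barrier(int backend)
{
    uint32_t absorbed;

    assert(backend >= 0 && backend < NUM_BACKENDS);
    absorbed = slots[backend].barrierFlags;
    slots[backend].barrierFlags = 0;
    /* Catching up to the newest generation is what coalesces multiple requests. */
    slots[backend].barrierGen = globalBarrierGen;
    return absorbed;
}

/* Non-blocking stand-in for the wait: has every backend absorbed gen? */
static bool
global_barrier_reached(uint64_t gen)
{
    for (int i = 0; i < NUM_BACKENDS; i++)
    {
        if (slots[i].barrierGen < gen)
            return false;
    }
    return true;
}
```

In this model a waiter would loop on global_barrier_reached(gen) with the value returned by emit_global_barrier(). Because each backend jumps straight to the latest generation when it absorbs, two requests issued back to back are both satisfied by a single pass through interrupt processing.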


The patch definitely is in a prototype stage. At the very least it needs
a high-level comment somewhere, and some of the lower-level code needs
to be cleaned up.

One thing I wasn't happy about is how checksum internals have to absorb
barrier requests - that seems unavoidable, but I'd hope for something
more global than just BufferSync().


Comments?

Finally getting around to playing with this one and it unfortunately doesn't apply anymore (0003).

I think it's just a matter of adding those two rows though, right? That is, it's not an actual conflict; it's just something else added in the same place?
