Re: Handing off SLRU fsyncs to the checkpointer - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Handing off SLRU fsyncs to the checkpointer
Date
Msg-id CA+hUKGLWdGd+H7fqCt4bJJON+ouhOq-HpST0KeERN_Jo1fgsbw@mail.gmail.com
Whole thread Raw
In response to Re: Handing off SLRU fsyncs to the checkpointer  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Handing off SLRU fsyncs to the checkpointer
List pgsql-hackers
On Sun, Sep 20, 2020 at 12:40 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Sat, Sep 19, 2020 at 5:06 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > In the meantime, from the low-hanging-fruit department, here's a new
> > version of the SLRU-fsync-offload patch.  The only changes are a
> > tweaked commit message, and adoption of C99 designated initialisers
> > for the function table, so { [SYNC_HANDLER_CLOG] = ... } instead of
> > relying on humans to make the array order match the enum values.  If
> > there are no comments or objections, I'm planning to commit this quite
> > soon.
>
> ... and CI told me that Windows didn't like my array syntax with the
> extra trailing comma.  Here's one without.

While scanning for comments and identifier names that needed updating,
I realised that this patch changed the behaviour of the ShutdownXXX()
functions, since they currently flush the SLRUs but are not followed
by a checkpoint.  I'm not entirely sure I understand the logic of
that, but it wasn't my intention to change it.  So here's a version
that converts the existing fsync_fname() to fsync_fname_recurse() to
fix that.

Strangely, the fsync calls that ensure that directory entries are on
disk seemed to be missing from CheckPointMultixact(), so I added them.
Isn't that a live bug?

I decided it was a little too magical that CheckPointCLOG() etc
depended on the later call to CheckPointBuffers() to perform their
fsyncs.  I started writing comments about that, but then I realised
that the right thing to do was to hoist ProcessSyncRequests() out of
there into CheckPointGuts() and make it all more explicit.

I also realised that it would be inconsistent to count SLRU sync
activity as buffer sync time, but not count SLRU write activity as
buffer write time, or count its buffers as written in the reported
statistics.  In other words, SLRU buffers *are* buffers for checkpoint
reporting purposes (or should at least be consistently in or out of
the stats, and with this patch they have to be in).

Does that make sense?   Is there a problem I'm not seeing with
reordering CheckPointGuts() as I have?

One comment change that seems worth highlighting is this code reached
by VACUUM, which seems like a strict improvement (it wasn't flushing
for crash recovery):

        /*
-        * Flush out dirty data, so PhysicalPageExists can work correctly.
-        * SimpleLruFlush() is a pretty big hammer for that.  Alternatively we
-        * could add an in-memory version of page exists, but
find_multixact_start
-        * is called infrequently, and it doesn't seem bad to flush buffers to
-        * disk before truncation.
+        * Write out dirty data, so PhysicalPageExists can work correctly.
         */
-       SimpleLruFlush(MultiXactOffsetCtl, true);
-       SimpleLruFlush(MultiXactMemberCtl, true);
+       SimpleLruWriteAll(MultiXactOffsetCtl, true);
+       SimpleLruWriteAll(MultiXactMemberCtl, true);

Attachment

pgsql-hackers by date:

Previous
From: "Hou, Zhijie"
Date:
Subject: Improper use about DatumGetInt32
Next
From: Ashutosh Sharma
Date:
Subject: Re: recovering from "found xmin ... from before relfrozenxid ..."