Refactoring the checkpointer's fsync request queue - Mailing list pgsql-hackers

From Thomas Munro
Subject Refactoring the checkpointer's fsync request queue
Msg-id CAEepm=2gTANm=e3ARnJT=n0h8hf88wqmaZxk0JYkxw+b21fNrw@mail.gmail.com
List pgsql-hackers
Hello hackers,

Currently, md.c and checkpointer.c interact in a way that breaks
smgr.c's modularity.  That doesn't matter much if md.c is the only
storage manager implementation, but there are now two proposals
to provide new kinds of block storage accessed via the buffer manager:
UNDO and SLRU.

Here is a patch that rips the fsync stuff out of md.c, generalises it
and puts it into a new translation unit smgrsync.c.  It can deal with
fsync()ing any files you want at checkpoint time, as long as they can
be described by a SmgrFileTag (a struct type we can extend as needed).
A pathname would work too, but I wanted something small and fixed in
size.  It's just a tag that can be converted to a path in case the
file needs to be reopened (e.g. on Windows), but otherwise it is used
as a hash table key to merge requests.
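
To make the idea concrete, here is a minimal sketch of a small fixed-size
tag used both as a hash key for merging duplicate requests and as
something that can be turned back into a path.  Everything in it is
hypothetical: the DemoFileTag fields, the demo_* function names, the
fixed-size probing table and the base/<db>/<rel>[.<segno>] path shape
are assumptions for illustration only, not the SmgrFileTag layout or
code from the attached patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical tag layout; the real SmgrFileTag may look different. */
typedef struct DemoFileTag
{
    uint32_t db;     /* assumed field: database identifier */
    uint32_t rel;    /* assumed field: relation file identifier */
    uint32_t segno;  /* assumed field: segment number */
} DemoFileTag;

#define DEMO_TABLE_SIZE 64  /* power of two, fixed for this sketch */

/* Pending requests; zero-initialise before first use. */
typedef struct DemoPendingTable
{
    DemoFileTag tags[DEMO_TABLE_SIZE];
    int         used[DEMO_TABLE_SIZE];
    int         count;
} DemoPendingTable;

/* FNV-1a over the three fields (copied out, so padding is never hashed). */
uint32_t
demo_tag_hash(const DemoFileTag *tag)
{
    uint32_t fields[3] = {tag->db, tag->rel, tag->segno};
    const unsigned char *p = (const unsigned char *) fields;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < sizeof(fields); i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Remember a request.  Returns 1 if the tag was newly queued, 0 if it
 * was merged with an identical pending request, -1 if the table is full.
 */
int
demo_remember_request(DemoPendingTable *table, const DemoFileTag *tag)
{
    assert(table != NULL && tag != NULL);

    uint32_t start = demo_tag_hash(tag) & (DEMO_TABLE_SIZE - 1);

    for (uint32_t i = 0; i < DEMO_TABLE_SIZE; i++)
    {
        uint32_t slot = (start + i) & (DEMO_TABLE_SIZE - 1);

        if (!table->used[slot])
        {
            table->tags[slot] = *tag;
            table->used[slot] = 1;
            table->count++;
            return 1;
        }
        if (table->tags[slot].db == tag->db &&
            table->tags[slot].rel == tag->rel &&
            table->tags[slot].segno == tag->segno)
            return 0;           /* duplicate: one fsync will cover both */
    }
    return -1;
}

/* Convert a tag to a path, for the case where the file must be reopened. */
int
demo_tag_to_path(const DemoFileTag *tag, char *buf, size_t buflen)
{
    if (tag->segno == 0)
        return snprintf(buf, buflen, "base/%u/%u",
                        (unsigned) tag->db, (unsigned) tag->rel);
    return snprintf(buf, buflen, "base/%u/%u.%u",
                    (unsigned) tag->db, (unsigned) tag->rel,
                    (unsigned) tag->segno);
}
```

The point of the fixed-size tag is visible in demo_remember_request():
comparing and hashing a few integers is cheap and allocation-free,
whereas the path is only built on the rare occasion that a file has to
be reopened.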

There is one major fly in the ointment:  fsyncgate[1].  Originally I
planned to propose this refactoring on top of the fsyncgate patch, but
it's difficult -- both patches move a lot of the same stuff around.
Personally, I don't think it would be a very good idea to back-patch
the fsyncgate patch anyway.  It'd be
riskier than the problem it aims to solve, in terms of bugs and
hard-to-foresee portability problems IMHO.  I think we should consider
back-patching some variant of Craig Ringer's PANIC patch, and consider
this redesigned approach for future releases.

So, please find attached the WIP patch that I would like to propose
for PostgreSQL 12, under a separate Commitfest entry.  It incorporates
the fsyncgate work by Andres Freund (original file descriptor transfer
POC) and me (many bug fixes and improvements), and the refactoring
work as described above.

It can be compiled in two modes: with the macro
CHECKPOINTER_TRANSFER_FILES defined, it sends fds to the checkpointer,
but if you comment out that macro definition for testing, or build on
Windows, it reverts to a mode that reopens files in the checkpointer.
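
For readers who haven't seen descriptor passing before, the two modes
can be sketched roughly as follows.  This is an illustration under
stated assumptions, not code from the patch: it assumes a POSIX system
and a Unix-domain socket carrying the descriptor as SCM_RIGHTS
ancillary data, and the demo_* functions and the bare uint32_t tag are
hypothetical.  Only the macro name CHECKPOINTER_TRANSFER_FILES comes
from the description above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Comment this out to send only the tag (receiver must reopen by path). */
#define CHECKPOINTER_TRANSFER_FILES

/* Properly aligned buffer for one control message carrying one fd. */
typedef union DemoControl
{
    struct cmsghdr hdr;
    char           buf[CMSG_SPACE(sizeof(int))];
} DemoControl;

/* Send a request; in transfer mode the open fd rides along with the tag. */
int
demo_send_request(int sock, uint32_t tag_id, int fd)
{
    struct msghdr msg;
    struct iovec  iov;

    memset(&msg, 0, sizeof(msg));
    iov.iov_base = &tag_id;
    iov.iov_len = sizeof(tag_id);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

#ifdef CHECKPOINTER_TRANSFER_FILES
    DemoControl     control;
    struct cmsghdr *cmsg;

    memset(&control, 0, sizeof(control));
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);
    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
#else
    (void) fd;                  /* reopen mode: descriptor is not sent */
#endif

    return sendmsg(sock, &msg, 0) == (ssize_t) sizeof(tag_id) ? 0 : -1;
}

/*
 * Receive a request.  On success returns 0 and sets *fd to the received
 * descriptor, or to -1 if none was attached (the receiver would then
 * have to reopen the file from the tag).
 */
int
demo_recv_request(int sock, uint32_t *tag_id, int *fd)
{
    struct msghdr   msg;
    struct iovec    iov;
    DemoControl     control;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&control, 0, sizeof(control));
    iov.iov_base = tag_id;
    iov.iov_len = sizeof(*tag_id);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    *fd = -1;
    if (recvmsg(sock, &msg, 0) != (ssize_t) sizeof(*tag_id))
        return -1;

    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(&msg, cmsg))
    {
        if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS)
            memcpy(fd, CMSG_DATA(cmsg), sizeof(int));
    }
    return 0;
}
```

The receiver ends up with its own descriptor referring to the same open
file, so it can fsync() without reopening anything; when the macro is
not defined, *fd stays -1 and the tag alone has to be enough to find the
file again.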

I'm hoping to find a Windows-savvy collaborator to help finish the
Windows support.  Right now it passes make check on AppVeyor, but it
needs to be reviewed and tested on a real system with a small
shared_buffers (installcheck, pgbench, other attempts to break it).
Other than that, there are a couple of remaining XXX notes for small
known details, but I wanted to post this version now.

[1] https://postgr.es/m/20180427222842.in2e4mibx45zdth5%40alap3.anarazel.de

-- 
Thomas Munro
http://www.enterprisedb.com
