Re: Multixact slru doesn't don't force WAL flushes in SlruPhysicalWritePage() - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Multixact slru doesn't don't force WAL flushes in SlruPhysicalWritePage()
Date
Msg-id 20151111042247.GA1212824@tornado.leadboat.com
In response to Multixact slru doesn't don't force WAL flushes in SlruPhysicalWritePage()  (Andres Freund <andres@anarazel.de>)
Responses Re: Multixact slru doesn't don't force WAL flushes in SlruPhysicalWritePage()  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
On Mon, Nov 09, 2015 at 10:40:07PM +0100, Andres Freund wrote:
>     /*
>      * Optional array of WAL flush LSNs associated with entries in the SLRU
>      * pages.  If not zero/NULL, we must flush WAL before writing pages (true
>      * for pg_clog, false for multixact, pg_subtrans, pg_notify).  group_lsn[]
>      * has lsn_groups_per_page entries per buffer slot, each containing the
>      * highest LSN known for a contiguous group of SLRU entries on that slot's
>      * page.
>      */
>     XLogRecPtr *group_lsn;
>     int            lsn_groups_per_page;
> 
> Uhm. multixacts historically didn't need to follow the
> write-WAL-before-data rule because it was zapped at restart. But it's
> now persistent.
> 
> There are no comments about this choice anywhere in multixact.c, leading
> me to believe that this was not an intentional decision.

Here's the multixact.c comment justifying it:
 * XLOG interactions: this module generates an XLOG record whenever a new
 * OFFSETs or MEMBERs page is initialized to zeroes, as well as an XLOG record
 * whenever a new MultiXactId is defined.  This allows us to completely
 * rebuild the data entered since the last checkpoint during XLOG replay.
 * Because this is possible, we need not follow the normal rule of
 * "write WAL before data"; the only correctness guarantee needed is that
 * we flush and sync all dirty OFFSETs and MEMBERs pages to disk before a
 * checkpoint is considered complete.  If a page does make it to disk ahead
 * of corresponding WAL records, it will be forcibly zeroed before use anyway.
 * Therefore, we don't need to mark our pages with LSN information; we have
 * enough synchronization already.
 

The comment's justification is incomplete, though.  What of pages filled over
the course of multiple checkpoint cycles?


