Re: WAL format and API changes (9.5) - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: WAL format and API changes (9.5)
Msg-id 53DBC2F7.8050604@vmware.com
In response to Re: WAL format and API changes (9.5)  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers

Here's a new version of this patch, please review.

I've cleaned up a lot of stuff, fixed all the bugs reported so far, and
a bunch of others I found myself while testing.

I'm not going to explain again what the patch does; the README and
comments should now be complete enough to explain how it works. If not,
please speak up.

In the previous version of this patch, I made the XLogRegisterData
function copy the WAL data to a temporary buffer, instead of
constructing the XLogRecData chain. I decided to revert to the old
way after all; I still think that copying would probably be OK from a
performance point of view, but that'd need more testing. We can still
switch to doing it that way later; the XLogRecData struct is no longer
exposed to the functions that generate the WAL records, so it would be a
very isolated change.
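
To make the trade-off concrete, here is a minimal standalone sketch of the two registration strategies. It is not code from the patch: DemoRecData, DemoRecord and the demo_* functions are hypothetical names invented for illustration, and the real structs carry more fields. The chain variant only remembers a pointer, so the caller must keep the data alive until the record is inserted; the copy variant pays for a memcpy but has no such lifetime requirement.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one link of a record-data chain. */
typedef struct DemoRecData
{
    const char *data;           /* points at caller-owned memory */
    size_t      len;
    struct DemoRecData *next;
} DemoRecData;

#define DEMO_MAX_LINKS 8
#define DEMO_BUF_SIZE  256

/* Hypothetical per-record working state holding both variants. */
typedef struct DemoRecord
{
    DemoRecData links[DEMO_MAX_LINKS];  /* used by the chain variant */
    int         nlinks;
    char        buf[DEMO_BUF_SIZE];     /* used by the copy variant */
    size_t      buf_used;
} DemoRecord;

static void
demo_begin(DemoRecord *rec)
{
    rec->nlinks = 0;
    rec->buf_used = 0;
}

/* Chain variant: store only the pointer and append a link to the chain. */
static void
demo_register_chain(DemoRecord *rec, const char *data, size_t len)
{
    assert(rec->nlinks < DEMO_MAX_LINKS);

    DemoRecData *link = &rec->links[rec->nlinks];

    link->data = data;
    link->len = len;
    link->next = NULL;
    if (rec->nlinks > 0)
        rec->links[rec->nlinks - 1].next = link;
    rec->nlinks++;
}

/* Copy variant: copy the bytes into a temporary buffer right away. */
static void
demo_register_copy(DemoRecord *rec, const char *data, size_t len)
{
    assert(rec->buf_used + len <= DEMO_BUF_SIZE);
    memcpy(rec->buf + rec->buf_used, data, len);
    rec->buf_used += len;
}

/* Total payload length of the chain variant. */
static size_t
demo_chain_total(const DemoRecord *rec)
{
    size_t      total = 0;
    const DemoRecData *l = (rec->nlinks > 0) ? &rec->links[0] : NULL;

    for (; l != NULL; l = l->next)
        total += l->len;
    return total;
}
```

Because callers in this sketch only ever see the demo_register_* functions, switching from one variant to the other would not touch them, which is the sense in which the change stays isolated.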

I moved the new functions for constructing the WAL record into a new
file, xlogconstruct.c. XLogInsert() now just calls a function called
XLogRecordAssemble(), which returns the full XLogRecData chain that
includes all the data and backup blocks, ready to be written to the WAL
buffer. All the code to construct the XLogRecData chain is now in that
new file; this makes XLogInsert() simpler and more readable.
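
As a rough illustration of that split, the sketch below separates an "assemble" step, which prepends a header link to the payload chain and fills in the total length, from an "insert" step that only walks the finished chain. All names here (SketchLink, SketchHeader, sketch_assemble, sketch_insert) are made up for this example and the header layout is a simplification; the actual functions in the patch handle backup blocks and much more.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical chain link: a pointer, a length, and the next link. */
typedef struct SketchLink
{
    const void *data;
    uint32_t    len;
    struct SketchLink *next;
} SketchLink;

/* Hypothetical, heavily simplified record header. */
typedef struct SketchHeader
{
    uint32_t    total_len;      /* header plus all payload bytes */
    uint8_t     info;           /* record-type flag bits */
} SketchHeader;

/*
 * Build the complete chain: fill in the header, point a header link at it,
 * and hang the payload chain behind it. Storage for the header and its
 * link is supplied by the caller.
 */
static SketchLink *
sketch_assemble(SketchHeader *hdr, SketchLink *hdr_link,
                uint8_t info, SketchLink *payload)
{
    uint32_t    total = (uint32_t) sizeof(SketchHeader);

    for (SketchLink *l = payload; l != NULL; l = l->next)
        total += l->len;

    hdr->total_len = total;
    hdr->info = info;

    hdr_link->data = hdr;
    hdr_link->len = (uint32_t) sizeof(SketchHeader);
    hdr_link->next = payload;
    return hdr_link;
}

/*
 * The insert step no longer knows how the chain was built; it just copies
 * each link into the destination. Returns the number of bytes written, or
 * 0 if the destination is too small.
 */
static size_t
sketch_insert(char *dest, size_t dest_size, uint8_t info, SketchLink *payload)
{
    SketchHeader hdr;
    SketchLink  hdr_link;
    SketchLink *chain;
    size_t      off = 0;

    memset(&hdr, 0, sizeof(hdr));   /* keep padding bytes deterministic */
    chain = sketch_assemble(&hdr, &hdr_link, info, payload);

    if (hdr.total_len > dest_size)
        return 0;

    for (SketchLink *l = chain; l != NULL; l = l->next)
    {
        memcpy(dest + off, l->data, l->len);
        off += l->len;
    }
    return off;
}
```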

One change resulting from that worth mentioning is that XLogInsert() now
always re-constructs the XLogRecData chain, if after acquiring the
WALInsertLock it sees that RedoRecPtr changed (i.e. a checkpoint just
started). Before, it would recheck the LSNs on the individual buffers to
see whether that was necessary. This is simpler, and I don't think it makes any
difference to performance in practice.
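
The retry logic can be pictured roughly like this. It is a single-threaded sketch with invented names (DemoShared, demo_assemble, demo_acquire_lock, demo_insert), not the patch's code: the "lock acquisition" is a stub that can simulate a checkpoint starting in the meantime, and the only decision that depends on the redo pointer here is whether a full-page image is included (page LSN at or before the redo pointer).

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t DemoLsn;

/* Hypothetical shared state. */
typedef struct DemoShared
{
    DemoLsn     redo_ptr;            /* start of the latest checkpoint */
    int         pending_checkpoints; /* test hook: checkpoints that race with us */
} DemoShared;

/* Hypothetical result of assembling a record. */
typedef struct DemoAssembled
{
    DemoLsn     redo_ptr_used;       /* redo pointer the decision was based on */
    bool        has_full_page_image;
} DemoAssembled;

/* Decide what goes into the record, given one view of the redo pointer. */
static DemoAssembled
demo_assemble(DemoLsn page_lsn, DemoLsn redo_ptr)
{
    DemoAssembled rec;

    rec.redo_ptr_used = redo_ptr;
    /* Page not modified since the checkpoint began: include a full image. */
    rec.has_full_page_image = (page_lsn <= redo_ptr);
    return rec;
}

/* Stand-in for taking the insert lock; a checkpoint may start meanwhile. */
static void
demo_acquire_lock(DemoShared *shared)
{
    if (shared->pending_checkpoints > 0)
    {
        shared->pending_checkpoints--;
        shared->redo_ptr += 100;
    }
}

/*
 * Assemble first, then take the lock. If the redo pointer moved in between,
 * throw the assembled record away and rebuild it from scratch, rather than
 * rechecking individual buffers.
 */
static DemoAssembled
demo_insert(DemoShared *shared, DemoLsn page_lsn, int *attempts)
{
    for (;;)
    {
        DemoLsn     seen = shared->redo_ptr;
        DemoAssembled rec = demo_assemble(page_lsn, seen);

        (*attempts)++;
        demo_acquire_lock(shared);

        if (shared->redo_ptr == seen)
            return rec;     /* a real implementation would copy the record
                             * out and release the lock here */

        /* Stale view: a real implementation would release the lock here. */
    }
}
```

In this sketch the common case costs one assembly, and the rebuild happens only when a checkpoint starts between assembling the record and taking the lock.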

I ran this through my WAL page comparison tool to verify that all the
WAL record types are replayed correctly (although I did some small
cleanup after that, so it's not impossible that I broke it again; will
re-test before committing).

- Heikki

