Re: WAL replay bugs - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: WAL replay bugs
Date
Msg-id CAB7nPqQhdfirs1v9097V+3APKG6ZVmCChyD=sgb6Qv+DhYEzJg@mail.gmail.com
In response to Re: WAL replay bugs  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: WAL replay bugs  (Michael Paquier <michael.paquier@gmail.com>)
Re: WAL replay bugs  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Apr 23, 2014 at 9:43 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> And here is the tool itself. It consists of two parts:
>
> 1. Modifications to the backend to write the page images
> 2. A post-processing tool to compare the logged images between master and
> standby.
Having this in Postgres at the disposal of developers would be
great, and I believe that it would greatly reduce the occurrence of
bugs in WAL replay during recovery. So, with the permission of
the author, I have been looking at this facility with a view to a
cleaner integration into Postgres.

Roughly, this utility is made of three parts:
1) A set of masking functions that can be used on page images to
normalize them. They overwrite with magic numbers, or force to fixed
flag values, the parts of a page whose content may differ between
nodes, so that the rest of the page can be compared consistently.
Examples are the free space between pd_lower and pd_upper, pd_flags,
etc. Of course this depends on the type of page (btree, heap, etc.).
2) A facility to memorize page images, check whether they have been
modified, and flush them to a dedicated file. This mainly interacts
with the buffer manager.
3) A facility to reorder page images within the same WAL record, as
a master and a standby may not write them in the same order, for
example because locks are released in a different order. This is
part of the binary that analyzes the diffs between master and standby.
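
To illustrate the reordering in 3), here is a minimal sketch, not the
code of the tool. It assumes each logged image carries a tag made of
the LSN of its WAL record, a relation identifier and a block number;
SketchImageTag and the sketch_* functions are hypothetical names. Once
both lists are sorted on the same key, images of the same record line
up and can be compared pairwise.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical identity of one logged page image; not a PostgreSQL struct. */
typedef struct SketchImageTag
{
    uint64_t    lsn;        /* WAL record the image belongs to */
    uint32_t    relnode;    /* relation file identifier */
    uint32_t    blkno;      /* block number within the relation */
} SketchImageTag;

/* qsort comparator: order by (lsn, relnode, blkno). */
static int
sketch_tag_cmp(const void *a, const void *b)
{
    const SketchImageTag *x = a;
    const SketchImageTag *y = b;

    if (x->lsn != y->lsn)
        return (x->lsn < y->lsn) ? -1 : 1;
    if (x->relnode != y->relnode)
        return (x->relnode < y->relnode) ? -1 : 1;
    if (x->blkno != y->blkno)
        return (x->blkno < y->blkno) ? -1 : 1;
    return 0;
}

/*
 * Put the images of each WAL record in a deterministic order, so that
 * the write order chosen by each node no longer matters.
 */
static void
sketch_sort_images(SketchImageTag *tags, size_t n)
{
    assert(tags != NULL || n == 0);
    qsort(tags, n, sizeof(SketchImageTag), sketch_tag_cmp);
}
```

Sorting on the LSN first keeps the images of one record together, so
only the order within a record changes.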

As of now, 2) is integrated in the backend, while 1) and 3) are part
of the contrib module. However I am thinking that 1) and 2) should be
done in core using an ifdef similar to CLOBBER_FREED_MEMORY, to mask
the page images and write them in a dedicated file (in global/ ?),
while 3) would be fine as a separate binary in contrib/. An essential
addition would be a set of regression tests that developers and
buildfarm machines could use directly.
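
As a rough sketch of what the ifdef approach could look like (an
illustration under assumptions, not the proposed patch): the flag
SKETCH_PAGE_IMAGE_CAPTURE, the function sketch_log_page_image() and
the zero-fill masking are all hypothetical, and the page layout is
reduced to the two free-space offsets.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192

/*
 * Hypothetical build flag. It is defined here only to keep the sketch
 * self-contained; a real build would set it from its configuration.
 */
#define SKETCH_PAGE_IMAGE_CAPTURE 1

#ifdef SKETCH_PAGE_IMAGE_CAPTURE

/*
 * Mask a copy of the page and append it to the image file.
 * Returns 1 on success, 0 on a short write.
 */
static int
sketch_log_page_image(FILE *out, const unsigned char *page,
                      uint16_t lower, uint16_t upper)
{
    unsigned char copy[SKETCH_PAGE_SIZE];

    assert(lower <= upper && upper <= SKETCH_PAGE_SIZE);

    /* Work on a copy so the caller's page is left untouched. */
    memcpy(copy, page, SKETCH_PAGE_SIZE);

    /* Free space carries no meaningful data: zero it before logging. */
    memset(copy + lower, 0, (size_t) (upper - lower));

    return fwrite(copy, 1, SKETCH_PAGE_SIZE, out) == (size_t) SKETCH_PAGE_SIZE;
}

#else

/* Capture disabled: nothing is written and the call always succeeds. */
static inline int
sketch_log_page_image(FILE *out, const unsigned char *page,
                      uint16_t lower, uint16_t upper)
{
    (void) out;
    (void) page;
    (void) lower;
    (void) upper;
    return 1;
}

#endif
```

With the flag undefined, callers keep the same call site and normal
builds pay nothing for it.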

Perhaps some parts of what is proposed here could be generalized,
like the masking functions. So do not hesitate to share any opinion
you have on the matter.

Regards,
-- 
Michael


