Re: Production block comparison facility - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Production block comparison facility
Msg-id CA+U5nM+Sy6mnYApn5RyL8u9L2xBJdziMJCQ=S9rr_+f7h_9p=Q@mail.gmail.com
In response to Re: Production block comparison facility  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: Production block comparison facility
List pgsql-hackers
On 22 July 2014 08:49, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Sun, Jul 20, 2014 at 5:31 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> The block comparison facility presented earlier by Heikki would not be
>> able to be used in production systems. ISTM that it would be desirable
>> to have something that could be used in that way.
>>
>> ISTM easy to make these changes
>>
>> * optionally generate a FPW for every WAL record, not just first
>> change after checkpoint
>> full_page_writes = 'always'
>>
>> * when an FPW arrives, optionally run a check to see if it compares
>> correctly against the page already there, when running streaming
>> replication without a recovery target. We could skip reporting any
>> problems until the database is consistent
>> full_page_write_check = on
>>
>> The above changes seem easy to implement.
>>
>> With FPW compression, this would be a usable feature in production.
>>
>> Comments?
>
> This is an interesting idea, and it would be easier to use than what
> has been submitted for CF1. However, full_page_writes set to "always"
> would generate a large amount of WAL even for small records,
> increasing I/O for the partition holding pg_xlog, and the frequency of
> checkpoints run on system. Is this really something suitable for
> production?

For critical systems, yes, I think it is.

It would be possible to make that user-selectable for particular
transactions or tables.

> Then, looking at the code, we would need to tweak XLogInsert for the
> WAL record construction to always do a FPW and to update
> XLogCheckBufferNeedsBackup. Then for the redo part, we would need to
> do some extra operations in the area of
> RestoreBackupBlock/RestoreBackupBlockContents, including masking
> operations before comparing the content of the FPW and the current
> page.
>
> Does that sound right?

Yes. It doesn't look like very much code, because it fits well with
existing approaches.
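
A minimal sketch of the mask-then-compare step discussed above, for
illustration only. It assumes a simplified, hypothetical page layout
(SketchPageHeader is not PostgreSQL's real page header), and the names
mask_page and fpw_matches_page are invented here; they are not existing
backend functions. The idea is to blank out fields that may legitimately
differ between the full-page image and the page on disk (LSN, checksum,
hint-style flags, the unused hole) before doing a byte comparison.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 8192

/* Hypothetical, simplified page header; NOT PostgreSQL's PageHeaderData. */
typedef struct
{
    uint64_t lsn;       /* WAL position of last change; may differ */
    uint16_t checksum;  /* derived from page contents; may differ */
    uint16_t flags;     /* hint-style bits that are not always WAL-logged */
    uint16_t lower;     /* offset of the start of the free-space hole */
    uint16_t upper;     /* offset of the end of the free-space hole */
} SketchPageHeader;

/* Zero every region that is allowed to differ between two page copies. */
static void
mask_page(unsigned char *page)
{
    SketchPageHeader hdr;

    memcpy(&hdr, page, sizeof(hdr));
    hdr.lsn = 0;
    hdr.checksum = 0;
    hdr.flags = 0;
    memcpy(page, &hdr, sizeof(hdr));

    /* The hole carries no data; zero it only if the offsets look sane. */
    if (hdr.lower >= sizeof(hdr) &&
        hdr.lower <= hdr.upper &&
        hdr.upper <= PAGE_SIZE)
        memset(page + hdr.lower, 0, (size_t) (hdr.upper - hdr.lower));
}

/* Returns 1 if the two pages are identical after masking, else 0. */
static int
fpw_matches_page(const unsigned char *fpw, const unsigned char *current)
{
    unsigned char a[PAGE_SIZE];
    unsigned char b[PAGE_SIZE];

    assert(fpw != NULL && current != NULL);

    /* Work on copies so neither input page is modified. */
    memcpy(a, fpw, PAGE_SIZE);
    memcpy(b, current, PAGE_SIZE);
    mask_page(a);
    mask_page(b);

    return memcmp(a, b, PAGE_SIZE) == 0;
}
```

In a real redo path the check would presumably run just before the
full-page image overwrites the buffer, and a mismatch would be reported
only once the database has reached consistency, as suggested earlier in
the thread.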

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
