Re: Spreading full-page writes - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Spreading full-page writes
Date
Msg-id CAHGQGwFQ5k_eXMOfkU-XwqdsuenjqrOpS=KQ8GSioM4C6Kmt0w@mail.gmail.com
In response to Re: Spreading full-page writes  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Spreading full-page writes
List pgsql-hackers
On Wed, May 28, 2014 at 1:10 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, May 27, 2014 at 1:19 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Tue, May 27, 2014 at 3:57 PM, Simon Riggs <simon@2ndquadrant.com>
>> wrote:
>> > The requirements we were discussing were around
>> >
>> > A) reducing WAL volume
>> > B) reducing foreground overhead of writing FPWs - which spikes badly
>> > after checkpoint and the overhead is paid by the user processes
>> > themselves
>> > C) need for FPWs during base backup
>> >
>> > So that gives us a few approaches
>> >
>> > * Compressing FPWs gives A
>> > * Background FPWs gives us B
>> >    which look like we can combine both ideas
>> >
>> > * Double-buffering would give us A and B, but not C
>> >    and would be incompatible with other two ideas
>>
>> Double-buffering would allow us to disable FPW safely, but it would make
>> recovery slow.
>
> Is it due to the fact that during recovery, it needs to check the
> contents of double buffer as well as the page in original location
> for consistency or there is something else also which will lead
> to slow recovery?
>
> Won't DBW (double buffer write) reduce the number of pages that
> need to be read from disk compared to FPW, which would
> compensate for the performance degradation due to any other impact?
>
> IIUC, in the DBW mechanism we need to have a temporary sequential
> log file of fixed size which will be used to write data before the data
> gets written to its actual location in the tablespace.  Now, as the temporary
> log file is of fixed size, the number of pages that need to be read
> during recovery should be less compared to FPW, because with FPW
> it needs to read all the pages written in WAL after the last successful
> checkpoint.

Hmm... maybe I'm misunderstanding how WAL replay works in the DBW case.
Imagine that we replay two WAL records for page A, and the page has not
been cached in shared_buffers yet. If FPW is enabled, the first WAL record
contains a full-page image, so it is simply restored into shared_buffers.
The page doesn't need to be read from the disk. Then the second WAL record
is applied.

OTOH, how does this example work in the DBW case? I was thinking that
we first try to apply the first WAL record but find that page A is not
in shared_buffers yet. We read the page from the disk, check whether its
CRC is valid, and read the same page from the double buffer if it's not.
After the page has been read into shared_buffers, the first WAL record
can be applied. Then the second WAL record is applied. Is my
understanding right?
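
To make the comparison concrete, here is a toy Python sketch of the two
replay paths as described above. It is only an illustration of my
understanding, not PostgreSQL code: the function names, the 16-byte page,
the CRC32 trailer, and the dicts standing in for the disk and the double
buffer are all assumptions made up for this example.

```python
import zlib

PAGE_SIZE = 16  # toy size (assumption); real PostgreSQL pages default to 8 kB


def seal(body: bytes) -> bytes:
    """Return the on-disk form of a page: body plus a 4-byte CRC32 trailer."""
    return body + zlib.crc32(body).to_bytes(4, "big")


def is_valid(stored: bytes) -> bool:
    """Check the CRC32 trailer of an on-disk page."""
    return zlib.crc32(stored[:-4]).to_bytes(4, "big") == stored[-4:]


def apply_delta(body: bytes, offset: int, value: int) -> bytes:
    """Stand-in for applying an ordinary WAL record: set one byte."""
    page = bytearray(body)
    page[offset] = value
    return bytes(page)


def replay_fpw(records, disk):
    """records: (page_id, image_or_None, offset, value). Returns (cache, reads)."""
    cache, reads = {}, 0
    for page_id, image, offset, value in records:
        if image is not None:
            cache[page_id] = image  # full-page image restored, no disk read
            continue
        if page_id not in cache:
            cache[page_id] = disk[page_id][:-4]
            reads += 1
        cache[page_id] = apply_delta(cache[page_id], offset, value)
    return cache, reads


def replay_dbw(records, disk, double_buffer):
    """records: (page_id, offset, value). Returns (cache, reads)."""
    cache, reads = {}, 0
    for page_id, offset, value in records:
        if page_id not in cache:
            stored = disk[page_id]
            reads += 1
            if not is_valid(stored):
                # Torn page: fall back to the copy in the double buffer.
                stored = double_buffer[page_id]
                reads += 1
            cache[page_id] = stored[:-4]
        cache[page_id] = apply_delta(cache[page_id], offset, value)
    return cache, reads


# Page A before the checkpoint, and a torn copy of it on disk.
base = bytes(PAGE_SIZE)
torn = bytearray(seal(base))
torn[0] ^= 0xFF  # corrupt one byte so the CRC no longer matches
disk = {"A": bytes(torn)}

# FPW: the first record carries the whole page, the second is a plain delta.
fpw_records = [("A", apply_delta(base, 0, 1), None, None), ("A", None, 1, 2)]
fpw_pages, fpw_reads = replay_fpw(fpw_records, disk)

# DBW: both records are plain deltas; the page must come from storage.
dbw_records = [("A", 0, 1), ("A", 1, 2)]
dbw_pages, dbw_reads = replay_dbw(dbw_records, disk, {"A": seal(base)})

print("FPW reads:", fpw_reads, "DBW reads:", dbw_reads)
print("same result:", fpw_pages == dbw_pages)
```

In this toy model both paths end with the same page contents, but the FPW
path never touches the data file for page A, while the DBW path reads it
once and reads the double buffer as well when the CRC check fails. That
extra read during replay is the cost I had in mind.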

Regards,

-- 
Fujii Masao


