Re: Spreading full-page writes - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: Spreading full-page writes |
Date | |
Msg-id | CAHGQGwFQ5k_eXMOfkU-XwqdsuenjqrOpS=KQ8GSioM4C6Kmt0w@mail.gmail.com |
In response to | Re: Spreading full-page writes (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: Spreading full-page writes |
List | pgsql-hackers |
On Wed, May 28, 2014 at 1:10 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, May 27, 2014 at 1:19 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Tue, May 27, 2014 at 3:57 PM, Simon Riggs <simon@2ndquadrant.com>
>> wrote:
>> > The requirements we were discussing were around
>> >
>> > A) reducing WAL volume
>> > B) reducing foreground overhead of writing FPWs - which spikes badly
>> > after checkpoint and the overhead is paid by the user processes
>> > themselves
>> > C) need for FPWs during base backup
>> >
>> > So that gives us a few approaches
>> >
>> > * Compressing FPWs gives A
>> > * Background FPWs gives us B
>> > which look like we can combine both ideas
>> >
>> > * Double-buffering would give us A and B, but not C
>> > and would be incompatible with other two ideas
>>
>> Double-buffering would allow us to disable FPW safely but which would make
>> a recovery slow.
>
> Is it due to the fact that during recovery, it needs to check the
> contents of double buffer as well as the page in original location
> for consistency or there is something else also which will lead
> to slow recovery?
>
> Won't DBW (double buffer write) reduce the need for number of
> pages that needs to be read from disk as compare to FPW which
> will suffice the performance degradation due to any other impact?
>
> IIUC in DBW mechanism, we need to have a temporary sequential
> log file of fixed size which will be used to write data before the data
> gets written to its actual location in tablespace. Now as the temporary
> log file is of fixed size, the number of pages that needs to be read
> during recovery should be less as compare to FPW because in FPW
> it needs to read all the pages written in WAL log after last successful
> checkpoint.

Hmm... maybe I'm misunderstanding how WAL replay works in DBW case.
Imagine the case where we try to replay two WAL records for the page A and
the page has not been cached in shared_buffers yet.
If FPW is enabled, the first WAL record is an FPW, and it is simply read
into shared_buffers. The page doesn't need to be read from disk. Then the
second WAL record is applied.

OTOH, in the DBW case, how does this example work? I was thinking that we
first try to apply the first WAL record but find that page A doesn't exist
in shared_buffers yet. We read the page from disk, check whether its CRC is
valid, and read the same page from the double buffer if it's invalid. After
the page is read into shared_buffers, the first WAL record can be applied.
Then the second WAL record is applied.

Is my understanding right?

Regards,

--
Fujii Masao
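
The two replay paths described in the message can be sketched as a toy
model. This is only an illustration of the example with page A, not
PostgreSQL code: the Storage class, the record tuples, and the functions
replay_fpw, replay_dbw, and demo are all hypothetical names, pages are plain
byte strings, and "applying a record" is modeled as appending bytes. The
point is just to count simulated disk reads in each case.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Storage:
    """Toy stand-in for the data files plus a double-write buffer (hypothetical)."""
    data_file: dict = field(default_factory=dict)      # page_id -> (bytes, crc)
    double_buffer: dict = field(default_factory=dict)  # page_id -> (bytes, crc)
    reads: int = 0                                     # simulated disk reads

    def read(self, area: dict, page_id: str):
        self.reads += 1
        return area[page_id]


def crc_ok(page) -> bool:
    """Return True if the stored CRC matches the page contents."""
    data, crc = page
    return zlib.crc32(data) == crc


def replay_fpw(storage, shared_buffers, records):
    """Replay records of the form ('fpw' | 'delta', page_id, payload)."""
    for kind, page_id, payload in records:
        if kind == "fpw":
            # The full-page image goes straight into the buffer; no disk read.
            shared_buffers[page_id] = payload
        else:
            if page_id not in shared_buffers:
                page = storage.read(storage.data_file, page_id)
                shared_buffers[page_id] = page[0]
            shared_buffers[page_id] += payload


def replay_dbw(storage, shared_buffers, records):
    """Replay delta-only records, falling back to the double buffer on bad CRC."""
    for _kind, page_id, payload in records:
        if page_id not in shared_buffers:
            page = storage.read(storage.data_file, page_id)
            if not crc_ok(page):
                # Torn page in the data file: take the copy from the double buffer.
                page = storage.read(storage.double_buffer, page_id)
            shared_buffers[page_id] = page[0]
        shared_buffers[page_id] += payload


def demo():
    good = (b"A0", zlib.crc32(b"A0"))
    torn = (b"A?", zlib.crc32(b"A0"))  # contents no longer match the CRC

    # FPW case: the first record carries the whole page image.
    fpw_storage = Storage(data_file={"A": torn})
    fpw_buffers = {}
    replay_fpw(fpw_storage, fpw_buffers,
               [("fpw", "A", b"A0+r1"), ("delta", "A", b"+r2")])

    # DBW case: both records are deltas; the page must come from disk.
    dbw_storage = Storage(data_file={"A": torn}, double_buffer={"A": good})
    dbw_buffers = {}
    replay_dbw(dbw_storage, dbw_buffers,
               [("delta", "A", b"+r1"), ("delta", "A", b"+r2")])

    return fpw_buffers["A"], fpw_storage.reads, dbw_buffers["A"], dbw_storage.reads


if __name__ == "__main__":
    print(demo())
```

In this model the FPW path never touches the data file for page A, while the
DBW path reads it once and reads the double buffer a second time only because
the CRC check fails. Whether a real DBW implementation would behave this way
is exactly the question raised in the message.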