On 05/28/2014 09:41 AM, Simon Riggs wrote:
> On 27 May 2014 13:20, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
>> On 05/27/2014 03:18 PM, Simon Riggs wrote:
>>>
>>> IIRC Koichi had a patch for prefetch during recovery. Heikki, is that
>>> the reason you also discussed changing the WAL record format to allow
>>> us to identify the blocks touched by recovery more easily?
>>
>>
>> Yeah, that was one use case I had in mind for the WAL format changes. See
>> http://www.postgresql.org/message-id/533D6CBF.6080203@vmware.com.
>
> Those proposals suggest some very big changes to the way WAL works.
>
> Prefetch can work easily enough for most records - do we really need
> that much churn?
>
> You mentioned Btree vacuum records, but I'm planning to optimize those
> another way.
>
> Why don't we just have the prefetch code in core and forget the WAL
> format changes?

Well, the prefetching was just one example of why the proposed WAL
format changes are a good idea. The changes will make life easier for
any external (or internal, for that matter) tool that wants to read WAL
records. What finally pushed me to work on this was pg_rewind. For
pg_rewind it's not enough to cover most records; to be correct, you have
to catch every modification to a data page, and that's difficult to
maintain as new WAL record types are added and old ones are modified in
every release.
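
To make the pg_rewind point concrete, here is a rough sketch of the
difference. It is not PostgreSQL code: the record layouts, the record type
names ("heap_insert", "btree_split", "some_new_type") and the helper
functions are all made up for illustration. The first function has to know
every record type to find the touched blocks; the second reads a uniform
block list and keeps working when a new record type appears.

```python
# Illustrative sketch only: hypothetical record layouts and names,
# not PostgreSQL's actual WAL structures.
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BlockRef:
    relfilenode: int
    block_no: int


@dataclass
class LegacyRecord:
    """Per-type format: block information is buried in the payload."""
    rec_type: str
    payload: dict


@dataclass
class UniformRecord:
    """Uniform format: every record lists the blocks it touches up front."""
    rec_type: str
    block_refs: list = field(default_factory=list)
    payload: dict = field(default_factory=dict)


def blocks_touched_legacy(rec):
    """Needs one branch per record type; unknown types cannot be handled."""
    p = rec.payload
    if rec.rec_type == "heap_insert":
        return [BlockRef(p["rel"], p["blk"])]
    if rec.rec_type == "btree_split":
        return [BlockRef(p["rel"], p["left_blk"]),
                BlockRef(p["rel"], p["right_blk"])]
    raise ValueError(f"unknown record type: {rec.rec_type}")


def blocks_touched_uniform(rec):
    """Works for any record type, including ones added in a later release."""
    return list(rec.block_refs)


if __name__ == "__main__":
    new_legacy = LegacyRecord("some_new_type", {"rel": 1, "blk": 7})
    try:
        blocks_touched_legacy(new_legacy)
    except ValueError as err:
        print("legacy reader fails:", err)

    new_uniform = UniformRecord("some_new_type", [BlockRef(1, 7)])
    print("uniform reader:", blocks_touched_uniform(new_uniform))
```

A tool that silently skipped the unknown type instead of raising would miss
a page modification, which is exactly the correctness problem for something
like pg_rewind.
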

Also, the changes make WAL-logging and -replaying code easier to write,
which reduces the potential for bugs.

- Heikki