Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin and sp-gist - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin and sp-gist
Date
Msg-id b7486071-1b2b-11c6-4412-97dd1f42481d@iki.fi
In response to Re: Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin and sp-gist  (David Steele <david@pgmasters.net>)
Responses Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin and sp-gist
Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin and sp-gist
List pgsql-hackers
On 25/03/2019 09:57, David Steele wrote:
> On 2/6/19 2:08 PM, Andrey Lepikhov wrote:
>> The patchset had a problem with all-zero pages that appeared at the
>> index build stage: the generic_log_relation() routine sends all pages
>> into the WAL, so the lsn field of an all-zero page was initialized and
>> the PageIsVerified() routine detected it as a bad page.
>> The solution may be:
>> 1. To improve the index build algorithms and eliminate the possibility
>> of unused pages appearing.
>> 2. To mark each page as 'dirty' right after initialization. In this case
>> we will get an 'empty' page instead of an all-zeroed one.
>> 3. Do not write all-zero pages into the WAL.

Hmm. When do we create all-zero pages during index build? That seems 
pretty surprising.

>> On 04.02.2019 10:04, Michael Paquier wrote:
>>> On Tue, Dec 18, 2018 at 10:41:48AM +0500, Andrey Lepikhov wrote:
>>>> Ok. It is used only for demonstration.
>>>
>>> The latest patch set needs a rebase, so moved to next CF, waiting on
>>> author as this got no reviews.
> 
> The patch no longer applies so marked Waiting on Author.
> 
> Alexander, Heikki, are either of you planning to review the patch in
> this CF?

I had another quick look.

I still think using the "generic xlog AM" for this is the wrong level of
abstraction, and we should use XLOG_FPI records for this directly.
We can extend XLOG_FPI so that it can store multiple pages in a single
record, if it doesn't already handle that.

Another argument against using the generic xlog record is that you're
currently doing two unnecessary memcpy's of every page in the index, in
GenericXLogRegisterBuffer() and GenericXLogFinish(). That's not free.

I guess the generic_log_relation() function can stay where it is, but it 
should use XLogRegisterBuffer() and XLogInsert() directly.
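
To illustrate just the batching idea, here is a standalone toy model in
C. It is not PostgreSQL code and not the patch: TOY_PAGE_SIZE,
TOY_MAX_PAGES_PER_RECORD and both function names are made up for this
sketch, and the per-record limit of 32 is an assumption. It walks a
range of pages, skips all-zero pages (option 3 from the quoted list
above), and counts how many records would be needed if several full
page images are packed into each record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 8192           /* assumed page size for this model */
#define TOY_MAX_PAGES_PER_RECORD 32  /* hypothetical per-record limit */

/* Returns true if every byte of the page is zero (never initialized). */
static bool
toy_page_is_all_zero(const unsigned char *page)
{
    for (size_t i = 0; i < TOY_PAGE_SIZE; i++)
    {
        if (page[i] != 0)
            return false;
    }
    return true;
}

/*
 * Walks npages contiguous pages and returns how many toy "records" would
 * be emitted when up to TOY_MAX_PAGES_PER_RECORD initialized pages are
 * packed into each record.  All-zero pages are skipped and never logged.
 * If logged_pages is not NULL, it receives the number of pages included.
 */
static size_t
toy_count_records(const unsigned char *pages, size_t npages,
                  size_t *logged_pages)
{
    size_t nrecords = 0;    /* records closed so far */
    size_t in_current = 0;  /* pages in the record being filled */
    size_t logged = 0;      /* total pages included in any record */

    assert(pages != NULL || npages == 0);

    for (size_t i = 0; i < npages; i++)
    {
        const unsigned char *page = pages + i * TOY_PAGE_SIZE;

        if (toy_page_is_all_zero(page))
            continue;

        in_current++;
        logged++;

        /* The current record is full: close it and start a new one. */
        if (in_current == TOY_MAX_PAGES_PER_RECORD)
        {
            nrecords++;
            in_current = 0;
        }
    }

    /* Close the final, partially filled record, if there is one. */
    if (in_current > 0)
        nrecords++;

    if (logged_pages != NULL)
        *logged_pages = logged;

    return nrecords;
}
```

In the real thing, each "record" would be built with XLogRegisterBuffer()
on every page in the batch followed by a single XLogInsert(), with no
intermediate copies of the page contents.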

- Heikki