Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist |
Date | |
Msg-id | 090fb3cb-1ca4-e173-ecf7-47d41ebac620@iki.fi Whole thread Raw |
In response to | Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin andsp-gist (Andrey Lepikhov <a.lepikhov@postgrespro.ru>) |
List | pgsql-hackers |
On 02/04/2019 08:58, Andrey Lepikhov wrote: > On 25/03/2019 15:21, Heikki Linnakangas wrote: >> I had another quick look. >> >> I still think using the "generic xlog AM" for this is a wrong level of >> abstraction, and we should use the XLOG_FPI records for this directly. >> We can extend XLOG_FPI so that it can store multiple pages in a single >> record, if it doesn't already handle it. >> >> Another counter-point to using the generic xlog record is that you're >> currently doing unnecessary two memcpy's of all pages in the index, in >> GenericXLogRegisterBuffer() and GenericXLogFinish(). That's not free. >> >> I guess the generic_log_relation() function can stay where it is, but it >> should use XLogRegisterBuffer() and XLogInsert() directly. > > Patch set v.3 uses XLOG_FPI records directly. Thanks! Committed, with some changes: * I moved the log_relation() function to xlog.c, so that it sits beside log_newpage_*() functions. I renamed it to log_newpage_range(), and changed the argument so that the caller provides the beginning and end block ranges. I added a 'page_std' flag, instead of just assuming that all pages use the standard page layout. All of the callers pass page_std=true at the moment, but seems better to be explicit about it. I made those changes because I felt that the function was too narrowly tailored for the current callers. The assumption about standard page layout, and also the fact that it only logged the main fork. It's more flexible now, for any future AMs that might not be exactly like that. It feels like it's at the same level of abstraction now as the other log_newpage_*() functions. Even if we never need the flexibility, I think making the 'page_std' and 'forknum' arguments explicit is good, to draw attention to those details, for anyone calling the function. * I fixed the REDO code. It was trivially broken, it only restored the first page in each FPI WAL record. * Using "fake" unlogged LSNs for GiST index build seemed fishy. I could not convince myself that it was safe in all corner cases. In a recently initdb'd cluster, it's theoretically possible that the fake LSN counter overtakes the real LSN value, and that could lead to strange behavior. For example, how would the buffer manager behave, if there was a dirty page in the buffer cache with an LSN value that's greater than the current WAL flush pointer? I think you'd get "ERROR: xlog flush request %X/%X is not satisfied --- flushed only to %X/%X". I changed that so that we use LSN 1 for all pages during index build. That's smaller than any real or fake LSN. Or actually, the fake LSN counter used to start at 1 - I bumped that up to 1000, so now it's safely smaller. Could've used 0 instead, but there's an assertion in gist scan code that didn't like that. > As a benchmark I use the script (test.sql in attachment) which show WAL > size increment during index build. In the table below you can see the > influence of the patch on WAL growth. > > Results > ======= > AM | master | patch | > GIN | 347 MB | 66 MB | > GiST | 157 MB | 43 MB | > SP-GiST | 119 MB | 38 MB | Nice! - Heikki
pgsql-hackers by date: