Re: HEAD seems to generate larger WAL regarding GIN index - Mailing list pgsql-hackers

From Jesper Krogh
Subject Re: HEAD seems to generate larger WAL regarding GIN index
Date
Msg-id 532B4B93.6060408@krogh.cc
In response to Re: HEAD seems to generate larger WAL regarding GIN index  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: HEAD seems to generate larger WAL regarding GIN index  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 15/03/14 20:27, Heikki Linnakangas wrote:
> That said, I didn't expect the difference to be quite that big when 
> you're appending to the end of the table. When the new entries go to 
> the end of the posting lists, you only need to recompress and WAL-log 
> the last posting list, which is max 256 bytes long. But I guess that's 
> still a lot more WAL than in the old format.
>
> That could be optimized, but I figured we can live with it, thanks to 
> the fastupdate feature. Fastupdate allows amortizing that cost over 
> several insertions. But of course, you explicitly disabled that...

In a concurrent update environment, fastupdate as it is in 9.2 is not 
really useful. It may let you batch up insertions, but you have no 
control over who ends up paying the debt. Doubling the amount of WAL 
from GIN indexing would be pretty tough for us. In 9.2 we generate 
roughly 1TB of WAL per day and keep it for some weeks to be able to do 
PITR. The WAL is mainly due to GIN index updates as new data is added 
and needs to be searchable by users. We do run gzip, which cuts it down 
to 25-30% before we keep it for too long, but doubling this is going to 
be a migration challenge.
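
To put rough numbers on that, here is a minimal sketch of the arithmetic 
in Python. The 21-day retention is a hypothetical stand-in for "some 
weeks", the function name is made up for illustration, 25-30% is read as 
the compressed size relative to the original, and doubling all WAL is an 
upper bound since only the GIN part would grow.

```python
def archived_wal_tb(wal_tb_per_day, retention_days, compressed_fraction):
    """Return the archived WAL volume in TB after compression."""
    return wal_tb_per_day * retention_days * compressed_fraction


# Hypothetical: "some weeks" of PITR retention taken as three weeks.
RETENTION_DAYS = 21

# 1 TB/day is the figure quoted for 9.2; 2 TB/day is the upper bound
# if every byte of WAL were to double.
for label, daily_tb in (("9.2 today", 1.0), ("doubled", 2.0)):
    low = archived_wal_tb(daily_tb, RETENTION_DAYS, 0.25)   # gzip to 25%
    high = archived_wal_tb(daily_tb, RETENTION_DAYS, 0.30)  # gzip to 30%
    print(f"{label}: {low:.2f} to {high:.2f} TB archived")
```

Changing RETENTION_DAYS or the daily volume shows how quickly the 
archive grows when the per-day WAL rate goes up.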

If fastupdate could be made to work in an environment where users are 
searching the index and manually updating it while 4+ backend 
processes update the index concurrently, then it would be a good 
benefit to gain.

The GIN index currently contains 70+ million records with an average 
tsvector of 124 terms.

-- 
Jesper .. trying to add some real-world info.






