On Tue, Mar 5, 2019 at 8:21 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 05/03/2019 02:26, Andrey Borodin wrote:
> >> I also tried your amcheck tool with this. It did not report any
> >> errors.
> >>
> >> Attached is also latest version of the patch itself. It is the
> >> same as your latest patch v19, except for some tiny comment
> >> kibitzing. I'll mark this as Ready for Committer in the commitfest
> >> app, and will try to commit it in the next couple of days.
> >
> > That's cool! I'll work on 2nd step of these patchset to make
> > blockset data structure prettier and less hacky.
>
> Committed the first patch. Thanks for the patch!
Thank you. This is a transformational change; it will allow GiST indexes larger than RAM to be used in some cases where they were simply not feasible to use before. On an HDD, it resulted in a 50-fold improvement in vacuum time, and the machine went from unusably unresponsive to merely sluggish during the vacuum. On an SSD (albeit a very cheap laptop one, exposed from a Windows host to Ubuntu over VirtualBox) it is still a 30-fold improvement, from a far faster baseline. Even on an AWS instance with a "gp2" SSD volume, which normally shows little benefit from sequential reads, I get a 3-fold speedup.
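For anyone who wants to reproduce a rough version of this comparison, here is a minimal sketch. The table and index names are illustrative, not from my actual setup, and the row count should be scaled so that the index comfortably exceeds RAM:

```sql
-- Illustrative benchmark setup only, not the actual harness.
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE gist_bench (val int);

-- Scale the row count until the index is larger than RAM.
INSERT INTO gist_bench
  SELECT g FROM generate_series(1, 100000000) g;

CREATE INDEX gist_bench_gist ON gist_bench USING gist (val);

-- Create dead tuples for vacuum to clean up.
DELETE FROM gist_bench WHERE val % 10 = 0;

\timing on
-- Before the patch, vacuum walked the index in logical (random) order;
-- with the patch it scans the index in physical order, sequentially.
VACUUM gist_bench;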
I also ran this through a lot of crash-recovery testing with my usual testing harness, which simulates torn-page writes, under high concurrency (AWS c4.4xlarge and a1.4xlarge, with 32 concurrent update processes), and did not encounter any problems. I tested both with btree_gist on a scalar int, and with tsvector, each tsvector having 101 tokens.
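For reference, the two index shapes I exercised look roughly like this (schemas are approximations; the crash injection and the 32 concurrent updaters are driven externally by the harness, not shown here):

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Case 1: btree_gist on a scalar int.
CREATE TABLE crash_int (id int);
CREATE INDEX ON crash_int USING gist (id);

-- Case 2: GiST on tsvector, each vector built from 101 distinct tokens.
CREATE TABLE crash_tsv (doc tsvector);
CREATE INDEX ON crash_tsv USING gist (doc);

INSERT INTO crash_tsv
  SELECT to_tsvector('simple',
           (SELECT string_agg('tok' || i || '_' || g, ' ')
              FROM generate_series(1, 101) i))
  FROM generate_series(1, 1000) g;
```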
I did notice that the space freed up in the index by vacuum doesn't seem to get re-used very efficiently, but that is an ancestral problem independent of this change.