Re: Fillfactor for GIN indexes - Mailing list pgsql-hackers
From | Alexander Korotkov |
---|---|
Subject | Re: Fillfactor for GIN indexes |
Date | |
Msg-id | CAPpHfdsbeULKQdpPzSmGMpNj=62Dg48GmbN0d17xmYvQn0mhKg@mail.gmail.com Whole thread Raw |
In response to | Fillfactor for GIN indexes (Michael Paquier <michael.paquier@gmail.com>) |
Responses |
Re: Fillfactor for GIN indexes
|
List | pgsql-hackers |
Hi!
On Fri, Nov 21, 2014 at 8:12 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
------
With best regards,
Alexander Korotkov.
Please find attached a simple patch adding fillfactor as storage parameter for GIN indexes. The default value is the same as the one currently aka 100 to have the pages completely packed when a GIN index is created.
That's not true. Let us discuss it a little bit.
In btree access method index is created by sorting index tuples. Once they are sorted it can write a tree with any fullness of the pages. With completely filled pages you will get flood of index page splits on table updates or inserts. In order to avoid that the default fillfactor is 90. Besides index creation, btree access method tries to follow given fillfactor in the case of inserting ordered data. When rightmost page splits then fillfactor percent of data goes to the left page.
Btree page life cycle is between being freshly branched by split (50% filled) and being full packed (100% filled) and then splitted. Thus we can roughly estimate average fullness of btree page to be 75%. This "equilibrium" state can be achieved by huge amount of inserts over btree index.
Fillfactor was spreaded to other access methods. For instance, GiST. In GiST, tree is always constructed by page splits. So, freshly created GiST is already in its "equilibrium" state. Even more GiST doesn't splits page by halves. Split ration depends on opclass. So, "equilibrium" fullness of particular GiST index is even less than 75%. What does fillfactor do with GiST? It makes GiST pages split on fillfactor fullness on index build. So, GiST is even less fulled. But why? You anyway don't get flood of split on fresh GiST index because it's in "equilibrium" state. That's why I would rather delete fillfactor from GiST until we have some algorithm to construct GiST without page splits.
What is going on with GIN? GIN is, basically, two-level nested btree. So, fillfactor should be applicable here. But GIN is not constructed like btree...
GIN accumulates maintenance_work_mem of data and then inserts it into the tree. So fresh GIN is the result of insertion of maintenance_work_mem buckets. And each bucket is sorted. On small index (or large maintenance_work_mem) it would be the only one bucket. GIN has two levels of btree: entry tree and posting tree. Currently entry tree has on special logic to control fullness during index build. So, with only one bucket entry tree will be 50% filled. With many buckets entry tree will be at "equilibrium" state. In some specific cases (about two buckets) entry tree can be almost fully packed. Insertion into posting tree is always ordered during index build because posting tree is a tree of TIDs. When posting tree page splits it leaves left page fully packed during index build and 75% packed during insertion (because it expects some sequential inserts).
Idea of GIN building algorithm is that entry tree is expected to be relatively small and fit to cache. Insertions into posting trees are orders. Thus this algorithm should be IO efficient and have low overhead. However, with large entry trees this algorithm can perform much worse.
My summary is following:
1) In order to have fully correct support of fillfactor in GIN we need to rewrite GIN build algorithm.
2) Without rewriting GIN build algorithm, not much can be done with entry tree. However, you can implement some heuristics.
3) You definitely need to touch code that selects ratio of split in dataPlaceToPageLeaf (starting with if (!btree->isBuild)).3) GIN data pages are always compressed excepts pg_upgraded indexes from pre-9.4. Take care about it in following code.
if (GinPageIsCompressed(page))
freespace = GinDataLeafPageGetFreeSpace(page);
+ else if (btree->isBuild)
+ freespace = BLCKSZ * (100 - fillfactor) / 100;
With best regards,
Alexander Korotkov.
pgsql-hackers by date: