Re: REINDEX : new parameter to preserve current average leaf density as new implicit FILLFACTOR - Mailing list pgsql-general
| From | John Lumby |
|---|---|
| Subject | Re: REINDEX : new parameter to preserve current average leaf density as new implicit FILLFACTOR |
| Date |  |
| Msg-id | DM6PR06MB5562839D0229D832E1AE4742A3F60@DM6PR06MB5562.namprd06.prod.outlook.com |
| In response to | Re: REINDEX : new parameter to preserve current average leaf density as new implicit FILLFACTOR (Peter Geoghegan <pg@bowt.ie>) |
| Responses | Re: REINDEX : new parameter to preserve current average leaf density as new implicit FILLFACTOR |
| List | pgsql-general |
> From: Peter Geoghegan <pg@bowt.ie>
> Sent: June 26, 2019 12:09 PM
> Subject: Re: REINDEX : new parameter to preserve current average leaf density as new implicit FILLFACTOR
>
> > > I suspect that you might find that the enhancements to B-Tree indexes
> > > that went into Postgres 12 would help with this workload, especially
> > > if you notice that this happens with indexes that have a lot of duplicates
> >
> > I had not noticed that, thanks for pointing it out. Yes, in my workload most of the indexes in question
> > are non-unique and some have very low key cardinality. I will try out the pg-12 beta when I get a chance.
>
> It's easy to show problems with very low cardinality indexes in the
> old code. You'll definitely notice a difference there.

I've run a comparison of pg-12beta2 with an older release, pg-9.4.6, running the same intensive delete-insert workload mentioned earlier in this thread, and would appreciate it if you could comment on the results, especially as to whether they are roughly in line with your expectations. I also have one question about a new-in-pg-11 config parameter.

Briefly, the workload involves repeating a loop in which, on one single table with 8 indexes (2 unique and 6 non-unique), about 4300 records are deleted, and for each of those records a corresponding record is inserted in which one or more of the non-unique key values are modified to values not present in the relevant index at that point. In other words, across all the indexes, 4300 key-TIDs are deleted and then 4300 new key-TIDs are inserted. At the end of each loop there is zero net change in the counts of records and keys, but possibly some increase in the number of pages, which is what the test is interested in, as well as the overall throughput rate. (A sketch of one iteration is shown below.)

For this test, I did not modify the index fill factors, which therefore remained at the default of 90%, in order to make a stab at evaluating not setting an explicit FILLFACTOR. In each case the indexes were either freshly loaded or else reindexed, so as to have the same starting density.
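As a very rough illustration (not the actual application code), one iteration amounts to something like the following. The metadata_value table name is taken from the index names in the results below; the column names and the batch predicate are purely hypothetical, since the real schema is not shown in this thread:

    -- hypothetical sketch of one loop iteration: delete a batch of rows,
    -- then re-insert each one with a non-unique key column changed to a
    -- value not currently present in that index
    BEGIN;

    WITH victims AS (
        DELETE FROM metadata_value          -- table name inferred from the index names
         WHERE batch_id = 42                -- hypothetical predicate selecting ~4300 rows
        RETURNING id, owner_id, field_id, longnumber
    )
    INSERT INTO metadata_value (id, owner_id, field_id, longnumber)
    SELECT v.id, v.owner_id, v.field_id,
           v.longnumber + 1000000           -- shift a non-unique key to an unused value
      FROM victims v;

    COMMIT;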
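For reference, per-index figures like those in the table that follows can be collected with a query along these lines, using the pgstattuple extension (a sketch, not necessarily the exact query used; livetuples/deadtuples come from pg_stat_all_tables, and older releases may only provide the text-argument form, e.g. pgstatindex(ic.relname::text)):

    -- sketch: table/index page counts plus btree leaf density per index
    CREATE EXTENSION IF NOT EXISTS pgstattuple;

    SELECT c.relpages                                  AS tbpages,
           c.reltuples                                 AS tbtuples,
           ic.relname                                  AS ixname,
           i.indisunique                               AS isuniq,
           s.n_live_tup                                AS livetuples,
           s.n_dead_tup                                AS deadtuples,
           (pgstatindex(i.indexrelid)).avg_leaf_density,
           ic.relpages                                 AS ixpages
      FROM pg_index i
      JOIN pg_class c           ON c.oid   = i.indrelid
      JOIN pg_class ic          ON ic.oid  = i.indexrelid
      JOIN pg_stat_all_tables s ON s.relid = i.indrelid
     WHERE i.indrelid = 'metadata_value'::regclass
     ORDER BY ic.relname;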
Here are counts and sizes after 768 iterations:

     tbpages |   tbtuples   |            ixname            | isuniq | livetuples | deadtuples | avg_leaf_density | ixpages
    ---------+--------------+------------------------------+--------+------------+------------+------------------+---------
    pg-9.4.6
       32160 | 2.55548e+06  | metadata_value_boolean       | f      |    2932852 |          0 |            46.39 |   13535
       32160 | 2.55548e+06  | metadata_value_field_id      | f      |    2932852 |          0 |            48.58 |   12916
       32160 | 2.55548e+06  | metadata_value_floatnumber   | f      |    2932852 |          0 |            45.97 |   13658
       32160 | 2.55548e+06  | metadata_value_longnumber    | f      |    2932852 |          0 |            48.26 |   13009
       32160 | 2.55548e+06  | metadata_value_owner_field_u | t      |    2932852 |          0 |            58.69 |   14990
       32160 | 2.55548e+06  | metadata_value_owner_id      | f      |    2932852 |          0 |            53.06 |   11817
       32160 | 2.55548e+06  | metadata_value_pkey          | t      |    2932852 |          0 |            57.83 |   10842
       32160 | 2.55548e+06  | metadata_value_timestamp     | f      |    2932852 |          0 |            45.96 |   13663
    pg-12beta2
       41814 | 2.519268e+06 | metadata_value_boolean       | f      |    2519268 |       6766 |            63.17 |   10768
       41814 | 2.519268e+06 | metadata_value_field_id      | f      |    2519268 |       6766 |            68.7  |   12031
       41814 | 2.519268e+06 | metadata_value_floatnumber   | f      |    2519268 |       6766 |            61.48 |   11225
       41814 | 2.519268e+06 | metadata_value_longnumber    | f      |    2519268 |       6766 |            58.34 |   12397
       41814 | 2.519268e+06 | metadata_value_owner_field_u | t      |    2519268 |       6766 |            61.69 |   14780
       41814 | 2.519268e+06 | metadata_value_owner_id      | f      |    2519268 |       6766 |            48.86 |   12947
       41814 | 2.519268e+06 | metadata_value_pkey          | t      |    2519268 |       6766 |            59.71 |   11076
       41814 | 2.519268e+06 | metadata_value_timestamp     | f      |    2519268 |       6766 |            57.81 |   12295

Overall, pg-12beta2 yielded a 6.7% reduction in the total size (pages) of the indexes, which was most noticeable with the 6 non-unique ones; in fact the primary-key index was larger with pg-12. Would you have expected better than 6.7%? Although a welcome improvement, I think it is not enough to justify no longer setting a lower explicit FILLFACTOR, which then brings me back to thinking there is a case for the subject of this thread, an automatic way to preserve density.

Secondary points:

I did not expect to see the number of table pages grow so much larger for pg-12 than for pg-9.4; the number of table pages was almost identical at the start of each run. However, this was not the focus of the test.

Also, although not shown in those tables, pg-12 was around 4.5 times faster in completing those 768 iterations, an enormous improvement.

And one question: I notice that in some pg-11 release a new config parameter appeared:

    vacuum_cleanup_index_scale_factor
        Specifies the fraction of the total number of heap tuples counted in the
        previous statistics collection that can be inserted without incurring an
        index scan at the VACUUM cleanup stage.

I have not researched this at all, nor did I set it to anything for my pg-12beta2 run, but it sounds as though it could be relevant to this kind of workload. Is that so?

Cheers
John Lumby
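P.S. In the absence of the REINDEX parameter proposed in this thread, the effect can be approximated by hand: read each index's current avg_leaf_density and re-use it as that index's FILLFACTOR before reindexing. A sketch (again assuming the pgstattuple extension; the table name is from the test above, and the rounding/clamping policy is just one plausible choice):

    -- sketch: preserve each index's current leaf density across a REINDEX
    DO $$
    DECLARE
        ix   regclass;
        dens numeric;
    BEGIN
        FOR ix IN SELECT indexrelid::regclass
                    FROM pg_index
                   WHERE indrelid = 'metadata_value'::regclass
        LOOP
            SELECT avg_leaf_density INTO dens FROM pgstatindex(ix);
            -- btree fillfactor must lie between 10 and 100
            dens := greatest(10, least(100, round(dens)));
            EXECUTE format('ALTER INDEX %s SET (fillfactor = %s)', ix, dens);
            EXECUTE format('REINDEX INDEX %s', ix);
        END LOOP;
    END $$;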
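And on vacuum_cleanup_index_scale_factor: besides the postgresql.conf GUC, pg-11 also added it as a per-index storage parameter, which might suit per-index experiments like this one (a sketch; 0.05 is an arbitrary illustrative value):

    -- sketch: per-index override of the btree vacuum-cleanup threshold
    ALTER INDEX metadata_value_pkey
        SET (vacuum_cleanup_index_scale_factor = 0.05);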