Thread: TODO-Item: B-tree fillfactor control
This is a draft patch for index fillfactor control discussed in http://archives.postgresql.org/pgsql-hackers/2006-02/msg00013.php I added the following features: - Add support for btree, hash and gist. - Syntax extension using PCTFREE. - Save settings to catalog. Next REINDEX will use the last value. I'd like to ask index developers to review the patch, especially the method to control fill factor for hash and gist. I'll write documentations if there is no problem in the features. Comments are welcome. [Syntax extension] - CREATE INDEX index ON table (columns) [ PCTFREE percent ]; - REINDEX { INDEX | TABLE | DATABASE | SYSTEM } name [ PCTFREE percent ]; - ALTER INDEX index SET PCTFREE percent; - PRIMARY KEY, UNIQUE constraint CREATE TABLE / ALTER TABLE table ADD PRIMARY KEY [ PCTFREE percent ] - with GUC SET btree_free_percent = 30; CREATE INDEX index ON table (...); SET btree_free_percent = 10; -- revert [Test and Result] # CREATE table test1 (i int); # INSERT INTO test1 SELECT generate_series(1, 100000); # CREATE table test2 (c circle); # INSERT INTO test2 # SELECT circle(point(100 * random(), 100 * random()), random()) # from generate_series(1, 100000); # CREATE INDEX idx1_btree_0 ON test1 USING btree (i) PCTFREE 0; # CREATE INDEX idx1_btree_10 ON test1 USING btree (i) PCTFREE 10; # CREATE INDEX idx1_btree_30 ON test1 USING btree (i) PCTFREE 30; # CREATE INDEX idx1_hash_0 ON test1 USING hash (i) PCTFREE 0; # CREATE INDEX idx1_hash_25 ON test1 USING hash (i) PCTFREE 25; # CREATE INDEX idx1_hash_40 ON test1 USING hash (i) PCTFREE 40; # CREATE INDEX idx2_gist_0 ON test2 USING gist (c) PCTFREE 0; # CREATE INDEX idx2_gist_10 ON test2 USING gist (c) PCTFREE 10; # CREATE INDEX idx2_gist_30 ON test2 USING gist (c) PCTFREE 30; # SELECT relname, relpages from pg_class where relname LIKE 'idx%' ORDER BY relname; relname | relpages ---------------+---------- idx1_btree_0 | 249 idx1_btree_10 | 276 -- 249 / 0.9 = 277 idx1_btree_30 | 357 -- 249 / 0.7 = 356 idx1_hash_0 | 375 idx1_hash_25 | 413 -- Hash is not linear against fill factors. idx1_hash_40 | 453 -- idx2_gist_0 | 882 idx2_gist_10 | 977 -- 882 / 0.9 = 980 idx2_gist_30 | 1273 -- 882 / 0.7 = 1260 (9 rows) --- ITAGAKI Takahiro NTT Cyber Space Laboratories
Attachment
On Mon, 2006-02-06 at 13:27 +0900, ITAGAKI Takahiro wrote: > This is a draft patch for index fillfactor control discussed in > http://archives.postgresql.org/pgsql-hackers/2006-02/msg00013.php > > I added the following features: > - Add support for btree, hash and gist. > - Syntax extension using PCTFREE. > - Save settings to catalog. Next REINDEX will use the last value. > > I'd like to ask index developers to review the patch, especially > the method to control fill factor for hash and gist. > I'll write documentations if there is no problem in the features. > Comments are welcome. Looks pretty complete to me. A useful patch for large databases. Do you have any performance numbers for the extreme settings? It may be worth having different max limits for each of the index types, since they differ so widely in algorithms. Do we have any tests to show whether 3*setting is the right value for b-tree node pages? It sounds about right but I have no evidence either way. I'm surprised that you do not use the parameter to control the RIGHTMOST index block split factor for B-trees, which remains at a constant 67%. The PCTFREE only seems to apply at CREATE INDEX time. "The steady-state load factor for btrees is usually estimated at 70%." but we recognise that estimate as being from the 1980s and not necessarily reflecting all application types for which we now use databases. Can we use the PCTFREE setting to control the RIGHTMOST behaviour? If I manually control the PCTFREE I want it to work like that all of the time, not just some of the time. [i.e. with this patch if I fill an index with 1000 blocks of data using PCTFREE 0 the index will use 1000 blocks. If I COPY another 1000 blocks of data the index would then be 1500 blocks larger, 2500 total. The current cvstip acts thus: if I fill an index with 1000 blocks of data the index will use 1111 blocks. If I COPY another 1000 blocks of data the index would then be 1500 blocks larger, 2611 total. I'd like to be able to have the index use only 2000 blocks when PCTFREE=0 - if I ask for fully packed I want fully packed, please] If we support PCTFREE for compatibility reasons should we not also support the alternative FILLFACTOR syntax also? I see no reason to favour Oracle/DB2 compatability at the expense of SQLServer compatibility. Best Regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes: > If we support PCTFREE for compatibility reasons should we not also > support the alternative FILLFACTOR syntax also? I see no reason to > favour Oracle/DB2 compatability at the expense of SQLServer > compatibility. One nonstandard syntax is more than enough. regards, tom lane
This is a revised patch for index fillfactor control: - Split MAX_PCTFREE into three for each index method. - B-tree indexes use their own settings when rightmost page is split. - Fix a bug that GUC is modified when index building is canceled. - Add some documentations. Simon Riggs <simon@2ndquadrant.com> wrote: > Do you have any performance numbers for the extreme settings? It may be > worth having different max limits for each of the index types, since > they differ so widely in algorithms. Different max limits are done. I worry about whether index works properly on high PCTFREE settings. I found hash has its own sanity checking, but I don't know other indexes have. > I'm surprised that you do not use the parameter to control the RIGHTMOST > index block split factor for B-trees, which remains at a constant 67%. > The PCTFREE only seems to apply at CREATE INDEX time. Thanks for pointing out. I did not inadvertently use fillfactor on the rightmost page. With the revised patch, PCTFREE will be considered in such cases. # CREATE TABLE test (i int); # INSERT INTO test SELECT generate_series(1, 100000); # CREATE INDEX btree ON test USING btree (i) PCTFREE 0; # SELECT relpages from pg_class where relname ='btree'; relpages | 249 # INSERT INTO test SELECT generate_series(100001, 200000); # SELECT relpages from pg_class where relname ='btree'; relpages | 497 <-- +99.6% But default settings will change. Is this ok? | | patched | | now | free=10 | free=0 | -----------------+-----+---------+--------+- leaf (REINDEX) | 10 | 10 | 0 | leaf (RIGHTMOST) | 30 | 10 | 0 | = leaf node (REINDEX) | 30 | 30 | 0 | = 3*leaf > If we support PCTFREE for compatibility reasons should we not also > support the alternative FILLFACTOR syntax also? I see no reason to > favour Oracle/DB2 compatability at the expense of SQLServer > compatibility. There are few synonyms in PostgreSQL, so I think it is better for us to adopt only either one. I like FILLFACTOR personally, but compatibility with Oracle is more important to users around me. --- ITAGAKI Takahiro NTT Cyber Space Laboratories
Attachment
On Fri, 2006-02-10 at 19:12 +0900, ITAGAKI Takahiro wrote: > This is a revised patch for index fillfactor control: > - Split MAX_PCTFREE into three for each index method. > - B-tree indexes use their own settings when rightmost page is split. > - Fix a bug that GUC is modified when index building is canceled. > - Add some documentations. > Simon Riggs <simon@2ndquadrant.com> wrote: > > > Do you have any performance numbers for the extreme settings? It may be > > worth having different max limits for each of the index types, since > > they differ so widely in algorithms. > > Different max limits are done. > I worry about whether index works properly on high PCTFREE settings. I found > hash has its own sanity checking, but I don't know other indexes have. Thanks. > > I'm surprised that you do not use the parameter to control the RIGHTMOST > > index block split factor for B-trees, which remains at a constant 67%. > > The PCTFREE only seems to apply at CREATE INDEX time. > > Thanks for pointing out. I did not inadvertently use fillfactor on > the rightmost page. With the revised patch, PCTFREE will be considered > in such cases. > > # CREATE TABLE test (i int); > # INSERT INTO test SELECT generate_series(1, 100000); > # CREATE INDEX btree ON test USING btree (i) PCTFREE 0; > # SELECT relpages from pg_class where relname ='btree'; > relpages | 249 > # INSERT INTO test SELECT generate_series(100001, 200000); > # SELECT relpages from pg_class where relname ='btree'; > relpages | 497 <-- +99.6% > This is great. > But default settings will change. Is this ok? > > | | patched | > | now | free=10 | free=0 | > -----------------+-----+---------+--------+- > leaf (REINDEX) | 10 | 10 | 0 | > leaf (RIGHTMOST) | 30 | 10 | 0 | = leaf > node (REINDEX) | 30 | 30 | 0 | = 3*leaf I think thats appropriate; lets see what others think. > > If we support PCTFREE for compatibility reasons should we not also > > support the alternative FILLFACTOR syntax also? I see no reason to > > favour Oracle/DB2 compatability at the expense of SQLServer > > compatibility. > > There are few synonyms in PostgreSQL, so I think it is better for us to > adopt only either one. I like FILLFACTOR personally, but compatibility > with Oracle is more important to users around me. OK, no probs. Reading through rest of patch now. Best Regards, Simon Riggs
On Fri, 2006-02-10 at 19:12 +0900, ITAGAKI Takahiro wrote: > Simon Riggs <simon@2ndquadrant.com> wrote: > > I'm surprised that you do not use the parameter to control the RIGHTMOST > > index block split factor for B-trees, which remains at a constant 67%. > > The PCTFREE only seems to apply at CREATE INDEX time. > > Thanks for pointing out. I did not inadvertently use fillfactor on > the rightmost page. With the revised patch, PCTFREE will be considered > in such cases. > > # CREATE TABLE test (i int); > # INSERT INTO test SELECT generate_series(1, 100000); > # CREATE INDEX btree ON test USING btree (i) PCTFREE 0; > # SELECT relpages from pg_class where relname ='btree'; > relpages | 249 > # INSERT INTO test SELECT generate_series(100001, 200000); > # SELECT relpages from pg_class where relname ='btree'; > relpages | 497 <-- +99.6% > This additional functionality looks like it would work for b-trees. I've not looked at this for GIST and hash indexes. The reduction in index size should give useful performance gains on larger, growing tables with increasing keys. We'll test that. Best Regards, Simon Riggs