Thread: TODO-Item: B-tree fillfactor control

TODO-Item: B-tree fillfactor control

From
ITAGAKI Takahiro
Date:
This is a draft patch for index fillfactor control discussed in
http://archives.postgresql.org/pgsql-hackers/2006-02/msg00013.php

I added the following features:
  - Add support for btree, hash and gist.
  - Syntax extension using PCTFREE.
  - Save settings to catalog. Next REINDEX will use the last value.

I'd like to ask index developers to review the patch, especially
the method to control fill factor for hash and gist.
I'll write documentations if there is no problem in the features.
Comments are welcome.


[Syntax extension]
- CREATE INDEX index ON table (columns) [ PCTFREE percent ];
- REINDEX { INDEX | TABLE | DATABASE | SYSTEM } name [ PCTFREE percent ];
- ALTER INDEX index SET PCTFREE percent;
- PRIMARY KEY, UNIQUE constraint
        CREATE TABLE / ALTER TABLE table ADD PRIMARY KEY [ PCTFREE percent ]
- with GUC
        SET btree_free_percent = 30;
        CREATE INDEX index ON table (...);
        SET btree_free_percent = 10; -- revert


[Test and Result]
# CREATE table test1 (i int);
# INSERT INTO test1 SELECT generate_series(1, 100000);
# CREATE table test2 (c circle);
# INSERT INTO test2
#     SELECT circle(point(100 * random(), 100 * random()), random())
#     from generate_series(1, 100000);

# CREATE INDEX idx1_btree_0  ON test1 USING btree (i) PCTFREE 0;
# CREATE INDEX idx1_btree_10 ON test1 USING btree (i) PCTFREE 10;
# CREATE INDEX idx1_btree_30 ON test1 USING btree (i) PCTFREE 30;
# CREATE INDEX idx1_hash_0   ON test1 USING hash  (i) PCTFREE 0;
# CREATE INDEX idx1_hash_25  ON test1 USING hash  (i) PCTFREE 25;
# CREATE INDEX idx1_hash_40  ON test1 USING hash  (i) PCTFREE 40;
# CREATE INDEX idx2_gist_0   ON test2 USING gist  (c) PCTFREE 0;
# CREATE INDEX idx2_gist_10  ON test2 USING gist  (c) PCTFREE 10;
# CREATE INDEX idx2_gist_30  ON test2 USING gist  (c) PCTFREE 30;

# SELECT relname, relpages from pg_class where relname LIKE 'idx%' ORDER BY relname;
    relname    | relpages
---------------+----------
 idx1_btree_0  |      249
 idx1_btree_10 |      276       -- 249 / 0.9 = 277
 idx1_btree_30 |      357       -- 249 / 0.7 = 356
 idx1_hash_0   |      375
 idx1_hash_25  |      413       -- Hash is not linear against fill factors.
 idx1_hash_40  |      453       --
 idx2_gist_0   |      882
 idx2_gist_10  |      977       -- 882 / 0.9 = 980
 idx2_gist_30  |     1273       -- 882 / 0.7 = 1260
(9 rows)

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories


Attachment

Re: TODO-Item: B-tree fillfactor control

From
Simon Riggs
Date:
On Mon, 2006-02-06 at 13:27 +0900, ITAGAKI Takahiro wrote:
> This is a draft patch for index fillfactor control discussed in
> http://archives.postgresql.org/pgsql-hackers/2006-02/msg00013.php
>
> I added the following features:
>   - Add support for btree, hash and gist.
>   - Syntax extension using PCTFREE.
>   - Save settings to catalog. Next REINDEX will use the last value.
>
> I'd like to ask index developers to review the patch, especially
> the method to control fill factor for hash and gist.
> I'll write documentations if there is no problem in the features.
> Comments are welcome.

Looks pretty complete to me. A useful patch for large databases.

Do you have any performance numbers for the extreme settings? It may be
worth having different max limits for each of the index types, since
they differ so widely in algorithms. Do we have any tests to show
whether 3*setting is the right value for b-tree node pages? It sounds
about right but I have no evidence either way.

I'm surprised that you do not use the parameter to control the RIGHTMOST
index block split factor for B-trees, which remains at a constant 67%.
The PCTFREE only seems to apply at CREATE INDEX time.

"The steady-state load factor for btrees is usually estimated at 70%."
but we recognise that estimate as being from the 1980s and not
necessarily reflecting all application types for which we now use
databases.

Can we use the PCTFREE setting to control the RIGHTMOST behaviour? If I
manually control the PCTFREE I want it to work like that all of the
time, not just some of the time.

[i.e. with this patch if I fill an index with 1000 blocks of data using
PCTFREE 0 the index will use 1000 blocks. If I COPY another 1000 blocks
of data the index would then be 1500 blocks larger, 2500 total. The
current cvstip acts thus: if I fill an index with 1000 blocks of data
the index will use 1111 blocks. If I COPY another 1000 blocks of data
the index would then be 1500 blocks larger, 2611 total. I'd like to be
able to have the index use only 2000 blocks when PCTFREE=0 - if I ask
for fully packed I want fully packed, please]

If we support PCTFREE for compatibility reasons should we not also
support the alternative FILLFACTOR syntax also? I see no reason to
favour Oracle/DB2 compatability at the expense of SQLServer
compatibility.

Best Regards, Simon Riggs


Re: TODO-Item: B-tree fillfactor control

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> If we support PCTFREE for compatibility reasons should we not also
> support the alternative FILLFACTOR syntax also? I see no reason to
> favour Oracle/DB2 compatability at the expense of SQLServer
> compatibility.

One nonstandard syntax is more than enough.

            regards, tom lane

Re: TODO-Item: B-tree fillfactor control

From
ITAGAKI Takahiro
Date:
This is a revised patch for index fillfactor control:
  - Split MAX_PCTFREE into three for each index method.
  - B-tree indexes use their own settings when rightmost page is split.
  - Fix a bug that GUC is modified when index building is canceled.
  - Add some documentations.


Simon Riggs <simon@2ndquadrant.com> wrote:

> Do you have any performance numbers for the extreme settings? It may be
> worth having different max limits for each of the index types, since
> they differ so widely in algorithms.

Different max limits are done.
I worry about whether index works properly on high PCTFREE settings. I found
hash has its own sanity checking, but I don't know other indexes have.


> I'm surprised that you do not use the parameter to control the RIGHTMOST
> index block split factor for B-trees, which remains at a constant 67%.
> The PCTFREE only seems to apply at CREATE INDEX time.

Thanks for pointing out. I did not inadvertently use fillfactor on
the rightmost page. With the revised patch, PCTFREE will be considered
in such cases.

# CREATE TABLE test (i int);
# INSERT INTO test SELECT generate_series(1, 100000);
# CREATE INDEX btree ON test USING btree (i) PCTFREE 0;
# SELECT relpages from pg_class where relname ='btree';
relpages | 249
# INSERT INTO test SELECT generate_series(100001, 200000);
# SELECT relpages from pg_class where relname ='btree';
relpages | 497    <-- +99.6%


But default settings will change. Is this ok?

                 |     |      patched     |
                 | now | free=10 | free=0 |
-----------------+-----+---------+--------+-
leaf (REINDEX)   |  10 |      10 |      0 |
leaf (RIGHTMOST) |  30 |      10 |      0 | = leaf
node (REINDEX)   |  30 |      30 |      0 | = 3*leaf


> If we support PCTFREE for compatibility reasons should we not also
> support the alternative FILLFACTOR syntax also? I see no reason to
> favour Oracle/DB2 compatability at the expense of SQLServer
> compatibility.

There are few synonyms in PostgreSQL, so I think it is better for us to
adopt only either one. I like FILLFACTOR personally, but compatibility
with Oracle is more important to users around me.

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories


Attachment

Re: TODO-Item: B-tree fillfactor control

From
Simon Riggs
Date:
On Fri, 2006-02-10 at 19:12 +0900, ITAGAKI Takahiro wrote:
> This is a revised patch for index fillfactor control:
>   - Split MAX_PCTFREE into three for each index method.
>   - B-tree indexes use their own settings when rightmost page is split.
>   - Fix a bug that GUC is modified when index building is canceled.
>   - Add some documentations.

> Simon Riggs <simon@2ndquadrant.com> wrote:
>
> > Do you have any performance numbers for the extreme settings? It may be
> > worth having different max limits for each of the index types, since
> > they differ so widely in algorithms.
>
> Different max limits are done.
> I worry about whether index works properly on high PCTFREE settings. I found
> hash has its own sanity checking, but I don't know other indexes have.

Thanks.

> > I'm surprised that you do not use the parameter to control the RIGHTMOST
> > index block split factor for B-trees, which remains at a constant 67%.
> > The PCTFREE only seems to apply at CREATE INDEX time.
>
> Thanks for pointing out. I did not inadvertently use fillfactor on
> the rightmost page. With the revised patch, PCTFREE will be considered
> in such cases.
>
> # CREATE TABLE test (i int);
> # INSERT INTO test SELECT generate_series(1, 100000);
> # CREATE INDEX btree ON test USING btree (i) PCTFREE 0;
> # SELECT relpages from pg_class where relname ='btree';
> relpages | 249
> # INSERT INTO test SELECT generate_series(100001, 200000);
> # SELECT relpages from pg_class where relname ='btree';
> relpages | 497    <-- +99.6%
>

This is great.

> But default settings will change. Is this ok?
>
>                  |     |      patched     |
>                  | now | free=10 | free=0 |
> -----------------+-----+---------+--------+-
> leaf (REINDEX)   |  10 |      10 |      0 |
> leaf (RIGHTMOST) |  30 |      10 |      0 | = leaf
> node (REINDEX)   |  30 |      30 |      0 | = 3*leaf

I think thats appropriate; lets see what others think.

> > If we support PCTFREE for compatibility reasons should we not also
> > support the alternative FILLFACTOR syntax also? I see no reason to
> > favour Oracle/DB2 compatability at the expense of SQLServer
> > compatibility.
>
> There are few synonyms in PostgreSQL, so I think it is better for us to
> adopt only either one. I like FILLFACTOR personally, but compatibility
> with Oracle is more important to users around me.

OK, no probs.

Reading through rest of patch now.

Best Regards, Simon Riggs


Re: TODO-Item: B-tree fillfactor control

From
Simon Riggs
Date:
On Fri, 2006-02-10 at 19:12 +0900, ITAGAKI Takahiro wrote:
> Simon Riggs <simon@2ndquadrant.com> wrote:
> > I'm surprised that you do not use the parameter to control the RIGHTMOST
> > index block split factor for B-trees, which remains at a constant 67%.
> > The PCTFREE only seems to apply at CREATE INDEX time.
>
> Thanks for pointing out. I did not inadvertently use fillfactor on
> the rightmost page. With the revised patch, PCTFREE will be considered
> in such cases.
>
> # CREATE TABLE test (i int);
> # INSERT INTO test SELECT generate_series(1, 100000);
> # CREATE INDEX btree ON test USING btree (i) PCTFREE 0;
> # SELECT relpages from pg_class where relname ='btree';
> relpages | 249
> # INSERT INTO test SELECT generate_series(100001, 200000);
> # SELECT relpages from pg_class where relname ='btree';
> relpages | 497    <-- +99.6%
>

This additional functionality looks like it would work for b-trees. I've
not looked at this for GIST and hash indexes.

The reduction in index size should give useful performance gains on
larger, growing tables with increasing keys. We'll test that.

Best Regards, Simon Riggs