Thread: TODO-Item: B-tree fillfactor control
Hi Hackers, I'm trying the following TODO item: [Indexes] - Add fillfactor to control reserved free space during index creation I have already made an patch and it seemed to work well. ---- $ ./pgbench -i -s 10 # select relpages from pg_class where relname = 'accounts_pkey'; relpages | 2745 ( default PCTFree is 10% ) # set btree_leaf_free_percent = 0; # reindex index accounts_pkey; # select relpages from pg_class where relname = 'accounts_pkey'; relpages | 2475 ( <- about 2745 * 0.9 = 2470.5 ) # set btree_leaf_free_percent = 30; # reindex index accounts_pkey; # select relpages from pg_class where relname = 'accounts_pkey'; relpages | 3537 ( <- about 2745 * 0.9 / 0.7 = 3529.3 ) ---- And now, I need advice on some issues. - Is it appropriate to use GUC variables to control fillfactors? Is it better to extend CREATE INDEX / REINDEX grammar? - Should indexes remember their fillfactors when they are created? The last fillfactors will be used on next reindex. - Is fillfactor useful for hash and gist indexes? I think hash does not need it, but gist might need it. Look forward to your comments. Thanks, --- ITAGAKI Takahiro NTT Cyber Space Laboratories
Attachment
ITAGAKI Takahiro wrote: > Hi Hackers, > > I'm trying the following TODO item: > [Indexes] > - Add fillfactor to control reserved free space during index creation > > I have already made an patch and it seemed to work well. Great. > And now, I need advice on some issues. > > - Is it appropriate to use GUC variables to control fillfactors? > Is it better to extend CREATE INDEX / REINDEX grammar? I think it has to be part of CREATE INDEX and ALTER INDEX. Is there a use for separate node and leaf settings? This patch needs documentation, and if we have separate items, we are going to have to explain when to use node or leaf. > - Should indexes remember their fillfactors when they are created? > The last fillfactors will be used on next reindex. They should remember, for sure, and REINDEX should use it. It think this is similar to the ALTER TABLE ALTER [ COLUMN ] ... SET STATISTICS functionality. It will need to be dumped as well by pg_dump. If you need help with any of this, let me know. > - Is fillfactor useful for hash and gist indexes? > I think hash does not need it, but gist might need it. Not sure. We don't know what type of index a GIST will be so we have no way of knowing. I am thinking we can implement just btree now and the GIST folks can add it later if they want. My guess is that each GIST is going to behave differently for different fill-factors, so if allow it to be set for GIST, GIST developers can pull the value if they want. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: >> - Should indexes remember their fillfactors when they are created? >> The last fillfactors will be used on next reindex. > They should remember, for sure, and REINDEX should use it. It think > this is similar to the ALTER TABLE ALTER [ COLUMN ] ... SET STATISTICS > functionality. It will need to be dumped as well by pg_dump. If you want it to be dumped by pg_dump (which is debatable IMHO) then it MUST NOT be a syntax extension, it has to be driven by a GUC variable, else we have compatibility problems with the dumps. We just went through this with WITH/WITHOUT OIDS. regards, tom lane
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > >> - Should indexes remember their fillfactors when they are created? > >> The last fillfactors will be used on next reindex. > > > They should remember, for sure, and REINDEX should use it. It think > > this is similar to the ALTER TABLE ALTER [ COLUMN ] ... SET STATISTICS > > functionality. It will need to be dumped as well by pg_dump. > > If you want it to be dumped by pg_dump (which is debatable IMHO) then > it MUST NOT be a syntax extension, it has to be driven by a GUC > variable, else we have compatibility problems with the dumps. We just > went through this with WITH/WITHOUT OIDS. OK, so we are going to issue a GUC to set the fill factor in pg_dump, but still have a fillfactor syntax for use by users? That is how we do WITH/WITHOUT OIDS. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
> If you want it to be dumped by pg_dump (which is debatable IMHO) then > it MUST NOT be a syntax extension, it has to be driven by a GUC > variable, else we have compatibility problems with the dumps. We just > went through this with WITH/WITHOUT OIDS. Compatibility problems? CREATE INDEX isn't an SQL standard command is it? Chris
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes: > Compatibility problems? CREATE INDEX isn't an SQL standard command is it? No, but it'll cause unnecessary cross-version compatibility issues for us. regards, tom lane
Tom Lane wrote: > Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes: > > Compatibility problems? CREATE INDEX isn't an SQL standard command is it? > > No, but it'll cause unnecessary cross-version compatibility issues for > us. It is true it isn't SQL standard, but I think our CREATE INDEX syntax matches many vendor's syntax in most cases. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> wrote: > > - Is fillfactor useful for hash and gist indexes? > > I think hash does not need it, but gist might need it. > > Not sure. We don't know what type of index a GIST will be so we have no > way of knowing. I am thinking we can implement just btree now and the > GIST folks can add it later if they want. My guess is that each GIST is > going to behave differently for different fill-factors, so if allow it > to be set for GIST, GIST developers can pull the value if they want. My understanding about hash was wrong. It uses fill factor of 75%, which is hard-coded. On the other hand, GIST has no ability to control fill factor currently. I'm trying to add fill factors to hash and gist, so I'll ask index developers to review a patch in the future. > > - Is it appropriate to use GUC variables to control fillfactors? > > Is it better to extend CREATE INDEX / REINDEX grammar? > > I think it has to be part of CREATE INDEX and ALTER INDEX. SQL standard has no regulation for indexes, so I refered to other databases. - Oracle and DB2 : CREATE INDEX index ON table(...) PCTFREE 30; - MS SQL Server : CREATE INDEX index ON table (...) WITH FILLFACTOR = 70; PCTFREE seems to be common, so I'll extend DDL to use PCTFREE syntax. The following two syntaxes will be able to be used. 1. SET btree_free_percent = 30; CREATE INDEX index ON table (...); SET btree_free_percent = 10; -- revert 2. CREATE INDEX index ON table (...) PCTFREE 30; 1 would be useful for a compatibe pg_dump format, per suggestion from Tom. > Is there a use for separate node and leaf settings? We should use different settings for leaf and node, but it may confuse users. So I'll simplify the setting as follows: node_free_percent = Min(30%, 3 * leaf_free_percent) When leaf_free_percent is 10%, node_free_percent is 30%. They are the same values of the current implementation. --- ITAGAKI Takahiro NTT Cyber Space Laboratories
ITAGAKI Takahiro wrote: > Bruce Momjian <pgman@candle.pha.pa.us> wrote: > > > > - Is fillfactor useful for hash and gist indexes? > > > I think hash does not need it, but gist might need it. > > > > Not sure. We don't know what type of index a GIST will be so we have no > > way of knowing. I am thinking we can implement just btree now and the > > GIST folks can add it later if they want. My guess is that each GIST is > > going to behave differently for different fill-factors, so if allow it > > to be set for GIST, GIST developers can pull the value if they want. > > My understanding about hash was wrong. It uses fill factor of 75%, which is > hard-coded. On the other hand, GIST has no ability to control fill factor > currently. I'm trying to add fill factors to hash and gist, so I'll ask > index developers to review a patch in the future. OK. > > > - Is it appropriate to use GUC variables to control fillfactors? > > > Is it better to extend CREATE INDEX / REINDEX grammar? > > > > I think it has to be part of CREATE INDEX and ALTER INDEX. > > SQL standard has no regulation for indexes, so I refered to other databases. > - Oracle and DB2 : CREATE INDEX index ON table (...) PCTFREE 30; > - MS SQL Server : CREATE INDEX index ON table (...) WITH FILLFACTOR = 70; > > PCTFREE seems to be common, so I'll extend DDL to use PCTFREE syntax. > The following two syntaxes will be able to be used. > 1. SET btree_free_percent = 30; > CREATE INDEX index ON table (...); > SET btree_free_percent = 10; -- revert > 2. CREATE INDEX index ON table (...) PCTFREE 30; > > 1 would be useful for a compatibe pg_dump format, per suggestion from Tom. I personally like FILLFACTOR, but I understand the desire to match Oracle. PCTFREE seems too abreviated for me, but it would match the GUC better, so maybe it is the best. > > Is there a use for separate node and leaf settings? > > We should use different settings for leaf and node, but it may confuse users. > So I'll simplify the setting as follows: > node_free_percent = Min(30%, 3 * leaf_free_percent) > When leaf_free_percent is 10%, node_free_percent is 30%. They are the same > values of the current implementation. Yes, I think that is ideal. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073