Thread: COPY Fillfactor patch
During recent tuning of the TPC-C workload, I produced the following patch to force COPY to leave some space in each data block when it loads data into heap relations. Rationale for this was to improve performance of random UPDATE statements against a table too large to fit in memory. That activity is joint-equal most frequent single statement type. [tpcc.Stock table, for those who know the workload]. Each UPDATE tries to insert a new row version. The blocks are all full, so each UPDATE causes this activity: read block, decide cannot insert, unlock block extend relation by 1 block relock first block, lock new block insert row into new block After the patch, most UPDATEs cause only a single block access/update, excluding the index effects. This patch is essentially the "bottom-half" of what is needed to make FILLFACTOR work for heap relations. There is no syntax/command statements coded at this time...a recoded executable was used to reload the STOCK table before testing with a fixed padding of 1000 bytes, or roughly FILLFACTOR 88. Index relations might also be added, though it was less clear to me at the time I wrote the patch that such an approach would bring performance improvement. The main discussion now is probably where the FILLFACTOR should be specified. Should it be: a) COPY table FROM .... WITH FILLFACTOR 90 which would fill datablocks for *this copy only* at 90% full b) ALTER TABLE table SET FILLFACTOR 90 which would set the table so that all COPY commands used to load data into it would load heap blocks at 90% full Implementing an extension to the COPY command seemed easier than trying to fiddle with the system catalogs to allow this extra factor be recorded for ever, when it is only used in conjunction with a COPY statement anyway. The patch was produced quickly to assist tuning efforts during Scalability & Performance benchmarking of PostgreSQL 8.0 carried out at Unisys Corporation's Mission Viejo engineering laboratory. The development was sponsored by Unisys Corporation and the patch has now been donated to the PostgreSQL community under the standard PostgreSQL/BSD licence. Approval for release of this code has been given in writing to me by the Director, Open Runtime Products, Unisys on April 8, 2005. If anybody would like to complete the work started here, I'd be most grateful - there are some other items I need to work on for 8.1 ahead of this now. Best Regards, Simon Riggs
Attachment
Simon Riggs <simon@2ndquadrant.com> writes: > During recent tuning of the TPC-C workload, I produced the following > patch to force COPY to leave some space in each data block when it loads > data into heap relations. This is an amazingly ugly way to do it. Put the info into the Relation structure instead of cluttering heap_insert's API. (This would mean that *every* heap_insert is affected, not only COPY, which is what you want anyway I would think.) In practice, a real fix for this would add a column to pg_class, in which case you could get it from the Relation for free. regards, tom lane
Simon Riggs wrote: > During recent tuning of the TPC-C workload, I produced the following > patch to force COPY to leave some space in each data block when it loads > data into heap relations. I can't get too excited about incorporating changes designed solely to improve performance for the workload of a specific database benchmark. If the change has merit in some plausible "real world" situations, so be it -- but if not, I don't see the point. > Each UPDATE tries to insert a new row version. The blocks are all full, > so each UPDATE causes this activity: > read block, decide cannot insert, unlock block > extend relation by 1 block > relock first block, lock new block > insert row into new block > After the patch, most UPDATEs cause only a single block access/update, > excluding the index effects. Did you do any benchmarks to measure the performance of the patch? -Neil
On Wed, 2005-04-13 at 00:00 +1000, Neil Conway wrote: > Simon Riggs wrote: > > During recent tuning of the TPC-C workload, I produced the following > > patch to force COPY to leave some space in each data block when it loads > > data into heap relations. > > I can't get too excited about incorporating changes designed solely to > improve performance for the workload of a specific database benchmark. Sometimes we tune for specific workloads, sometimes we need to tune to a generic design pattern that effects many users. Unisys wanted to test and tune a workload that would improve things for the most number of users and I would say I support them in that. Database benchmarks exist for two reasons: - generate some great numbers - they offer a generic workload that stresses PostgreSQL in pseudo-real customer situations, but can be easily re-run, discussed, published and dissected for real insight The first reason helps people to accurately size systems and the second reason helps them save them time and money....but without the first people buy the wrong systems and waste money anyway. IMHO if you care about the second, you should also care about the first. Seriously, if you know a workload that better represents the majority of performance critical database applications then I'll be happy to consider tuning for that at another time. (Seriously). > If the change has merit in some plausible "real world" situations, so be > it -- but if not, I don't see the point. Well, I've used PCTFREE many times with Oracle and DB2. Did it do any good? Well, thats much harder, because you don't get as much chance to tune specific issues like that in the real world, but yes, it makes a difference for *some* real workloads. All? No. Best Regards, Simon Riggs
On Tue, 2005-04-12 at 09:56 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > During recent tuning of the TPC-C workload, I produced the following > > patch to force COPY to leave some space in each data block when it loads > > data into heap relations. > > Put the info into the Relation > structure instead of cluttering heap_insert's API. (This would mean > that *every* heap_insert is affected, not only COPY, which is what you > want anyway I would think.) Well, I intended it to work *only* for copy, not for insert, which is why I did it that way. I can see both sides of that discussion though. > In practice, a real fix for this would add a column to pg_class, in > which case you could get it from the Relation for free. Again, that was intentional when I wrote it. I am inclined to agree with that approach now, since it could be cached fairly easily. Anyway, when I get time, I'll finish off the patch. Unless other readers would like to follow on. Best Regards, Simon Riggs
On T, 2005-04-12 at 18:38 +0100, Simon Riggs wrote: > On Tue, 2005-04-12 at 09:56 -0400, Tom Lane wrote: (Neil, I added you to CC: to show you at least two more places where sparse heaps can be generally useful and not tweaks for single benchmark) > > Simon Riggs <simon@2ndquadrant.com> writes: > > > During recent tuning of the TPC-C workload, I produced the following > > > patch to force COPY to leave some space in each data block when it loads > > > data into heap relations. When I comtemplated a similar idea some time ago, my primary goal was reducing disk head movement during massive updates. At that time it seemed to me cheaper to not leave space in each page, but to leave each Nth page empty, as more new tuples will be punt on the same page and thus cause less WAL writes. Warning: This may be a false assumption - I did not check from code, if this is actually so for any or even a significant number of cases. > > Put the info into the Relation > > structure instead of cluttering heap_insert's API. (This would mean > > that *every* heap_insert is affected, not only COPY, which is what you > > want anyway I would think.) > > Well, I intended it to work *only* for copy, not for insert, which is > why I did it that way. To be more generally useful similar thing should be added to VACUUM FULL and CLUSTER. And perhaps some weird LAZY version of "VACUUM EXPAND" could be written as well, for keeping the holes from filling up in constantly growing databases. Having these holes is also essential, if we want a cheap way to keep data in CLUSTER order after initial CLUSTER command - if we do have free space everywhere in the file, we can just put each new tuple on the first page with free space on or after its preceeding tuple in cluster index. > Anyway, when I get time, I'll finish off the patch. Unless other readers > would like to follow on. I hope you will you will get that time before 8.1 ships :) -- Hannu Krosing <hannu@tm.ee>
My guess is that the FILLFACTOR will actually be a GUC variable rather than a clause to CREATE INDEX or REINDEX or CLUSTER. --------------------------------------------------------------------------- Hannu Krosing wrote: > On T, 2005-04-12 at 18:38 +0100, Simon Riggs wrote: > > On Tue, 2005-04-12 at 09:56 -0400, Tom Lane wrote: > > (Neil, I added you to CC: to show you at least two more places where > sparse heaps can be generally useful and not tweaks for single > benchmark) > > > > Simon Riggs <simon@2ndquadrant.com> writes: > > > > During recent tuning of the TPC-C workload, I produced the following > > > > patch to force COPY to leave some space in each data block when it loads > > > > data into heap relations. > > When I comtemplated a similar idea some time ago, my primary goal was > reducing disk head movement during massive updates. > > At that time it seemed to me cheaper to not leave space in each page, > but to leave each Nth page empty, as more new tuples will be punt on the > same page and thus cause less WAL writes. > > Warning: This may be a false assumption - I did not check from code, if > this is actually so for any or even a significant number of cases. > > > > Put the info into the Relation > > > structure instead of cluttering heap_insert's API. (This would mean > > > that *every* heap_insert is affected, not only COPY, which is what you > > > want anyway I would think.) > > > > Well, I intended it to work *only* for copy, not for insert, which is > > why I did it that way. > > To be more generally useful similar thing should be added to VACUUM FULL > and CLUSTER. > > And perhaps some weird LAZY version of "VACUUM EXPAND" could be written > as well, for keeping the holes from filling up in constantly growing > databases. > > Having these holes is also essential, if we want a cheap way to keep > data in CLUSTER order after initial CLUSTER command - if we do have free > space everywhere in the file, we can just put each new tuple on the > first page with free space on or after its preceeding tuple in cluster > index. > > > Anyway, when I get time, I'll finish off the patch. Unless other readers > > would like to follow on. > > I hope you will you will get that time before 8.1 ships :) > > -- > Hannu Krosing <hannu@tm.ee> > > ---------------------------(end of broadcast)--------------------------- > TIP 7: don't forget to increase your free space map settings > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Wed, 2005-04-20 at 02:08 +0300, Hannu Krosing wrote: > > Anyway, when I get time, I'll finish off the patch. Unless other readers > > would like to follow on. > > I hope you will you will get that time before 8.1 ships :) It's very unlikely, sorry. I'm working towards improving Business Intelligence workloads now and this patch isn't in that direction. BI likes full blocks (and no spaces). The sparse heap idea would use resources better than a smooth spread and is probably just as easy to code. The main problem is the FSM is only currently filled at VACUUM time, so putting spaces in tables wouldn't help much until after the first vacuum of a table. Best Regards, Simon Riggs
On Wed, 2005-04-20 at 09:03 -0400, Bruce Momjian wrote: > My guess is that the FILLFACTOR will actually be a GUC variable rather > than a clause to CREATE INDEX or REINDEX or CLUSTER. I hope not, but it would be easier to set like that and a lot easier to code up that way. Trouble is, if you need it, it needs to be varied according to what is happening to a particular relation. Otherwise you'd gain benefit in one area and lose it in another. Best Regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes: > On Wed, 2005-04-20 at 09:03 -0400, Bruce Momjian wrote: >> My guess is that the FILLFACTOR will actually be a GUC variable rather >> than a clause to CREATE INDEX or REINDEX or CLUSTER. > I hope not, but it would be easier to set like that and a lot easier to > code up that way. I don't like that either. It seems highly unlikely that every table in a database would need the same fillfactor. It should be a per-table attribute, maybe with a DB-wide default, much like we handle per-column statistics targets. My thought about this is that the case where extra free space really helps is UPDATE: if heap_update can put the new tuple version on the same page where the old one is, there's a noticeable savings. So what you'd probably want is that heap_insert respects fillfactor (ie, won't eat the last fillfactor percent of a page) but heap_update doesn't, if it can thereby put the update on the same page. As long as you vacuum before a particular page is entirely filled by updates of its own tuples, you have a steady-state situation. I don't see a lot of point in Hannu's suggestion about leaving every Nth page free. Once you've had to move off the source page, near or far makes little difference. It might win if we were also trying to teach insert and update to preserve CLUSTER ordering --- but that is an order of magnitude harder than what's been discussed in this thread, and I'm unconvinced it's a good idea anyway ... regards, tom lane
Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > On Wed, 2005-04-20 at 09:03 -0400, Bruce Momjian wrote: > >> My guess is that the FILLFACTOR will actually be a GUC variable rather > >> than a clause to CREATE INDEX or REINDEX or CLUSTER. > > > I hope not, but it would be easier to set like that and a lot easier to > > code up that way. > > I don't like that either. It seems highly unlikely that every table in > a database would need the same fillfactor. It should be a per-table > attribute, maybe with a DB-wide default, much like we handle per-column > statistics targets. That is fine. I just don't want to add the FILLFACTOR clause to several commands. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Wed, 2005-04-20 at 09:42 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > On Wed, 2005-04-20 at 09:03 -0400, Bruce Momjian wrote: > >> My guess is that the FILLFACTOR will actually be a GUC variable rather > >> than a clause to CREATE INDEX or REINDEX or CLUSTER. > > > I hope not, but it would be easier to set like that and a lot easier to > > code up that way. > > I don't like that either. It seems highly unlikely that every table in > a database would need the same fillfactor. It should be a per-table > attribute, maybe with a DB-wide default, much like we handle per-column > statistics targets. > > My thought about this is that the case where extra free space really > helps is UPDATE: if heap_update can put the new tuple version on the > same page where the old one is, there's a noticeable savings. So what > you'd probably want is that heap_insert respects fillfactor (ie, won't > eat the last fillfactor percent of a page) but heap_update doesn't, > if it can thereby put the update on the same page. As long as you > vacuum before a particular page is entirely filled by updates of its > own tuples, you have a steady-state situation. Agreed. Thats the best approach. Best Regards, Simon Riggs