Thread: COPY Fillfactor patch

COPY Fillfactor patch

From
Simon Riggs
Date:
During recent tuning of the TPC-C workload, I produced the following
patch to force COPY to leave some space in each data block when it loads
data into heap relations.

Rationale for this was to improve performance of random UPDATE
statements against a table too large to fit in memory. That activity is
joint-equal most frequent single statement type. [tpcc.Stock table, for
those who know the workload].

Each UPDATE tries to insert a new row version. The blocks are all full,
so each UPDATE causes this activity:
    read block, decide cannot insert, unlock block
    extend relation by 1 block
    relock first block, lock new block
    insert row into new block
After the patch, most UPDATEs cause only a single block access/update,
excluding the index effects.

This patch is essentially the "bottom-half" of what is needed to make
FILLFACTOR work for heap relations. There is no syntax/command
statements coded at this time...a recoded executable was used to reload
the STOCK table before testing with a fixed padding of 1000 bytes, or
roughly FILLFACTOR 88.

Index relations might also be added, though it was less clear to me at
the time I wrote the patch that such an approach would bring performance
improvement.

The main discussion now is probably where the FILLFACTOR should be
specified. Should it be:
a)     COPY table FROM .... WITH FILLFACTOR 90
which would fill datablocks for *this copy only* at 90% full

b)     ALTER TABLE table SET FILLFACTOR 90
which would set the table so that all COPY commands used to load data
into it would load heap blocks at 90% full

Implementing an extension to the COPY command seemed easier than trying
to fiddle with the system catalogs to allow this extra factor be
recorded for ever, when it is only used in conjunction with a COPY
statement anyway.

The patch was produced quickly to assist tuning efforts during
Scalability & Performance benchmarking of PostgreSQL 8.0 carried out at
Unisys Corporation's Mission Viejo engineering laboratory. The
development was sponsored by Unisys Corporation and the patch has now
been donated to the PostgreSQL community under the standard
PostgreSQL/BSD licence. Approval for release of this code has been given
in writing to me by the Director, Open Runtime Products, Unisys on April
8, 2005.

If anybody would like to complete the work started here, I'd be most
grateful - there are some other items I need to work on for 8.1 ahead of
this now.

Best Regards, Simon Riggs

Attachment

Re: COPY Fillfactor patch

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> During recent tuning of the TPC-C workload, I produced the following
> patch to force COPY to leave some space in each data block when it loads
> data into heap relations.

This is an amazingly ugly way to do it.  Put the info into the Relation
structure instead of cluttering heap_insert's API.  (This would mean
that *every* heap_insert is affected, not only COPY, which is what you
want anyway I would think.)

In practice, a real fix for this would add a column to pg_class, in
which case you could get it from the Relation for free.

            regards, tom lane

Re: COPY Fillfactor patch

From
Neil Conway
Date:
Simon Riggs wrote:
> During recent tuning of the TPC-C workload, I produced the following
> patch to force COPY to leave some space in each data block when it loads
> data into heap relations.

I can't get too excited about incorporating changes designed solely to
improve performance for the workload of a specific database benchmark.
If the change has merit in some plausible "real world" situations, so be
it -- but if not, I don't see the point.

> Each UPDATE tries to insert a new row version. The blocks are all full,
> so each UPDATE causes this activity:
>     read block, decide cannot insert, unlock block
>     extend relation by 1 block
>     relock first block, lock new block
>     insert row into new block
> After the patch, most UPDATEs cause only a single block access/update,
> excluding the index effects.

Did you do any benchmarks to measure the performance of the patch?

-Neil

Re: COPY Fillfactor patch

From
Simon Riggs
Date:
On Wed, 2005-04-13 at 00:00 +1000, Neil Conway wrote:
> Simon Riggs wrote:
> > During recent tuning of the TPC-C workload, I produced the following
> > patch to force COPY to leave some space in each data block when it loads
> > data into heap relations.
>
> I can't get too excited about incorporating changes designed solely to
> improve performance for the workload of a specific database benchmark.

Sometimes we tune for specific workloads, sometimes we need to tune to a
generic design pattern that effects many users. Unisys wanted to test
and tune a workload that would improve things for the most number of
users and I would say I support them in that.

Database benchmarks exist for two reasons:
- generate some great numbers
- they offer a generic workload that stresses PostgreSQL in pseudo-real
customer situations, but can be easily re-run, discussed, published and
dissected for real insight

The first reason helps people to accurately size systems and the second
reason helps them save them time and money....but without the first
people buy the wrong systems and waste money anyway. IMHO if you care
about the second, you should also care about the first.

Seriously, if you know a workload that better represents the majority of
performance critical database applications then I'll be happy to
consider tuning for that at another time. (Seriously).

> If the change has merit in some plausible "real world" situations, so be
> it -- but if not, I don't see the point.

Well, I've used PCTFREE many times with Oracle and DB2. Did it do any
good? Well, thats much harder, because you don't get as much chance to
tune specific issues like that in the real world, but yes, it makes a
difference for *some* real workloads. All? No.

Best Regards, Simon Riggs


Re: COPY Fillfactor patch

From
Simon Riggs
Date:
On Tue, 2005-04-12 at 09:56 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > During recent tuning of the TPC-C workload, I produced the following
> > patch to force COPY to leave some space in each data block when it loads
> > data into heap relations.
>

> Put the info into the Relation
> structure instead of cluttering heap_insert's API.  (This would mean
> that *every* heap_insert is affected, not only COPY, which is what you
> want anyway I would think.)

Well, I intended it to work *only* for copy, not for insert, which is
why I did it that way.

I can see both sides of that discussion though.

> In practice, a real fix for this would add a column to pg_class, in
> which case you could get it from the Relation for free.

Again, that was intentional when I wrote it. I am inclined to agree with
that approach now, since it could be cached fairly easily.

Anyway, when I get time, I'll finish off the patch. Unless other readers
would like to follow on.

Best Regards, Simon Riggs


Re: COPY Fillfactor patch

From
Hannu Krosing
Date:
On T, 2005-04-12 at 18:38 +0100, Simon Riggs wrote:
> On Tue, 2005-04-12 at 09:56 -0400, Tom Lane wrote:

(Neil, I added you to CC: to show you at least two more places where
sparse heaps can be generally useful and not tweaks for single
benchmark)

> > Simon Riggs <simon@2ndquadrant.com> writes:
> > > During recent tuning of the TPC-C workload, I produced the following
> > > patch to force COPY to leave some space in each data block when it loads
> > > data into heap relations.

When I comtemplated a similar idea some time ago, my primary goal was
reducing disk head movement during massive updates.

At that time it seemed to me cheaper to not leave space in each page,
but to leave each Nth page empty, as more new tuples will be punt on the
same page and thus cause less WAL writes.

Warning: This may be a false assumption - I did not check from code, if
this is actually so for any or even a significant number of cases.

> > Put the info into the Relation
> > structure instead of cluttering heap_insert's API.  (This would mean
> > that *every* heap_insert is affected, not only COPY, which is what you
> > want anyway I would think.)
>
> Well, I intended it to work *only* for copy, not for insert, which is
> why I did it that way.

To be more generally useful similar thing should be added to VACUUM FULL
and CLUSTER.

And perhaps some weird LAZY version of "VACUUM EXPAND" could be written
as well, for keeping the holes from filling up in constantly growing
databases.

Having these holes is also essential, if we want a cheap way to keep
data in CLUSTER order after initial CLUSTER command - if we do have free
space everywhere in the file, we can just put each new tuple on the
first page with free space on or after its preceeding tuple in cluster
index.

> Anyway, when I get time, I'll finish off the patch. Unless other readers
> would like to follow on.

I hope you will you will get that time before 8.1 ships :)

--
Hannu Krosing <hannu@tm.ee>

Re: COPY Fillfactor patch

From
Bruce Momjian
Date:
My guess is that the FILLFACTOR will actually be a GUC variable rather
than a clause to CREATE INDEX or REINDEX or CLUSTER.

---------------------------------------------------------------------------

Hannu Krosing wrote:
> On T, 2005-04-12 at 18:38 +0100, Simon Riggs wrote:
> > On Tue, 2005-04-12 at 09:56 -0400, Tom Lane wrote:
>
> (Neil, I added you to CC: to show you at least two more places where
> sparse heaps can be generally useful and not tweaks for single
> benchmark)
>
> > > Simon Riggs <simon@2ndquadrant.com> writes:
> > > > During recent tuning of the TPC-C workload, I produced the following
> > > > patch to force COPY to leave some space in each data block when it loads
> > > > data into heap relations.
>
> When I comtemplated a similar idea some time ago, my primary goal was
> reducing disk head movement during massive updates.
>
> At that time it seemed to me cheaper to not leave space in each page,
> but to leave each Nth page empty, as more new tuples will be punt on the
> same page and thus cause less WAL writes.
>
> Warning: This may be a false assumption - I did not check from code, if
> this is actually so for any or even a significant number of cases.
>
> > > Put the info into the Relation
> > > structure instead of cluttering heap_insert's API.  (This would mean
> > > that *every* heap_insert is affected, not only COPY, which is what you
> > > want anyway I would think.)
> >
> > Well, I intended it to work *only* for copy, not for insert, which is
> > why I did it that way.
>
> To be more generally useful similar thing should be added to VACUUM FULL
> and CLUSTER.
>
> And perhaps some weird LAZY version of "VACUUM EXPAND" could be written
> as well, for keeping the holes from filling up in constantly growing
> databases.
>
> Having these holes is also essential, if we want a cheap way to keep
> data in CLUSTER order after initial CLUSTER command - if we do have free
> space everywhere in the file, we can just put each new tuple on the
> first page with free space on or after its preceeding tuple in cluster
> index.
>
> > Anyway, when I get time, I'll finish off the patch. Unless other readers
> > would like to follow on.
>
> I hope you will you will get that time before 8.1 ships :)
>
> --
> Hannu Krosing <hannu@tm.ee>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: COPY Fillfactor patch

From
Simon Riggs
Date:
On Wed, 2005-04-20 at 02:08 +0300, Hannu Krosing wrote:
> > Anyway, when I get time, I'll finish off the patch. Unless other readers
> > would like to follow on.
>
> I hope you will you will get that time before 8.1 ships :)

It's very unlikely, sorry.

I'm working towards improving Business Intelligence workloads now and
this patch isn't in that direction. BI likes full blocks (and no
spaces).

The sparse heap idea would use resources better than a smooth spread and
is probably just as easy to code. The main problem is the FSM is only
currently filled at VACUUM time, so putting spaces in tables wouldn't
help much until after the first vacuum of a table.

Best Regards, Simon Riggs


Re: COPY Fillfactor patch

From
Simon Riggs
Date:
On Wed, 2005-04-20 at 09:03 -0400, Bruce Momjian wrote:
> My guess is that the FILLFACTOR will actually be a GUC variable rather
> than a clause to CREATE INDEX or REINDEX or CLUSTER.

I hope not, but it would be easier to set like that and a lot easier to
code up that way.

Trouble is, if you need it, it needs to be varied according to what is
happening to a particular relation. Otherwise you'd gain benefit in one
area and lose it in another.

Best Regards, Simon Riggs




Re: COPY Fillfactor patch

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> On Wed, 2005-04-20 at 09:03 -0400, Bruce Momjian wrote:
>> My guess is that the FILLFACTOR will actually be a GUC variable rather
>> than a clause to CREATE INDEX or REINDEX or CLUSTER.

> I hope not, but it would be easier to set like that and a lot easier to
> code up that way.

I don't like that either.  It seems highly unlikely that every table in
a database would need the same fillfactor.  It should be a per-table
attribute, maybe with a DB-wide default, much like we handle per-column
statistics targets.

My thought about this is that the case where extra free space really
helps is UPDATE: if heap_update can put the new tuple version on the
same page where the old one is, there's a noticeable savings.  So what
you'd probably want is that heap_insert respects fillfactor (ie, won't
eat the last fillfactor percent of a page) but heap_update doesn't,
if it can thereby put the update on the same page.  As long as you
vacuum before a particular page is entirely filled by updates of its
own tuples, you have a steady-state situation.

I don't see a lot of point in Hannu's suggestion about leaving every Nth
page free.  Once you've had to move off the source page, near or far
makes little difference.  It might win if we were also trying to teach
insert and update to preserve CLUSTER ordering --- but that is an order
of magnitude harder than what's been discussed in this thread, and I'm
unconvinced it's a good idea anyway ...

            regards, tom lane

Re: COPY Fillfactor patch

From
Bruce Momjian
Date:
Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Wed, 2005-04-20 at 09:03 -0400, Bruce Momjian wrote:
> >> My guess is that the FILLFACTOR will actually be a GUC variable rather
> >> than a clause to CREATE INDEX or REINDEX or CLUSTER.
>
> > I hope not, but it would be easier to set like that and a lot easier to
> > code up that way.
>
> I don't like that either.  It seems highly unlikely that every table in
> a database would need the same fillfactor.  It should be a per-table
> attribute, maybe with a DB-wide default, much like we handle per-column
> statistics targets.

That is fine.  I just don't want to add the FILLFACTOR clause to several
commands.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: COPY Fillfactor patch

From
Simon Riggs
Date:
On Wed, 2005-04-20 at 09:42 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Wed, 2005-04-20 at 09:03 -0400, Bruce Momjian wrote:
> >> My guess is that the FILLFACTOR will actually be a GUC variable rather
> >> than a clause to CREATE INDEX or REINDEX or CLUSTER.
>
> > I hope not, but it would be easier to set like that and a lot easier to
> > code up that way.
>
> I don't like that either.  It seems highly unlikely that every table in
> a database would need the same fillfactor.  It should be a per-table
> attribute, maybe with a DB-wide default, much like we handle per-column
> statistics targets.
>
> My thought about this is that the case where extra free space really
> helps is UPDATE: if heap_update can put the new tuple version on the
> same page where the old one is, there's a noticeable savings.  So what
> you'd probably want is that heap_insert respects fillfactor (ie, won't
> eat the last fillfactor percent of a page) but heap_update doesn't,
> if it can thereby put the update on the same page.  As long as you
> vacuum before a particular page is entirely filled by updates of its
> own tuples, you have a steady-state situation.

Agreed. Thats the best approach.

Best Regards, Simon Riggs