COPY Fillfactor patch - Mailing list pgsql-patches

From Simon Riggs
Subject COPY Fillfactor patch
Date
Msg-id 1113292843.16721.1334.camel@localhost.localdomain
Whole thread Raw
Responses Re: COPY Fillfactor patch
Re: COPY Fillfactor patch
List pgsql-patches
During recent tuning of the TPC-C workload, I produced the following
patch to force COPY to leave some space in each data block when it loads
data into heap relations.

Rationale for this was to improve performance of random UPDATE
statements against a table too large to fit in memory. That activity is
joint-equal most frequent single statement type. [tpcc.Stock table, for
those who know the workload].

Each UPDATE tries to insert a new row version. The blocks are all full,
so each UPDATE causes this activity:
    read block, decide cannot insert, unlock block
    extend relation by 1 block
    relock first block, lock new block
    insert row into new block
After the patch, most UPDATEs cause only a single block access/update,
excluding the index effects.

This patch is essentially the "bottom-half" of what is needed to make
FILLFACTOR work for heap relations. There is no syntax/command
statements coded at this time...a recoded executable was used to reload
the STOCK table before testing with a fixed padding of 1000 bytes, or
roughly FILLFACTOR 88.

Index relations might also be added, though it was less clear to me at
the time I wrote the patch that such an approach would bring performance
improvement.

The main discussion now is probably where the FILLFACTOR should be
specified. Should it be:
a)     COPY table FROM .... WITH FILLFACTOR 90
which would fill datablocks for *this copy only* at 90% full

b)     ALTER TABLE table SET FILLFACTOR 90
which would set the table so that all COPY commands used to load data
into it would load heap blocks at 90% full

Implementing an extension to the COPY command seemed easier than trying
to fiddle with the system catalogs to allow this extra factor be
recorded for ever, when it is only used in conjunction with a COPY
statement anyway.

The patch was produced quickly to assist tuning efforts during
Scalability & Performance benchmarking of PostgreSQL 8.0 carried out at
Unisys Corporation's Mission Viejo engineering laboratory. The
development was sponsored by Unisys Corporation and the patch has now
been donated to the PostgreSQL community under the standard
PostgreSQL/BSD licence. Approval for release of this code has been given
in writing to me by the Director, Open Runtime Products, Unisys on April
8, 2005.

If anybody would like to complete the work started here, I'd be most
grateful - there are some other items I need to work on for 8.1 ahead of
this now.

Best Regards, Simon Riggs

Attachment

pgsql-patches by date:

Previous
From: Tom Lane
Date:
Subject: Re: Remove unneeded left joins from psql
Next
From: Peter Eisentraut
Date:
Subject: Re: OS Locale UTF-8