On Fri, Aug 12, 2011 at 2:16 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> COPY is slow. Let's make it faster. One obvious optimization is to insert
> heap tuples in bigger chunks, instead of calling heap_insert() separately
> for every tuple. That saves the overhead of pinning and locking the buffer
> for every tuple, and you only need to write one WAL record for all the
> tuples written to the same page, instead of one for each tuple.
>
> Attached is a WIP patch to do that. It adds a new function,
> heap_multi_insert, which does the same thing as heap_insert, but works in
> bulk. It takes an array of tuples as argument, and tries to cram as many of
> them into the chosen targe page as it can, and only writes a single WAL
> record of the operation.
>
> This gives a significant speedup to COPY, particularly for narrow tables,
> with small tuples. Grouping multiple tuples into one WAL record reduces the
> WAL volume significantly, and the time spent in writing that WAL. The
> reduced overhead of repeatedly locking the buffer is also most noticeable on
> narrow tables. On wider tables, the effects are smaller. See
> copytest-results.txt, containing test results with three tables of different
> widths. The scripts used to get those numbers are also attached.
>
> Triggers complicate this. I believe it is only safe to group tuples together
> like this if the table has no triggers. A BEFORE ROW trigger might run a
> SELECT on the table being copied to, and check if some of the tuples we're
> about to insert exist. If we run BEFORE ROW triggers for a bunch of tuples
> first, and only then insert them, none of the trigger invocations will see
> the other rows as inserted yet. Similarly, if we run AFTER ROW triggers
> after inserting a bunch of tuples, the trigger for each of the insertions
> would see all the inserted rows. So at least for now, the patch simply falls
> back to inserting one row at a time if there are any triggers on the table.
But generic RI triggers would be ok, right?
merlin