Re: Using multi-row technique with COPY - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Using multi-row technique with COPY
Msg-id 1133179508.6165.2.camel@dell9300
In response to Re: Using multi-row technique with COPY  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
On Mon, 2005-11-28 at 00:56 +0000, Simon Riggs wrote:
> On Sun, 2005-11-27 at 17:45 -0500, Tom Lane wrote:
> > Simon Riggs <simon@2ndquadrant.com> writes:
> > > COPY FROM can read in sufficient rows until it has a whole block worth
> > > of data, then get a new block and write it all with one pair of
> > > BufferLock calls.
> > 
> > > Comments?
> > 
> > I don't see any way to do this without horrible modularity violations.
> > The COPY code has no business going anywhere near individual buffers;
> > for that matter, it doesn't even really know what "a block worth" of
> > data is, since the tuples it's dealing with aren't toasted yet.
> 
> I've taken on board your comments about modularity issues from earlier.
> [I've not included anything on unique indexes, notice]
> 
> I was expecting to buffer this in the heap access method with a new
> call, say, heap_bulk_insert() rather than have all that code hanging
> around in COPY. A lower level routine RelationGetBufferForTupleArray can
> handle the actual grunt. It can work, without ugliness.
> 
> We'd need to handle a buffer bigger than a single tuple anyway, so you
> keep adding tuples until the last one tips over the edge, which then
> gets saved for the next block. Heap access method knows about blocks.
> 
> We could reasonably do a test for would-be-toasted within those
> routines. I should have said that this wouldn't apply if any of the
> tuples require toasting, which of course has to be a dynamic test.

If we had a big enough buffer (say 10-100x the page size), then we would
not actually need to test for toasting. We could just pass the big buffer
to heap_bulk_insert(), which would insert its contents in chunks as big
as needed to fill the free space on each page (taking a single page lock
per page).
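
As a rough illustration of the chunking idea only, here is a minimal sketch in plain C. It is not PostgreSQL code: SketchPage, sketch_heap_bulk_insert() and the lock counter are hypothetical names invented for this example, pages are modelled purely as a count of free bytes, and toasting, WAL and real buffer locking are not modelled at all. It only shows that each page is locked once and then filled with as many buffered tuples as fit.

```c
#include <stddef.h>

/* Hypothetical stand-in for a heap page: tracks free space only. */
typedef struct
{
    size_t free_bytes;  /* bytes still available on the page */
    int    ntuples;     /* tuples placed on the page so far */
    int    lock_count;  /* how many times the page was "locked" */
} SketchPage;

/*
 * Place the buffered tuples (described only by their sizes) onto the
 * given pages, in order. Each page is visited once: it is "locked",
 * filled with as many of the remaining tuples as fit, then "unlocked".
 *
 * Returns the number of pages that received at least one tuple, or -1
 * if the pages ran out before every tuple was placed (for example
 * because a tuple is larger than any page's free space).
 */
static int
sketch_heap_bulk_insert(const size_t *tuple_sizes, int ntuples,
                        SketchPage *pages, int npages)
{
    int next = 0;   /* index of the next tuple to place */
    int used = 0;   /* pages that received at least one tuple */

    for (int p = 0; p < npages && next < ntuples; p++)
    {
        SketchPage *page = &pages[p];
        int         placed = 0;

        page->lock_count++;     /* stands in for taking the page lock */

        /* Keep adding tuples until the next one tips over the edge. */
        while (next < ntuples && tuple_sizes[next] <= page->free_bytes)
        {
            page->free_bytes -= tuple_sizes[next];
            page->ntuples++;
            next++;
            placed++;
        }

        /* The page lock would be released here. */
        if (placed > 0)
            used++;
    }

    return (next == ntuples) ? used : -1;
}
```

With five 3000-byte tuples and three pages of 8192 free bytes each, this places two tuples on each of the first two pages and one on the third, locking each page exactly once, instead of taking one lock per tuple.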

--------------
Hannu



