On Mon, 2005-11-28 at 00:56 +0000, Simon Riggs wrote:
> On Sun, 2005-11-27 at 17:45 -0500, Tom Lane wrote:
> > Simon Riggs <simon@2ndquadrant.com> writes:
> > > COPY FROM can read in sufficient rows until it has a whole block worth
> > > of data, then get a new block and write it all with one pair of
> > > BufferLock calls.
> >
> > > Comments?
> >
> > I don't see any way to do this without horrible modularity violations.
> > The COPY code has no business going anywhere near individual buffers;
> > for that matter, it doesn't even really know what "a block worth" of
> > data is, since the tuples it's dealing with aren't toasted yet.
>
> I've taken on board your comments about modularity issues from earlier.
> [I've not included anything on unique indexes, notice]
>
> I was expecting to buffer this in the heap access method with a new
> call, say, heap_bulk_insert() rather than have all that code hanging
> around in COPY. A lower level routine RelationGetBufferForTupleArray can
> handle the actual grunt. It can work, without ugliness.
>
> We'd need to handle a buffer bigger than a single tuple anyway, so you
> keep adding tuples until the last one tips over the edge, which then
> gets saved for the next block. Heap access method knows about blocks.
>
> We could reasonably do a test for would-be-toasted within those
> routines. I should have said that this wouldn't apply if any of the
> tuples require toasting, which of course has to be a dynamic test.

If we had a big enough buffer (say 10-100x the page size), we would not
actually need to test for toasting. We could just pass the whole buffer to
heap_bulk_insert(), which would insert it in chunks as large as the free
space on each page allows, taking a single page lock per page.
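
A minimal sketch of that chunking, in standalone C rather than real backend
code. Everything named here is a hypothetical stand-in: SKETCH_PAGE_FREE for
the usable space on an empty page, sketch_bulk_insert() for the proposed
heap_bulk_insert(), and an array of tuple sizes for the big buffer. It only
counts pages and locks; tuples too large for any page (which would need
toasting) are not modelled and are reported as an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical usable space on an empty heap page; the real figure
 * depends on the block size and the page header. */
#define SKETCH_PAGE_FREE 8000

/*
 * Hypothetical sketch of the chunking inside a heap_bulk_insert():
 * pack as many consecutive tuples as fit onto each page, taking one
 * page lock per page instead of one per tuple.
 *
 * Returns the number of pages used, or -1 if a tuple cannot fit on
 * any page. *locks_taken is set to the number of page locks taken.
 */
static int
sketch_bulk_insert(const size_t *tuple_sizes, size_t ntuples,
                   size_t *locks_taken)
{
    size_t i = 0;
    int    pages = 0;

    assert(locks_taken != NULL);
    assert(tuple_sizes != NULL || ntuples == 0);

    *locks_taken = 0;
    while (i < ntuples)
    {
        size_t free_space = SKETCH_PAGE_FREE;

        /* A tuple larger than an empty page would need toasting. */
        if (tuple_sizes[i] > SKETCH_PAGE_FREE)
            return -1;

        (*locks_taken)++;       /* lock the page once ... */
        while (i < ntuples && tuple_sizes[i] <= free_space)
        {
            free_space -= tuple_sizes[i];   /* ... and add every tuple that fits */
            i++;
        }
        /* Unlock here; the tuple that tipped over the edge starts the next page. */
        pages++;
    }
    return pages;
}
```

With five 3000-byte tuples and 8000 bytes free per page, this packs two
tuples per page and takes three page locks, where tuple-at-a-time insertion
would take five.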
--------------
Hannu