Re: BufferAccessStrategy for bulk insert - Mailing list pgsql-hackers

From Robert Haas
Subject Re: BufferAccessStrategy for bulk insert
Date
Msg-id 603c8f070810282045g7fa6bdf9p53f03114d49643d7@mail.gmail.com
Whole thread Raw
In response to Re: BufferAccessStrategy for bulk insert  ("Robert Haas" <robertmhaas@gmail.com>)
Responses Re: BufferAccessStrategy for bulk insert
List pgsql-hackers
And here's the patch, which based on comments thus far does the following:

- Replaces the use_wal, use_fsm arguments in various places with a
single options argument.
- Creates a BAS_BULKWRITE buffer access strategy.
- Creates a BulkInsertState object so that COPY and CTAS can use
BAS_BULKWRITE and also keep the most recent page pinned.

Note that the original purpose of this exercise was to implement the
optimization that COPY and CTAS would keep the most recent page pinned
to avoid repeated pin/unpin cycles.  This change shows a small but
measurable performance improvement on short rows.  The remaining items
were added based on reviewer comments.

One concern that I have about this approach is that the situation in
which people are probably most concerned about COPY performance is
restoring a dump.  In that case, the COPY will be the only thing
running, and using a BufferAccessStrategy is an anti-optimization.  I
don't think it's a very big effect (any testing anyone can do on real
hardware rather than what I have would be appreciated) but I'm sort of
unsold of optimizing for what I believe to be the less-common use
case.  If the consensus is to reverse course on this point I'm happy
to rip those changes back out and resubmit; they are a relatively
small proportion of the patch.

...Robert

On Sun, Oct 26, 2008 at 8:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Seems sane to me.  I don't see the point of the HEAP_INSERT_BULK flag
>> bit --- providing or not providing bistate would cover that, and if
>> you have a bit as well then you have to define what the inconsistent
>> combinations mean.  I concur with making all-zeroes be the typical
>> state of the flag bits, too.
>
> Thanks for the design review.  I had thought to make the inconsistent
> combinations fail an assertion, but I'm just as happy to leave it out
> altogether.
>
>> FWIW, we generally declare bitmask flag variables as int, unless
>> there's some really good reason to do otherwise.
>
> OK, thanks for the tip.
>
> ...Robert
>

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Checking stack depth
Next
From: "Robert Haas"
Date:
Subject: Re: array_agg and array_accum (patch)