Thread: Bulk Insert tuning

Bulk Insert tuning

From
Simon Riggs
Date:
Following patch implements a simple mechanism to keep a buffer pinned
while we are bulk loading.

Performance gains measured as +20% gain for CREATE TABLE as SELECT, and
15-17% for COPY on very short rows. Measurable difference drops away and
is not measurable at all at 1000/bytes per row.

As a result, patch doesn't bother to implement buffer pinning for TOAST
operations.

Can I ask for some independent performance results please?

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

Attachment

Re: Bulk Insert tuning

From
Simon Riggs
Date:
On Tue, 2008-02-26 at 14:43 +0000, Simon Riggs wrote:
> Following patch implements a simple mechanism to keep a buffer pinned
> while we are bulk loading.
>
> Performance gains measured as +20% gain for CREATE TABLE as SELECT, and
> 15-17% for COPY on very short rows. Measurable difference drops away and
> is not measurable at all at 1000/bytes per row.
>
> As a result, patch doesn't bother to implement buffer pinning for TOAST
> operations.

Short guide for reviewers:

backend/access/heap/heapam.c     | 47 +++++++++++++!!!!!!!!!!!!!!!!!!!!!
backend/access/heap/hio.c        | 32 ++-!!!!!!!!!!!!!!!!!!!!!!
backend/access/heap/tuptoaster.c |  2 !
backend/access/transam/xact.c    |  1
backend/commands/copy.c          |  9 +++++!!
backend/executor/execMain.c      |  9 !!!!!!!
backend/storage/buffer/bufmgr.c  | 32 ++++++++++++++++++++++++++
include/access/heapam.h          |  4 +!!
include/access/hio.h             |  2 !
include/storage/bufmgr.h         |  5 ++++

10 files changed, 66 insertions(+), 1 deletion(-), 76 modifications(!)

New heap API calls in heapam.c, caled from copy.c and execMain.c
Modified API to HeapGetBufferForTuple() in hio.c, causes changes
elsewhere in that file and tuptoaster.c
New buffer manager API to manage pinned buffer in bufmgr.c
Transaction cleanup on abort in xact.c

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


Re: Bulk Insert tuning

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> Following patch implements a simple mechanism to keep a buffer pinned
> while we are bulk loading.

This will fail to clean up nicely after a subtransaction abort, no?
(For that matter I don't think it's right even for a top-level abort.)
And I'm pretty sure it will trash your table entirely if someone
inserts into another relation while a bulk insert is happening.
(Not at all impossible, think of triggers for instance.)

From a code structural point of view, we are already well past the
number of distinct options that heap_insert ought to have.  I was
thinking the other day that bulk inserts ought to use a ring-buffer
strategy to avoid having COPY IN trash the whole buffer arena, just
as we've taught COPY OUT not to.  So maybe a better idea is to
generalize BufferAccessStrategy to be able to handle write as well
as read concerns; or have two versions of it, one for writing and one
for reading.  In any case the point being to encapsulate all these
random little options in a struct, which could also carry along
state that needs to be saved across a series of inserts, such as
the last pinned buffer.

            regards, tom lane

Re: Bulk Insert tuning

From
Simon Riggs
Date:
On Tue, 2008-02-26 at 15:12 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > Following patch implements a simple mechanism to keep a buffer pinned
> > while we are bulk loading.
>
> This will fail to clean up nicely after a subtransaction abort, no?

Yes, will fix.

> (For that matter I don't think it's right even for a top-level abort.)
> And I'm pretty sure it will trash your table entirely if someone
> inserts into another relation while a bulk insert is happening.
> (Not at all impossible, think of triggers for instance.)

The pinned buffer is separate from the preferred block for each
relation; BulkInsertBuffer isn't used for determining the block to
insert into. If you try to insert into a block that differs from the
pinned one it unpins it and re-pins the new one. So it is always safe
with respect to the data in the table.

It can run into recursive bulk insert ops but that just destroys the
performance advantage, its not actually dangerous.

> >From a code structural point of view, we are already well past the
> number of distinct options that heap_insert ought to have.  I was
> thinking the other day that bulk inserts ought to use a ring-buffer
> strategy to avoid having COPY IN trash the whole buffer arena, just
> as we've taught COPY OUT not to.  So maybe a better idea is to
> generalize BufferAccessStrategy to be able to handle write as well
> as read concerns; or have two versions of it, one for writing and one
> for reading.  In any case the point being to encapsulate all these
> random little options in a struct, which could also carry along
> state that needs to be saved across a series of inserts, such as
> the last pinned buffer.

That was actually my first thought when I realised recursive ops were
possible. I don't think its necessary from a code correctness
perspective but it might be an appropriate re-factoring considering
those little bool-s seem to be breeding.

I think we need two Strategy types since CTAS would need one of each.
But then VACUUM is mid-way on that. Hmmm. Will consider.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


Re: Bulk Insert tuning

From
Bruce Momjian
Date:
Added to TODO:

        o Consider using a ring buffer for COPY FROM

          http://archives.postgresql.org/pgsql-patches/2008-02/msg00140.php


---------------------------------------------------------------------------

Simon Riggs wrote:
> On Tue, 2008-02-26 at 15:12 -0500, Tom Lane wrote:
> > Simon Riggs <simon@2ndquadrant.com> writes:
> > > Following patch implements a simple mechanism to keep a buffer pinned
> > > while we are bulk loading.
> >
> > This will fail to clean up nicely after a subtransaction abort, no?
>
> Yes, will fix.
>
> > (For that matter I don't think it's right even for a top-level abort.)
> > And I'm pretty sure it will trash your table entirely if someone
> > inserts into another relation while a bulk insert is happening.
> > (Not at all impossible, think of triggers for instance.)
>
> The pinned buffer is separate from the preferred block for each
> relation; BulkInsertBuffer isn't used for determining the block to
> insert into. If you try to insert into a block that differs from the
> pinned one it unpins it and re-pins the new one. So it is always safe
> with respect to the data in the table.
>
> It can run into recursive bulk insert ops but that just destroys the
> performance advantage, its not actually dangerous.
>
> > >From a code structural point of view, we are already well past the
> > number of distinct options that heap_insert ought to have.  I was
> > thinking the other day that bulk inserts ought to use a ring-buffer
> > strategy to avoid having COPY IN trash the whole buffer arena, just
> > as we've taught COPY OUT not to.  So maybe a better idea is to
> > generalize BufferAccessStrategy to be able to handle write as well
> > as read concerns; or have two versions of it, one for writing and one
> > for reading.  In any case the point being to encapsulate all these
> > random little options in a struct, which could also carry along
> > state that needs to be saved across a series of inserts, such as
> > the last pinned buffer.
>
> That was actually my first thought when I realised recursive ops were
> possible. I don't think its necessary from a code correctness
> perspective but it might be an appropriate re-factoring considering
> those little bool-s seem to be breeding.
>
> I think we need two Strategy types since CTAS would need one of each.
> But then VACUUM is mid-way on that. Hmmm. Will consider.
>
> --
>   Simon Riggs
>   2ndQuadrant  http://www.2ndQuadrant.com
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>                http://archives.postgresql.org

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Bulk Insert tuning

From
Gavin Sherry
Date:
On Tue, Feb 26, 2008 at 02:43:51PM +0000, Simon Riggs wrote:
> Following patch implements a simple mechanism to keep a buffer pinned
> while we are bulk loading.
>

CK Tan and I worked on something similar but the problem we discovered
was self locking. Consider a primary key: we insert a tuple into a
buffer and do not release the exclusive lock. The btree code fetches the
buffer and tries to share lock it, but we've already exclusive locked
it. Oops. The performance improvement, though, makes it worth seeing if
there is a solution.

Thanks,

Gavin

Re: Bulk Insert tuning

From
"Heikki Linnakangas"
Date:
Gavin Sherry wrote:
> On Tue, Feb 26, 2008 at 02:43:51PM +0000, Simon Riggs wrote:
>> Following patch implements a simple mechanism to keep a buffer pinned
>> while we are bulk loading.
>
> CK Tan and I worked on something similar but the problem we discovered
> was self locking. Consider a primary key: we insert a tuple into a
> buffer and do not release the exclusive lock. The btree code fetches the
> buffer and tries to share lock it, but we've already exclusive locked
> it. Oops. The performance improvement, though, makes it worth seeing if
> there is a solution.

That's not a problem if you only keep the buffer pinned, not locked. In
my experience, pinning is the more expensive part, with the lookup into
the buffer lookup table. Locking isn't free either, of course, but just
avoiding the pin+unpin would help a lot.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Bulk Insert tuning

From
Simon Riggs
Date:
On Tue, 2008-02-26 at 21:36 +0000, Simon Riggs wrote:
> On Tue, 2008-02-26 at 15:12 -0500, Tom Lane wrote:
> > Simon Riggs <simon@2ndquadrant.com> writes:
> > > Following patch implements a simple mechanism to keep a buffer pinned
> > > while we are bulk loading.
> >
> > This will fail to clean up nicely after a subtransaction abort, no?
>
> Yes, will fix.

Additional line in AbortSubTransaction handles this.

> > (For that matter I don't think it's right even for a top-level abort.)
> > And I'm pretty sure it will trash your table entirely if someone
> > inserts into another relation while a bulk insert is happening.
> > (Not at all impossible, think of triggers for instance.)
>
> The pinned buffer is separate from the preferred block for each
> relation; BulkInsertBuffer isn't used for determining the block to
> insert into. If you try to insert into a block that differs from the
> pinned one it unpins it and re-pins the new one. So it is always safe
> with respect to the data in the table.
>
> It can run into recursive bulk insert ops but that just destroys the
> performance advantage, its not actually dangerous.

I'm about to start refactoring code as suggested, so wanted to drop off
another version to allow everybody to examine the safety/not of this
approach. (So this patch is WIP)

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

  PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk

Attachment

Re: Bulk Insert tuning

From
Simon Riggs
Date:
On Tue, 2008-02-26 at 15:12 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > Following patch implements a simple mechanism to keep a buffer pinned
> > while we are bulk loading.
>
> This will fail to clean up nicely after a subtransaction abort, no?
> (For that matter I don't think it's right even for a top-level abort.)
> And I'm pretty sure it will trash your table entirely if someone
> inserts into another relation while a bulk insert is happening.
> (Not at all impossible, think of triggers for instance.)
>
> >From a code structural point of view, we are already well past the
> number of distinct options that heap_insert ought to have.  I was
> thinking the other day that bulk inserts ought to use a ring-buffer
> strategy to avoid having COPY IN trash the whole buffer arena, just
> as we've taught COPY OUT not to.

Agree with that. That was mentioned here again
http://archives.postgresql.org/pgsql-hackers/2008-02/msg01080.php

What do you think of the "Full Block List" idea?

> So maybe a better idea is to
> generalize BufferAccessStrategy to be able to handle write as well
> as read concerns; or have two versions of it, one for writing and one
> for reading.  In any case the point being to encapsulate all these
> random little options in a struct, which could also carry along
> state that needs to be saved across a series of inserts, such as
> the last pinned buffer.

I'm trying to implement this, but it begins to look quite ugly.

First, if we allow multiple BulkInsertBuffers then we have to remember
them all in a list so we can unpin them all in case of abort. The change
isn't needed for correctness, as explained before.

Second, we need multiple BufferIOStrategy objects for various purposes.
There is no single default strategy, since normal inserts, normal
updates and toast all have different default behaviour. That makes it
ugly because we either need to differentiate between update/insert and
toast/main inserts - which leads to just as many options, or we need to
have lots of statically defined BufferIOStrategy objects for various
purposes.

I did agree that it was sensible to try to refactor the code, but now it
just looks like a big pile of ugly changes for no benefit. I'll save my
WIP so we can judge.

Coding it using flags that can be ORd together makes more sense than a
list of bools and will be easier to read. I'm going to try that approach
instead.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

  PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk