Re: BufferAccessStrategy for bulk insert - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: BufferAccessStrategy for bulk insert
Date
Msg-id 1225450381.3971.517.camel@ebony.2ndQuadrant
Whole thread Raw
In response to Re: BufferAccessStrategy for bulk insert  ("Robert Haas" <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, 2008-10-30 at 22:46 -0400, Robert Haas wrote:
> > You should try profiling the patch. You can count the invocations of the
> > buffer access routines to check its all working in the right ratios.
> 
> *goes and learns how to do profile PostgreSQL*
> 
> OK, that was a good suggestion.  It looks like part of my problem here
> is that I didn't put the CREATE TABLE and the COPY into the same
> transaction.  As a result, a lot of time was spent on XLogInsert.
> Modified the test case, new profiling results attached.

The CPU time in XLogInsert can be confusing. The WAL writes can make
COPY I/O bound and so any savings on CPU may have been masked in the
earlier tests.

Patched profile shows we can still save a further 20% by writing data
block-at-a-time. That's more complex because we'd need to buffer the
index inserts also, or it would optimise only for the no-index (initial
load) case. So I think this is definitely enough for this release.

Using the buffer access strategy is going to be a big win for people
running large data loads in production and it will also help with people
running parallel load tasks (e.g. Dimitri's pg_loader). That effect is
more subtle and harder to measure, but it's an important consideration.

Thanks very much for finishing the patch in time for commitfest.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Distinct types
Next
From: Heikki Linnakangas
Date:
Subject: Re: Synchronous replication patch v1