Thread: Bulk Insert tuning
Following patch implements a simple mechanism to keep a buffer pinned while we are bulk loading. Performance gains measured as +20% gain for CREATE TABLE as SELECT, and 15-17% for COPY on very short rows. Measurable difference drops away and is not measurable at all at 1000/bytes per row. As a result, patch doesn't bother to implement buffer pinning for TOAST operations. Can I ask for some independent performance results please? -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Attachment
On Tue, 2008-02-26 at 14:43 +0000, Simon Riggs wrote: > Following patch implements a simple mechanism to keep a buffer pinned > while we are bulk loading. > > Performance gains measured as +20% gain for CREATE TABLE as SELECT, and > 15-17% for COPY on very short rows. Measurable difference drops away and > is not measurable at all at 1000/bytes per row. > > As a result, patch doesn't bother to implement buffer pinning for TOAST > operations. Short guide for reviewers: backend/access/heap/heapam.c | 47 +++++++++++++!!!!!!!!!!!!!!!!!!!!! backend/access/heap/hio.c | 32 ++-!!!!!!!!!!!!!!!!!!!!!! backend/access/heap/tuptoaster.c | 2 ! backend/access/transam/xact.c | 1 backend/commands/copy.c | 9 +++++!! backend/executor/execMain.c | 9 !!!!!!! backend/storage/buffer/bufmgr.c | 32 ++++++++++++++++++++++++++ include/access/heapam.h | 4 +!! include/access/hio.h | 2 ! include/storage/bufmgr.h | 5 ++++ 10 files changed, 66 insertions(+), 1 deletion(-), 76 modifications(!) New heap API calls in heapam.c, caled from copy.c and execMain.c Modified API to HeapGetBufferForTuple() in hio.c, causes changes elsewhere in that file and tuptoaster.c New buffer manager API to manage pinned buffer in bufmgr.c Transaction cleanup on abort in xact.c -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Simon Riggs <simon@2ndquadrant.com> writes: > Following patch implements a simple mechanism to keep a buffer pinned > while we are bulk loading. This will fail to clean up nicely after a subtransaction abort, no? (For that matter I don't think it's right even for a top-level abort.) And I'm pretty sure it will trash your table entirely if someone inserts into another relation while a bulk insert is happening. (Not at all impossible, think of triggers for instance.) From a code structural point of view, we are already well past the number of distinct options that heap_insert ought to have. I was thinking the other day that bulk inserts ought to use a ring-buffer strategy to avoid having COPY IN trash the whole buffer arena, just as we've taught COPY OUT not to. So maybe a better idea is to generalize BufferAccessStrategy to be able to handle write as well as read concerns; or have two versions of it, one for writing and one for reading. In any case the point being to encapsulate all these random little options in a struct, which could also carry along state that needs to be saved across a series of inserts, such as the last pinned buffer. regards, tom lane
On Tue, 2008-02-26 at 15:12 -0500, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > Following patch implements a simple mechanism to keep a buffer pinned > > while we are bulk loading. > > This will fail to clean up nicely after a subtransaction abort, no? Yes, will fix. > (For that matter I don't think it's right even for a top-level abort.) > And I'm pretty sure it will trash your table entirely if someone > inserts into another relation while a bulk insert is happening. > (Not at all impossible, think of triggers for instance.) The pinned buffer is separate from the preferred block for each relation; BulkInsertBuffer isn't used for determining the block to insert into. If you try to insert into a block that differs from the pinned one it unpins it and re-pins the new one. So it is always safe with respect to the data in the table. It can run into recursive bulk insert ops but that just destroys the performance advantage, its not actually dangerous. > >From a code structural point of view, we are already well past the > number of distinct options that heap_insert ought to have. I was > thinking the other day that bulk inserts ought to use a ring-buffer > strategy to avoid having COPY IN trash the whole buffer arena, just > as we've taught COPY OUT not to. So maybe a better idea is to > generalize BufferAccessStrategy to be able to handle write as well > as read concerns; or have two versions of it, one for writing and one > for reading. In any case the point being to encapsulate all these > random little options in a struct, which could also carry along > state that needs to be saved across a series of inserts, such as > the last pinned buffer. That was actually my first thought when I realised recursive ops were possible. I don't think its necessary from a code correctness perspective but it might be an appropriate re-factoring considering those little bool-s seem to be breeding. I think we need two Strategy types since CTAS would need one of each. But then VACUUM is mid-way on that. Hmmm. Will consider. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Added to TODO: o Consider using a ring buffer for COPY FROM http://archives.postgresql.org/pgsql-patches/2008-02/msg00140.php --------------------------------------------------------------------------- Simon Riggs wrote: > On Tue, 2008-02-26 at 15:12 -0500, Tom Lane wrote: > > Simon Riggs <simon@2ndquadrant.com> writes: > > > Following patch implements a simple mechanism to keep a buffer pinned > > > while we are bulk loading. > > > > This will fail to clean up nicely after a subtransaction abort, no? > > Yes, will fix. > > > (For that matter I don't think it's right even for a top-level abort.) > > And I'm pretty sure it will trash your table entirely if someone > > inserts into another relation while a bulk insert is happening. > > (Not at all impossible, think of triggers for instance.) > > The pinned buffer is separate from the preferred block for each > relation; BulkInsertBuffer isn't used for determining the block to > insert into. If you try to insert into a block that differs from the > pinned one it unpins it and re-pins the new one. So it is always safe > with respect to the data in the table. > > It can run into recursive bulk insert ops but that just destroys the > performance advantage, its not actually dangerous. > > > >From a code structural point of view, we are already well past the > > number of distinct options that heap_insert ought to have. I was > > thinking the other day that bulk inserts ought to use a ring-buffer > > strategy to avoid having COPY IN trash the whole buffer arena, just > > as we've taught COPY OUT not to. So maybe a better idea is to > > generalize BufferAccessStrategy to be able to handle write as well > > as read concerns; or have two versions of it, one for writing and one > > for reading. In any case the point being to encapsulate all these > > random little options in a struct, which could also carry along > > state that needs to be saved across a series of inserts, such as > > the last pinned buffer. > > That was actually my first thought when I realised recursive ops were > possible. I don't think its necessary from a code correctness > perspective but it might be an appropriate re-factoring considering > those little bool-s seem to be breeding. > > I think we need two Strategy types since CTAS would need one of each. > But then VACUUM is mid-way on that. Hmmm. Will consider. > > -- > Simon Riggs > 2ndQuadrant http://www.2ndQuadrant.com > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Tue, Feb 26, 2008 at 02:43:51PM +0000, Simon Riggs wrote: > Following patch implements a simple mechanism to keep a buffer pinned > while we are bulk loading. > CK Tan and I worked on something similar but the problem we discovered was self locking. Consider a primary key: we insert a tuple into a buffer and do not release the exclusive lock. The btree code fetches the buffer and tries to share lock it, but we've already exclusive locked it. Oops. The performance improvement, though, makes it worth seeing if there is a solution. Thanks, Gavin
Gavin Sherry wrote: > On Tue, Feb 26, 2008 at 02:43:51PM +0000, Simon Riggs wrote: >> Following patch implements a simple mechanism to keep a buffer pinned >> while we are bulk loading. > > CK Tan and I worked on something similar but the problem we discovered > was self locking. Consider a primary key: we insert a tuple into a > buffer and do not release the exclusive lock. The btree code fetches the > buffer and tries to share lock it, but we've already exclusive locked > it. Oops. The performance improvement, though, makes it worth seeing if > there is a solution. That's not a problem if you only keep the buffer pinned, not locked. In my experience, pinning is the more expensive part, with the lookup into the buffer lookup table. Locking isn't free either, of course, but just avoiding the pin+unpin would help a lot. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Tue, 2008-02-26 at 21:36 +0000, Simon Riggs wrote: > On Tue, 2008-02-26 at 15:12 -0500, Tom Lane wrote: > > Simon Riggs <simon@2ndquadrant.com> writes: > > > Following patch implements a simple mechanism to keep a buffer pinned > > > while we are bulk loading. > > > > This will fail to clean up nicely after a subtransaction abort, no? > > Yes, will fix. Additional line in AbortSubTransaction handles this. > > (For that matter I don't think it's right even for a top-level abort.) > > And I'm pretty sure it will trash your table entirely if someone > > inserts into another relation while a bulk insert is happening. > > (Not at all impossible, think of triggers for instance.) > > The pinned buffer is separate from the preferred block for each > relation; BulkInsertBuffer isn't used for determining the block to > insert into. If you try to insert into a block that differs from the > pinned one it unpins it and re-pins the new one. So it is always safe > with respect to the data in the table. > > It can run into recursive bulk insert ops but that just destroys the > performance advantage, its not actually dangerous. I'm about to start refactoring code as suggested, so wanted to drop off another version to allow everybody to examine the safety/not of this approach. (So this patch is WIP) -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk
Attachment
On Tue, 2008-02-26 at 15:12 -0500, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > Following patch implements a simple mechanism to keep a buffer pinned > > while we are bulk loading. > > This will fail to clean up nicely after a subtransaction abort, no? > (For that matter I don't think it's right even for a top-level abort.) > And I'm pretty sure it will trash your table entirely if someone > inserts into another relation while a bulk insert is happening. > (Not at all impossible, think of triggers for instance.) > > >From a code structural point of view, we are already well past the > number of distinct options that heap_insert ought to have. I was > thinking the other day that bulk inserts ought to use a ring-buffer > strategy to avoid having COPY IN trash the whole buffer arena, just > as we've taught COPY OUT not to. Agree with that. That was mentioned here again http://archives.postgresql.org/pgsql-hackers/2008-02/msg01080.php What do you think of the "Full Block List" idea? > So maybe a better idea is to > generalize BufferAccessStrategy to be able to handle write as well > as read concerns; or have two versions of it, one for writing and one > for reading. In any case the point being to encapsulate all these > random little options in a struct, which could also carry along > state that needs to be saved across a series of inserts, such as > the last pinned buffer. I'm trying to implement this, but it begins to look quite ugly. First, if we allow multiple BulkInsertBuffers then we have to remember them all in a list so we can unpin them all in case of abort. The change isn't needed for correctness, as explained before. Second, we need multiple BufferIOStrategy objects for various purposes. There is no single default strategy, since normal inserts, normal updates and toast all have different default behaviour. That makes it ugly because we either need to differentiate between update/insert and toast/main inserts - which leads to just as many options, or we need to have lots of statically defined BufferIOStrategy objects for various purposes. I did agree that it was sensible to try to refactor the code, but now it just looks like a big pile of ugly changes for no benefit. I'll save my WIP so we can judge. Coding it using flags that can be ORd together makes more sense than a list of bools and will be easier to read. I'm going to try that approach instead. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk