Re: Bulk Insert tuning - Mailing list pgsql-patches
From | Bruce Momjian |
---|---|
Subject | Re: Bulk Insert tuning |
Date | |
Msg-id | 200803032059.m23Kxe012260@momjian.us |
In response to | Re: Bulk Insert tuning (Simon Riggs <simon@2ndquadrant.com>) |
List | pgsql-patches |
Added to TODO:

o Consider using a ring buffer for COPY FROM

  http://archives.postgresql.org/pgsql-patches/2008-02/msg00140.php

---------------------------------------------------------------------------

Simon Riggs wrote:
> On Tue, 2008-02-26 at 15:12 -0500, Tom Lane wrote:
> > Simon Riggs <simon@2ndquadrant.com> writes:
> > > Following patch implements a simple mechanism to keep a buffer pinned
> > > while we are bulk loading.
> >
> > This will fail to clean up nicely after a subtransaction abort, no?
>
> Yes, will fix.
>
> > (For that matter I don't think it's right even for a top-level abort.)
> > And I'm pretty sure it will trash your table entirely if someone
> > inserts into another relation while a bulk insert is happening.
> > (Not at all impossible, think of triggers for instance.)
>
> The pinned buffer is separate from the preferred block for each
> relation; BulkInsertBuffer isn't used for determining the block to
> insert into. If you try to insert into a block that differs from the
> pinned one it unpins it and re-pins the new one. So it is always safe
> with respect to the data in the table.
>
> It can run into recursive bulk insert ops but that just destroys the
> performance advantage, its not actually dangerous.
>
> > From a code structural point of view, we are already well past the
> > number of distinct options that heap_insert ought to have. I was
> > thinking the other day that bulk inserts ought to use a ring-buffer
> > strategy to avoid having COPY IN trash the whole buffer arena, just
> > as we've taught COPY OUT not to. So maybe a better idea is to
> > generalize BufferAccessStrategy to be able to handle write as well
> > as read concerns; or have two versions of it, one for writing and one
> > for reading. In any case the point being to encapsulate all these
> > random little options in a struct, which could also carry along
> > state that needs to be saved across a series of inserts, such as
> > the last pinned buffer.
>
> That was actually my first thought when I realised recursive ops were
> possible. I don't think its necessary from a code correctness
> perspective but it might be an appropriate re-factoring considering
> those little bool-s seem to be breeding.
>
> I think we need two Strategy types since CTAS would need one of each.
> But then VACUUM is mid-way on that. Hmmm. Will consider.
>
> --
> Simon Riggs
> 2ndQuadrant http://www.2ndQuadrant.com
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org

--
  Bruce Momjian <bruce@momjian.us> http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
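
For readers following the thread, here is a minimal toy sketch in C of the behaviour described above: a small struct carries state across a series of inserts (here, only the last pinned block), and a request for a different block unpins the old one and pins the new one. This is not the patch under discussion and not PostgreSQL code. Every name in it (BulkInsertStateSketch, bulk_state_get_block, and so on) is hypothetical, and plain integers and counters stand in for real buffers and pins.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are PostgreSQL APIs. */
#define INVALID_BLOCK (-1)

/* Hypothetical state carried across a series of inserts. */
typedef struct BulkInsertStateSketch
{
    int  pinned_block;  /* block currently kept pinned, or INVALID_BLOCK */
    long pin_calls;     /* number of times a block had to be pinned */
    long unpin_calls;   /* number of times a pin was released */
} BulkInsertStateSketch;

/* Start with nothing pinned. */
static void
bulk_state_init(BulkInsertStateSketch *s)
{
    s->pinned_block = INVALID_BLOCK;
    s->pin_calls = 0;
    s->unpin_calls = 0;
}

/*
 * Ensure target_block is the pinned block.
 * Returns true if the existing pin was reused, false if a new pin was taken.
 */
static bool
bulk_state_get_block(BulkInsertStateSketch *s, int target_block)
{
    assert(target_block >= 0);

    /* Same block as last time: keep the pin, no extra work. */
    if (s->pinned_block == target_block)
        return true;

    /* Different block: drop the old pin (if any) before taking the new one. */
    if (s->pinned_block != INVALID_BLOCK)
        s->unpin_calls++;

    s->pinned_block = target_block;
    s->pin_calls++;
    return false;
}

/* Release any held pin, e.g. at the end of the load or on abort. */
static void
bulk_state_release(BulkInsertStateSketch *s)
{
    if (s->pinned_block != INVALID_BLOCK)
    {
        s->unpin_calls++;
        s->pinned_block = INVALID_BLOCK;
    }
}
```

In this model, inserts that alternate between two blocks (as interleaved or recursive bulk inserts might) take a fresh pin on every call. They stay correct but lose the benefit of the cached pin, which matches the trade-off described in the thread. The explicit release function marks where cleanup would have to happen on abort, the concern raised in the review.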