Re: Bulk Inserts - Mailing list pgsql-hackers

From Pierre Frédéric Caillaud
Subject Re: Bulk Inserts
Date
Msg-id op.u0aesrf8cke6l8@soyouz
In response to Re: Bulk Inserts  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
> Yes, I did not consider that to be a problem because I did not think it
> would be used on indexed tables.  I figured that the gain from doing bulk
> inserts into the table would be so diluted by the still-bottle-necked  
> index maintenance that it was OK not to use this optimization for  
> indexed tables.

I've tested with indexes, and the index update time is much larger than
the insert time. Bulk inserts still provide a small gain though, and
having a solution that works in all cases is better IMHO.

> My original thought was based on the idea of still using heap_insert, but
> with a modified form of bistate which would hold the exclusive lock and
> not just a pin.  If heap_insert is being driven by the unmodified COPY
> code, then it can't guarantee that COPY won't stall on a pipe read or
> something, and so probably shouldn't hold an exclusive lock while filling
> the block.

Exactly, that's what I was thinking too, and reached the same conclusion.

> That is why I decided a local buffer would be better, as the exclusive
> lock is really a no-op and wouldn't block anyone.  But if you are creating
> a new heap_bulk_insert and modifying the COPY to go with it, then you can
> guarantee it won't stall from the driving end, instead.

I think it's better, but you have to buffer tuples: at least a full
page's worth, or better, several pages' worth, in case inline
compression kicks in and shrinks them, since the purpose is to be able to
fill a complete page in one go.
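
To make the buffering idea concrete, here is a rough sketch in Python (not backend C, and not the actual patch). Everything in it is an assumption for illustration: the names BulkInsertBuffer and flush_page are hypothetical, tuples are treated as opaque byte strings, and the page accounting is simplified to an 8192-byte block with a 24-byte header and a 4-byte line pointer per tuple. It accumulates several pages' worth of tuples, then emits only pages that are completely packed, keeping the trailing partial page buffered until more input arrives.

```python
PAGE_SIZE = 8192      # default PostgreSQL block size
PAGE_HEADER = 24      # simplified page header size (assumption for this sketch)
ITEM_OVERHEAD = 4     # simplified per-tuple line pointer (assumption)


class BulkInsertBuffer:
    """Hypothetical buffer that hands out only fully packed pages."""

    def __init__(self, flush_page, pages_to_buffer=4):
        self.flush_page = flush_page            # callback receiving a list of tuples
        self.threshold = pages_to_buffer * PAGE_SIZE
        self.pending = []
        self.pending_bytes = 0

    def add(self, tup):
        need = len(tup) + ITEM_OVERHEAD
        if PAGE_HEADER + need > PAGE_SIZE:
            # A real backend would move oversized values out of line instead.
            raise ValueError("tuple does not fit in a page")
        self.pending.append(tup)
        self.pending_bytes += need
        if self.pending_bytes >= self.threshold:
            self._emit(final=False)

    def finish(self):
        # End of input: the last page may be only partially filled.
        self._emit(final=True)

    def _emit(self, final):
        while self.pending:
            page, used = [], PAGE_HEADER
            for tup in self.pending:
                need = len(tup) + ITEM_OVERHEAD
                if used + need > PAGE_SIZE:
                    break
                page.append(tup)
                used += need
            if len(page) == len(self.pending) and not final:
                return  # not enough left to fill a page; wait for more tuples
            self.flush_page(page)
            self.pending = self.pending[len(page):]
            self.pending_bytes -= sum(len(t) + ITEM_OVERHEAD for t in page)
```

In this sketch the flush_page callback is where the page would be locked, filled and WAL-logged in one go, so the lock is never held while waiting for input.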

>  Whether any of these approaches will be maintainable enough to be
> integrated into the code base is another matter.  It seems like there is
> already a lot of discussion going on around various permutations of copy
> options.

It's not really a COPY mod, since it would also be good for big INSERT
INTO SELECT FROM, which is WAL-bound too (even more so than COPY, since
there is no parsing to do).

