Re: should INSERT SELECT use a BulkInsertState? - Mailing list pgsql-hackers

From Luc Vlaming
Subject Re: should INSERT SELECT use a BulkInsertState?
Date
Msg-id bd0dc980-c4c9-29c9-8cb6-8d7787ae9fef@swarm64.com
In response to Re: should INSERT SELECT use a BulkInsertState?  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
On 30.10.20 05:51, Justin Pryzby wrote:
> On Thu, Oct 22, 2020 at 01:29:53PM +0100, Simon Riggs wrote:
>> On Fri, 16 Oct 2020 at 22:05, Justin Pryzby <pryzby@telsasoft.com> wrote:
>>
>>>>> I made this conditional on BEGIN BULK/SET bulk, so I'll solicit comments on that.
>>
>> I think it would be better if this was self-tuning. So that we don't
>> allocate a bulkinsert state until we've done say 100 (?) rows
>> inserted.
> 
> I made it an optional, non-default behavior in response to the legitimate
> concern for performance regression for the cases where a loader needs to be as
> fast as possible - as compared with our case, where we want instead to optimize
> for our reports by making the loaders responsible for their own writes, rather
> than leaving behind many dirty pages, and clobbering the cache, too.
> 
> Also, INSERT SELECT doesn't immediately help us (telsasoft), since we use
> INSERT .. VALUES () .. ON CONFLICT.  This would handle that case, which is
> great, even though that wasn't a design goal.  It could also be an integer GUC
> to allow configuring the size of the ring buffer.
> 
>> You should also use table_multi_insert() since that will give further
>> performance gains by reducing block access overheads. Switching from
>> single row to multi-row should also only happen once we've loaded a
>> few rows, so we don't introduce overheads for smaller SQL statements.
> 
> Good idea...multi_insert (which reduces the overhead of individual inserts) is
> mostly independent from BulkInsert state (which uses a ring-buffer to avoid
> dirtying the cache).  I made this 0002.
> 
> This makes INSERT SELECT several times faster, and not clobber the cache too.
> 
> Time: 4700.606 ms (00:04.701)
>     123 |          1
>      37 |          2
>      20 |          3
>      11 |          4
>    4537 |          5
>   11656 |
> 
> Time: 1125.302 ms (00:01.125)
>    2171 |          1
>      37 |          2
>      20 |          3
>      11 |          4
>     111 |          5
>   14034 |
> 
> When enabled, this passes nearly all regression tests, and all but 2 of the
> changes are easily understood.  The 2nd patch still needs work.
> 

Hi,

I came across this thread because I'm working on an improvement to 
relation extension that speeds up the bulk insert itself in (highly) 
parallel cases, and I would like to make sure that our approaches work 
nicely together.

Given what I've seen and tried so far with various benchmarks, I would 
also really like to see a different approach here. "BEGIN BULK" can be 
problematic, for example, if you mix small and large inserts in the 
same transaction, or if your application occasionally does a bulk 
insert but otherwise mostly runs OLTP transactions.

To me, Simon's idea of only using a bulk insert state after inserting 
e.g. 1000 rows sounds good, and it also seems more widely applicable 
than requiring a change to every application that wants faster ingest.
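
To make the lazy switch concrete, here is a minimal sketch; the names, 
the threshold, and the simplified struct are all hypothetical, not 
actual PostgreSQL code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical row threshold before switching to bulk behaviour. */
#define BULK_INSERT_THRESHOLD 1000

/* Simplified stand-in for the real BulkInsertStateData. */
typedef struct BulkInsertStateData { int ring_size; } BulkInsertStateData;
typedef BulkInsertStateData *BulkInsertState;

typedef struct InsertRunState {
    size_t ntuples;          /* rows inserted so far by this statement */
    BulkInsertState bistate; /* NULL until the threshold is crossed */
} InsertRunState;

/*
 * Called once per inserted row: allocate the bulk insert state lazily,
 * only after BULK_INSERT_THRESHOLD rows, so that small statements never
 * pay for the ring buffer while large loads stop clobbering the cache.
 */
static BulkInsertState
maybe_get_bistate(InsertRunState *s)
{
    if (s->bistate == NULL && ++s->ntuples > BULK_INSERT_THRESHOLD)
        s->bistate = calloc(1, sizeof(BulkInsertStateData));
    return s->bistate;
}
```

That way a small INSERT behaves exactly as today, and only a statement 
that keeps inserting past the threshold pays the (one-time) setup cost.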

Another approach could be to combine this with a few extra 
requirements that limit the risk of regressions while we first learn 
how this behaves in the field.
We could, for example (just throwing out some ideas), require that:
- the relation has a certain minimum size;
- a BufferAccessStrategy of at most a certain size is used;
- there is a certain number of lock waiters on relation extension 
(like we do with bulk extend);
- we have extended the relation by at least e.g. 4 MB without using 
the FSM, thereby showing that we are doing bulk operations instead of 
random small extensions into the relation that use the FSM.
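
Such a gate could be sketched as a simple predicate. Every name and 
threshold below is invented for illustration, and whether the 
conditions should all hold or be treated as alternatives is itself 
open for discussion:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inputs for the gating heuristics above; none of
 * these names exist in PostgreSQL. */
typedef struct BulkGateInput {
    uint64_t rel_size;         /* relation size in bytes */
    int      ring_kb;          /* buffer strategy ring size, in kB */
    int      ext_lock_waiters; /* lock waiters on relation extension */
    uint64_t extended_bytes;   /* bytes appended without using the FSM */
} BulkGateInput;

/* Assumed thresholds, purely for illustration. */
#define MIN_REL_SIZE  (8u * 1024 * 1024)   /* relation at least 8 MB */
#define MAX_RING_KB   (16 * 1024)          /* ring of at most 16 MB */
#define MIN_WAITERS   2                    /* some extension contention */
#define MIN_EXTENDED  (4u * 1024 * 1024)   /* 4 MB appended, no FSM use */

/* Only switch to bulk insert behaviour when all conditions hold,
 * limiting the chance of regressing small or OLTP-style inserts. */
static bool
use_bulk_insert(const BulkGateInput *in)
{
    return in->rel_size >= MIN_REL_SIZE &&
           in->ring_kb <= MAX_RING_KB &&
           in->ext_lock_waiters >= MIN_WAITERS &&
           in->extended_bytes >= MIN_EXTENDED;
}
```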

Another option would be to first improve the bulk operation 
facilities in general and then have another shot at this. I'm not 
sure whether there is a benchmark / query that shows where such a 10x 
slowdown would appear, but that might be worth a look as well.

Regards,
Luc


