Home > mailing lists

Re: Inserting heap tuples in bulk in COPY - Mailing list pgsql-hackers

From	Simon Riggs
Subject	Re: Inserting heap tuples in bulk in COPY
Date	August 8, 2012 04:36:31
Msg-id	CA+U5nM+xVHWcrJnfuAHu2H06BdRr6eYbvk=bJ5Dy+sPXbKfLqQ@mail.gmail.com Whole thread
In response to	Re: Inserting heap tuples in bulk in COPY (Jeff Janes <jeff.janes@gmail.com>)
List	pgsql-hackers

Tree view

On 8 August 2012 03:44, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Tue, Aug 7, 2012 at 1:52 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On 7 August 2012 20:58, Jeff Janes <jeff.janes@gmail.com> wrote:
>>> Hi Heikki,
>>>
>>> Is the bulk index insert still an active area for you?
>>>
>>> If not, is there some kind of summary of design or analysis work
>>> already done, which would be a useful for someone else interested in
>>> picking it up?
>>
>> The main cost comes from repeated re-seeking down the index tree to
>> find the insertion point, but repeated lock and pin operations on
>> index buffers are also high. That is optimisable if the index inserts
>> are grouped, as they will be with monotonic indexes, (e.g. SERIAL), or
>> with partial monotonic (i.e. with Parent/Child relationship, on Child
>> table many inserts will be of the form (x,1), (x,2), (x, 3) etc, e.g.
>> Order/OrderLine).
>>
>> All we need do is buffer the inserts in the inserts, before inserting
>> them into the main index. As long as we flush the buffer before end of
>> transaction, we're good.
>>
>> Incidentally, we can also optimise repeated inserts within a normal
>> transaction using this method, by implementing deferred unique
>> constraints. At present we say that unique constraints aren't
>> deferrable, but there's no reason they can't be, if we allow buffering
>> in the index. (Implementing a deferred constraint that won't fail if
>> we do UPDATE foo SET pk = pk+1 requires a buffer the size of foo,
>> which is probably a bad plan, plus you'd need to sort the inputs, so
>> that particular nut is another issue altogether, AFAICS).
>>
>> Suggested implementation is to buffer index tuples at the generic
>> index API, then implement a bulk insert index API call that can then
>> be implemented for each AM separately. Suggested buffer size is
>> work_mem. Note we must extract index tuple from heap tuples -
>> refetching heap blocks to get rows is too costly.
>
> OK, thanks for the summary.  It looks like your plans are to improve
> the case where the index maintenance is already CPU limited.  I was
> more interested in cases where the regions of the index(es) undergoing
> active insertion do not fit into usable RAM, so the limit is the
> random IO needed to fetch the leaf pages in order to update them or to
> write them out once dirtied.  Here too buffering is probably the
> answer, but the size of the buffer needed to turn that IO from
> effectively random to effectively sequential is probably much larger
> than the size of the buffer you are contemplating.

The buffer size can be variable, yes. I was imagining a mechanism that
worked for normal INSERTs as well as COPY. Perhaps we could say buffer
is work_mem with INSERT and maintenance_work_mem with COPY.

Very large index appends are useful, but currently not very easily
usable because of the transactional nature of COPY. If we could reject
rows without ERROR it would be more practical.

I'm not planning to work on this, so all comments for your assistance.

>> I think we need to implement buffering both to end of statement or end
>> of transaction, not just one or the other.
>
> With the planner deciding which would be better, or explicit user action?

Probably both: on/off/choose.

Deferring unique check would change the point at which errors were
reported in a transaction, which might not be desirable for some. I
think SQL standard has something to say about this also, so that needs
care. But in general, if your tables use generated PK values they
should be able to benefit from this, so I would suggest a default
setting of choose.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

pgsql-hackers by date:

From: Matthias
Date: 08 August 2012, 03:17:44
Subject: Re: Statistics and selectivity estimation for ranges

From: Jim Vanns
Date: 08 August 2012, 06:02:08
Subject: Bug in libpq implentation and omission in documentation?

Re: Inserting heap tuples in bulk in COPY - Mailing list pgsql-hackers

Previous

Next