Re: Parallel copy - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel copy
Date
Msg-id CAA4eK1L+SX=ov6K9kjkWDpbOHP573JrDPj8F=L_ETHHiTX7M=A@mail.gmail.com
In response to Re: Parallel copy  (Ants Aasma <ants@cybertec.at>)
Responses Re: Parallel copy  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Thu, Apr 9, 2020 at 3:55 AM Ants Aasma <ants@cybertec.at> wrote:
>
> On Wed, 8 Apr 2020 at 22:30, Robert Haas <robertmhaas@gmail.com> wrote:
>
> > - The portion of the time that is used to split the lines is not
> > easily parallelizable. That seems to be a fairly small percentage for
> > a reasonably wide table, but it looks significant (13-18%) for a
> > narrow table. Such cases will gain less performance and be limited to
> > a smaller number of workers. I think we also need to be careful about
> > files whose lines are longer than the size of the buffer. If we're not
> > careful, we could get a significant performance drop-off in such
> > cases. We should make sure to pick an algorithm that seems like it
> > will handle such cases without serious regressions and check that a
> > file composed entirely of such long lines is handled reasonably
> > efficiently.
>
> I don't have a proof, but my gut feel tells me that it's fundamentally
> impossible to ingest csv without a serial line-ending/comment
> tokenization pass.
>

I think even if we try to do it via multiple workers it might not be
any better.  In such a scheme, every worker needs to publish the end
boundary of its chunk, and the next worker has to keep checking
whether the previous one has updated that end pointer.  I think this
can add significant synchronization overhead for cases where tuples
are around 100 bytes, which will be a common case.
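
Just to illustrate the kind of handshake I mean, each worker would need
to do something like the sketch below before it can trust where its
chunk's first line starts (made-up names, not from any patch; the
boundaries array is assumed to be set to NOT_YET_KNOWN before the
workers start):

#include <stdatomic.h>
#include <stdint.h>

#define NCHUNKS        1024
#define NOT_YET_KNOWN  UINT64_MAX

/* One entry per input chunk; in a real design this lives in shared memory. */
typedef struct ChunkBoundary
{
    _Atomic uint64_t end_offset;   /* file offset of the chunk's last newline */
} ChunkBoundary;

static ChunkBoundary boundaries[NCHUNKS];

/*
 * A worker cannot know where its lines really begin until the previous
 * worker has published where its last line ended, so every chunk pays
 * for one wait like this.
 */
static uint64_t
chunk_start(int chunk_no)
{
    uint64_t prev_end;

    if (chunk_no == 0)
        return 0;

    while ((prev_end =
            atomic_load(&boundaries[chunk_no - 1].end_offset)) == NOT_YET_KNOWN)
        ;                          /* spin or sleep; paid once per chunk */

    return prev_end;
}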

> The current line splitting algorithm is terrible.
> I'm currently working with some scientific data where on ingestion
> CopyReadLineText() is about 25% on profiles. I prototyped a
> replacement that can do ~8GB/s on narrow rows, more on wider ones.
>

Good to hear.  I think that will be a good project on its own, and it
might also give a boost to parallel copy, since it would further shrink
the non-parallelizable portion of the work.
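
Just as an illustration of why a smarter scanner helps so much over the
byte-at-a-time loop in CopyReadLineText(), even letting memchr()
(normally vectorized by libc) find the newlines changes the picture.
This is only a sketch of the idea, not the prototype mentioned above,
and it ignores CSV quoting and escapes:

#include <stddef.h>
#include <string.h>

/*
 * Record the offset (relative to buf) of every '\n' in buf[0..len) into
 * offsets[], returning how many were found (at most max_offsets).
 */
static size_t
find_line_ends(const char *buf, size_t len, size_t *offsets, size_t max_offsets)
{
    const char *p = buf;
    const char *end = buf + len;
    size_t      n = 0;

    while (p < end && n < max_offsets)
    {
        const char *nl = memchr(p, '\n', end - p);

        if (nl == NULL)
            break;
        offsets[n++] = nl - buf;
        p = nl + 1;
    }
    return n;
}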

> For rows that are consistently wider than the input buffer I think
> parallelism will still give a win - the serial phase is just memcpy
> through a ringbuffer, after which a worker goes away to perform the
> actual insert, letting the next worker read the data. The memcpy is
> already happening today, CopyReadLineText() copies the input buffer
> into a StringInfo, so the only extra work is synchronization between
> leader and worker.
>
>
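
To make that concrete, the handoff described above is essentially a
single-producer/single-consumer byte ring: the leader memcpy()s line
data in, a worker copies it out and then does the actual insert.  A
very rough sketch (made-up names, not the actual proposal) might look
like this:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE (64 * 1024)              /* power of two */

typedef struct LineRing
{
    _Atomic uint64_t write_pos;            /* advanced only by the leader */
    _Atomic uint64_t read_pos;             /* advanced only by a worker */
    char             data[RING_SIZE];
} LineRing;

/* Leader: copy one line into the ring, or ask the caller to wait. */
static bool
ring_put(LineRing *ring, const char *line, uint32_t len)
{
    uint64_t    wpos = atomic_load(&ring->write_pos);
    uint64_t    rpos = atomic_load(&ring->read_pos);
    uint64_t    off = wpos % RING_SIZE;
    uint64_t    first;

    if (RING_SIZE - (wpos - rpos) < len)
        return false;                      /* full: wait for the worker */

    /* copy, handling wrap-around at the end of the buffer */
    first = RING_SIZE - off;
    if (first > len)
        first = len;
    memcpy(ring->data + off, line, first);
    memcpy(ring->data, line + first, len - first);

    atomic_store(&ring->write_pos, wpos + len);
    return true;
}

/* Worker: drain whatever is available into buf for processing. */
static uint32_t
ring_get(LineRing *ring, char *buf, uint32_t buflen)
{
    uint64_t    rpos = atomic_load(&ring->read_pos);
    uint64_t    wpos = atomic_load(&ring->write_pos);
    uint64_t    avail = wpos - rpos;
    uint64_t    off = rpos % RING_SIZE;
    uint64_t    first;

    if (avail > buflen)
        avail = buflen;
    first = RING_SIZE - off;
    if (first > avail)
        first = avail;
    memcpy(buf, ring->data + off, first);
    memcpy(buf + first, ring->data, avail - first);

    atomic_store(&ring->read_pos, rpos + avail);
    return (uint32_t) avail;
}
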
> > - There could also be similar contention on the heap. Say the tuples
> > are narrow, and many backends are trying to insert tuples into the
> > same heap page at the same time. This would lead to many lock/unlock
> > cycles. This could be avoided if the backends avoid targeting the same
> > heap pages, but I'm not sure there's any reason to expect that they
> > would do so unless we make some special provision for it.
>
> I thought there already was a provision for that. Am I mis-remembering?
>

Copy uses heap_multi_insert to insert a batch of tuples, and I think
each batch should ideally end up on a different page, mostly a newly
allocated one.  So I am not sure whether this will be a problem at all,
or a problem serious enough to need special handling.  But if it does
turn out to be a problem, we definitely need a better way to deal with
it.
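
To show the pattern I mean, here is a much-simplified sketch of the
batching (not the real copy.c code; the heap_multi_insert() argument
list is from memory and may differ in detail):

#include "postgres.h"
#include "access/heapam.h"
#include "executor/tuptable.h"
#include "utils/rel.h"

#define MAX_BUFFERED_TUPLES 1000

static TupleTableSlot *buffered_slots[MAX_BUFFERED_TUPLES];
static int      nbuffered = 0;

static void
flush_buffered_tuples(Relation rel, CommandId cid, BulkInsertState bistate)
{
    if (nbuffered == 0)
        return;

    /*
     * The whole batch is placed in one call; with a bulk-insert state the
     * batch typically goes onto freshly extended pages, so two backends
     * flushing at the same time would mostly touch different pages.
     */
    heap_multi_insert(rel, buffered_slots, nbuffered, cid, 0, bistate);
    nbuffered = 0;
}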

> > - What else? I bet the above list is not comprehensive.
>
> I think parallel copy patch needs to concentrate on splitting input
> data to workers. After that any performance issues would be basically
> the same as a normal parallel insert workload. There may well be
> bottlenecks there, but those could be tackled independently.
>

I agree.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


