Re: Parallel copy - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Parallel copy
Msg-id 72363fde-b8ac-397f-67e1-ed5f74909cd8@iki.fi
In response to Re: Parallel copy  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers
On 02/11/2020 09:10, Heikki Linnakangas wrote:
> On 02/11/2020 08:14, Amit Kapila wrote:
>> We have discussed both these approaches (a) single producer multiple
>> consumer, and (b) all workers doing the processing as you are saying
>> in the beginning and concluded that (a) is better, see some of the
>> relevant emails [1][2][3].
>>
>> [1] - https://www.postgresql.org/message-id/20200413201633.cki4nsptynq7blhg%40alap3.anarazel.de
>> [2] - https://www.postgresql.org/message-id/20200415181913.4gjqcnuzxfzbbzxa%40alap3.anarazel.de
>> [3] - https://www.postgresql.org/message-id/78C0107E-62F2-4F76-BFD8-34C73B716944%40anarazel.de
> 
> Sorry I'm late to the party. I don't think the design I proposed was
> discussed in that thread. The alternative discussed there seems to be
> something much more fine-grained, where processes claim individual
> lines. I'm not sure, though; I didn't fully understand the
> alternative designs.

I read the thread more carefully, and I think Robert had basically the 
right idea here 
(https://www.postgresql.org/message-id/CA%2BTgmoZMU4az9MmdJtg04pjRa0wmWQtmoMxttdxNrupYJNcR3w%40mail.gmail.com):

> I really think we don't want a single worker in charge of finding
> tuple boundaries for everybody. That adds a lot of unnecessary
> inter-process communication and synchronization. Each process should
> just get the next tuple starting after where the last one ended, and
> then advance the end pointer so that the next process can do the same
> thing. [...]

And here 
(https://www.postgresql.org/message-id/CA%2BTgmoZw%2BF3y%2BoaxEsHEZBxdL1x1KAJ7pRMNgCqX0WjmjGNLrA%40mail.gmail.com):

> On Thu, Apr 9, 2020 at 2:55 PM Andres Freund
> <andres(at)anarazel(dot)de> wrote:
>> I'm fairly certain that we do *not* want to distribute input data
>> between processes on a single tuple basis. Probably not even below a
>> few hundred kb. If there's any sort of natural clustering in the
>> loaded data - extremely common, think timestamps - splitting on a
>> granular basis will make indexing much more expensive. And have a
>> lot more contention.
> 
> That's a fair point. I think the solution ought to be that once any
> process starts finding line endings, it continues until it's grabbed
> at least a certain amount of data for itself. Then it stops and lets
> some other process grab a chunk of data.
Yes! That's pretty close to the design I sketched. I imagined that the 
leader would divide the input into 64 kB blocks, and each block would 
have a few metadata fields, notably the starting position of the first 
line in the block. I think Robert envisioned having a single "next 
starting position" field in shared memory. That works too, and is even 
simpler, so +1 for that.
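
To make that concrete, here's a minimal sketch of the protocol in C.
All the names are made up, and for simplicity it assumes the whole
input sits in one in-memory buffer; a real COPY would refill the
buffer as it streams the file:

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MIN_CHUNK_SIZE (64 * 1024)	/* keep chunks coarse */

/* Shared state: the single "next starting position" field. */
typedef struct SharedCopyState
{
	_Atomic size_t next_start;	/* offset of first unclaimed byte */
} SharedCopyState;

/*
 * Claim the next chunk of input.  Scan forward from next_start until
 * at least MIN_CHUNK_SIZE bytes are covered, extend to the next line
 * ending, and publish the new position with a compare-and-swap.  On
 * success the caller owns [*chunk_start, *chunk_end) and can parse
 * and insert those lines with no further coordination.  Every
 * published position is either end-of-input or one past a newline,
 * and next_start begins at 0, so chunks always start on a line
 * boundary.
 */
static bool
claim_chunk(SharedCopyState *shared, const char *buf, size_t buf_len,
			size_t *chunk_start, size_t *chunk_end)
{
	size_t		start = atomic_load(&shared->next_start);

	for (;;)
	{
		size_t		end;

		if (start >= buf_len)
			return false;		/* no input left */

		end = start + MIN_CHUNK_SIZE;
		if (end >= buf_len)
			end = buf_len;
		else
		{
			const char *nl = memchr(buf + end, '\n', buf_len - end);

			end = nl ? (size_t) (nl - buf) + 1 : buf_len;
		}

		/*
		 * Advance next_start past the claimed range.  On failure
		 * another worker got there first; "start" is reloaded and we
		 * retry from its end.
		 */
		if (atomic_compare_exchange_weak(&shared->next_start,
										 &start, end))
		{
			*chunk_start = start;
			*chunk_end = end;
			return true;
		}
	}
}

A worker that wins the compare-and-swap owns its range outright, so
there is no per-line communication, and because every chunk is at
least 64 kB, naturally clustered input (timestamps and the like)
mostly stays together at index insertion time, which addresses
Andres's concern above.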

For some reason, the discussion took a different turn from there, to 
discuss how the line endings (called "chunks" in the discussion) should 
be represented in shared memory. But none of that is necessary with 
Robert's design.

- Heikki


