Re: Parallel copy - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Parallel copy
Date	April 13, 2020 20:16:33
Msg-id	20200413201633.cki4nsptynq7blhg@alap3.anarazel.de
In response to	Re: Parallel copy (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Parallel copy Re: Parallel copy
List	pgsql-hackers

Hi,

On 2020-04-13 14:13:46 -0400, Robert Haas wrote:
> On Fri, Apr 10, 2020 at 2:26 PM Andres Freund <andres@anarazel.de> wrote:
> > > Still, it might be the case that having the process that is reading
> > > the data also find the line endings is so fast that it makes no sense
> > > to split those two tasks. After all, whoever just read the data must
> > > have it in cache, and that helps a lot.
> >
> > Yea. And if it's not fast enough to split lines, then we have a problem
> > regardless of which process does the splitting.
> 
> Still, if the reader does the splitting, then you don't need as much
> IPC, right? The shared memory data structure is just a ring of bytes,
> and whoever reads from it is responsible for the rest.

I don't think so. If only one process does the splitting, the
exclusively locked section is just popping a bunch of offsets off the
ring. And that could fairly easily be done with atomic ops (since what
we need is basically a single-producer, multiple-consumer queue, which
can be made lock-free fairly easily). Whereas if each process does the
splitting, the exclusively locked part is splitting along lines, which
takes considerably longer than just popping off a few offsets.
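
To make the single-producer, multiple-consumer idea concrete, here is a
minimal sketch in C11. It is only an illustration under stated assumptions,
not code from any patch: LineQueue, QUEUE_SLOTS, line_queue_push and
line_queue_pop_batch are hypothetical names, the entries are assumed to be
line-start offsets into the byte ring, and plain <stdatomic.h> with the
default sequentially consistent ordering stands in for whatever atomics
layer a real implementation would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_SLOTS 1024        /* hypothetical capacity */
static_assert((QUEUE_SLOTS & (QUEUE_SLOTS - 1)) == 0,
              "QUEUE_SLOTS must be a power of two");

/* Hypothetical queue of line-start offsets into the shared byte ring. */
typedef struct LineQueue
{
    _Atomic uint64_t head;      /* next entry to hand out; advanced by consumers */
    _Atomic uint64_t tail;      /* next entry to fill; advanced by the producer only */
    _Atomic uint64_t offsets[QUEUE_SLOTS];
} LineQueue;

static void
line_queue_init(LineQueue *q)
{
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
    for (size_t i = 0; i < QUEUE_SLOTS; i++)
        atomic_init(&q->offsets[i], 0);
}

/* Producer side (the one process that splits lines). Returns 0 if full. */
static int
line_queue_push(LineQueue *q, uint64_t line_offset)
{
    uint64_t tail = atomic_load(&q->tail);
    uint64_t head = atomic_load(&q->head);

    if (tail - head == QUEUE_SLOTS)
        return 0;               /* no free slot; caller retries later */

    /* Fill the slot first, then publish it by advancing tail. */
    atomic_store(&q->offsets[tail % QUEUE_SLOTS], line_offset);
    atomic_store(&q->tail, tail + 1);
    return 1;
}

/*
 * Consumer side. Copies up to max offsets into out and claims them with a
 * single compare-and-swap on head. Returns the number of offsets claimed.
 */
static size_t
line_queue_pop_batch(LineQueue *q, uint64_t *out, size_t max)
{
    for (;;)
    {
        uint64_t head = atomic_load(&q->head);
        uint64_t tail = atomic_load(&q->tail);
        uint64_t avail = tail - head;
        size_t n = avail < max ? (size_t) avail : max;

        if (n == 0)
            return 0;           /* nothing published (or nothing requested) */

        /*
         * Copy before claiming: once head moves, the producer is free to
         * reuse these slots. If the CAS fails, the copy is discarded.
         */
        for (size_t i = 0; i < n; i++)
            out[i] = atomic_load(&q->offsets[(head + i) % QUEUE_SLOTS]);

        if (atomic_compare_exchange_weak(&q->head, &head, head + n))
            return n;
    }
}
```

The contended step for a consumer is the one compare-and-swap on head; the
line splitting itself happens outside of it, in the producer. In this
sketch the producer would call line_queue_push() once per line it finds,
and each worker would call line_queue_pop_batch() to grab a few lines at a
time.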

Greetings,

Andres Freund