Re: Parallel copy - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel copy
Msg-id CAA4eK1LNq_juinpTUhsLMwLQiz5Q0mv6=MebiMtP0TZ3ijVyyg@mail.gmail.com
In response to Re: Parallel copy  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Thu, Apr 9, 2020 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Apr 9, 2020 at 1:00 AM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Tue, Apr 7, 2020 at 9:38 AM Ants Aasma <ants@cybertec.at> wrote:
> > >
> > > With option 1 it's not possible to read input data into shared memory
> > > and there needs to be an extra memcpy in the time critical sequential
> > > flow of the leader. With option 2 data could be read directly into the
> > > shared memory buffer. With future async io support, reading and
> > > looking for tuple boundaries could be performed concurrently.
> >
> > But option 2 still seems significantly worse than your proposal above, right?
> >
> > I really think we don't want a single worker in charge of finding
> > tuple boundaries for everybody. That adds a lot of unnecessary
> > inter-process communication and synchronization. Each process should
> > just get the next tuple starting after where the last one ended, and
> > then advance the end pointer so that the next process can do the same
> > thing. Vignesh's proposal involves having a leader process that has to
> > switch roles - he picks an arbitrary 25% threshold - and if it doesn't
> > switch roles at the right time, performance will be impacted. If the
> > leader doesn't get scheduled in time to refill the queue before it
> > runs completely empty, workers will have to wait. Ants's scheme avoids
> > that risk: whoever needs the next tuple reads the next line. There's
> > no need to ever wait for the leader because there is no leader.
> >
>
> Hmm, I think in his scheme also there is a single reader process.  See
> the email above [1] where he described how it should work.
>

Oops, I forgot to include the link to that email.  See
https://www.postgresql.org/message-id/CANwKhkO87A8gApobOz_o6c9P5auuEG1W2iCz0D5CfOeGgAnk3g%40mail.gmail.com
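
For readers following the thread, the "no leader" idea in Robert's quoted
text (each process takes the next tuple starting where the last one ended,
then advances the end pointer) might be sketched roughly as below. This is
only an illustration under stated assumptions, not code from any patch in
this thread: the names SharedInput, claim_next_line and parallel_parse are
hypothetical, Python threads stand in for backend processes, a plain bytes
object stands in for the shared memory buffer, and a line is assumed to end
at the next newline (real COPY input can have newlines inside quoted CSV
fields, which this ignores).

```python
import threading


class SharedInput:
    """Hypothetical stand-in for a shared input buffer plus one shared offset."""

    def __init__(self, data: bytes):
        self.data = data
        self.next_offset = 0          # first byte not yet claimed by any worker
        self.lock = threading.Lock()  # protects next_offset

    def claim_next_line(self):
        """Claim the line starting at next_offset and advance the offset past it."""
        with self.lock:
            start = self.next_offset
            if start >= len(self.data):
                return None  # input exhausted
            newline = self.data.find(b"\n", start)
            end = len(self.data) if newline == -1 else newline + 1
            self.next_offset = end
        # The slice is taken outside the lock; only boundary finding is serialized.
        return self.data[start:end]


def worker(shared: SharedInput, out: list) -> None:
    """Whoever needs the next tuple reads the next line; there is no leader."""
    while True:
        line = shared.claim_next_line()
        if line is None:
            break
        # Simplified "tuple formation": split the line on commas.
        out.append(line.rstrip(b"\n").split(b","))


def parallel_parse(data: bytes, n_workers: int = 4) -> list:
    shared = SharedInput(data)
    per_worker = [[] for _ in range(n_workers)]
    threads = [
        threading.Thread(target=worker, args=(shared, per_worker[i]))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Row order across workers is not preserved in this sketch.
    return [row for rows in per_worker for row in rows]
```

In this sketch only the boundary search runs under the lock, so a worker
never waits for a separate reader to refill a queue; the cost is that all
workers contend on the one shared offset.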


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com