
From: Amit Kapila
Subject: Re: ExecGather() + nworkers
Msg-id: CAA4eK1+AXV6M9URdHyuQ7QvweQ73wMHSVe6hkx99aext+4297w@mail.gmail.com
In response to: Re: ExecGather() + nworkers  (Peter Geoghegan <pg@heroku.com>)
Responses: Re: ExecGather() + nworkers  (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
On Mon, Jan 11, 2016 at 3:14 AM, Peter Geoghegan <pg@heroku.com> wrote:
>
> On Sun, Jan 10, 2016 at 9:13 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> >> I'm not sure why the test for nworkers following the
> >> LaunchParallelWorkers() call doesn't look like this, though:
> >>
> >>     /* Set up tuple queue readers to read the results. */
> >>     if (pcxt->nworkers_launched > 0)
> >>     {
> >>         ...
> >>     }
> >
> > Hmm, yeah, I guess it could do that.
>
> That would make it clearer as an example.
>
> >> But going to this additional trouble (detecting no workers launched on
> >> the basis of !nworkers_launched) suggests that simply testing
> >> nworkers_launched would be wrong, which AFAICT it isn't. Can't we just
> >> do that, and in so doing also totally remove the "for" loop shown
> >> here?
> >
> > I don't see how the for loop goes away.
>
> I meant that some code in the "for" loop goes away. Not all of it.
> Just the more obscure code. As I said, I'm mostly pointing this out
> out of concern for making it clearer as example code.
>

Right, I can write a patch to do it the way you are suggesting, if you
are not planning to do it yourself.
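
For reference, the simplified shape being suggested would look roughly
like this (a sketch only, following the structure of the snippet quoted
above; the reader-setup details are elided the same way):

    /* Set up tuple queue readers to read the results. */
    if (pcxt->nworkers_launched > 0)
    {
        /*
         * Allocate the readers and create one tuple queue reader per
         * launched worker, looping up to pcxt->nworkers_launched ...
         */
    }
    else
    {
        /* No workers were launched; the leader runs the plan alone. */
    }

so the separate "did we get any worker?" bookkeeping inside the loop
would no longer be needed.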
 
>
> >> In the case of parallel sequential scan, it looks like one worker can
> >> be helpful, because then the gather node (leader process) can run the
> >> plan itself to some degree, and so there are effectively 2 processes
> >> scanning at a minimum (unless 0 workers could be allocated to begin
> >> with). How useful is it to have a parallel scan when this happens,
> >> though?
> >
> > Empirically, that's really quite useful.  When you have 3 or 4
> > workers, the leader really doesn't make a significant contribution to
> > the work, but what I've seen in my testing is that 1 worker often runs
> > almost twice as fast as 0 workers.
>
> I suppose that makes sense, given that parallel sequential scan works
> best when most tuples are eliminated in workers; there ought not to be
> many tuples filling the single worker's queue anyway.
>
> > I don't really understand why this should be so.  I thought the idea
> > of parallel sort is (roughly) that each worker should read data until
> > it fills work_mem, sort that data, and write a tape.  Repeat until no
> > data remains.  Then, merge the tapes.  I don't see any reason at all
> > why this shouldn't work just fine with a leader and 1 worker.
>
> It will work fine with a leader and 1 worker -- the code will be
> correct, and without any special cases. But it will be a suboptimal
> use of resources. From the caller's point of view, there is no reason
> to think it will be faster, and some reason to think it will be
> slower. A particular concern for parallel sort is that the sort might
> not use enough memory to need to be an external sort, but you
> effectively force it to be one by making it a parallel sort (that is
> not ideal in the long term, but it's a good compromise for 9.6's
> parallel index build stuff). You're also consuming a
> BackgroundWorkerSlot for the duration of the sort, in an environment
> where evidently those are in short supply.
>
> Now, you might wonder why it is that the leader cannot also sort runs,
> just as a worker would. It's possible, but it isn't exactly
> straightforward. You have to have special cases in several places,
> even though it probably is going to be uncommon to only have one
> BackgroundWorkerSlot available in practice. It's simpler to just
> opt-out, and seems better given that max_parallel_degree is a way of
> resource limiting based on available cores (it's certainly not about
> the availability of shared memory for the BackgroundWorkerSlot array).
>
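(As an aside, the per-worker scheme Robert describes above amounts to
something like the following sketch; the helper names are made up for
illustration and are not from tuplesort.c:

    /* Build sorted runs bounded by work_mem, then merge the tapes. */
    while (read_next_batch(&batch, work_mem))   /* hypothetical helper */
    {
        sort_batch_in_memory(&batch);           /* hypothetical helper */
        write_run_to_tape(&batch);              /* hypothetical helper */
    }
    merge_all_tapes();                          /* hypothetical helper */

The question in this subthread is only whether doing that with a single
worker plus the leader is a good use of resources.)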

If I understand correctly, you are worried about the case where, if the
leader is not able to launch the minimum required number of workers, a
parallel index build will be slower than a serial index build.  I think
it is legitimate to worry about such cases, but it won't be difficult to
simply make parallel execution behave like serial execution (basically,
have the leader do all the work).  Now, one could worry that there will
be some overhead from setting up and tearing down the workers in this
case, but I think that could be treated as a limitation of the initial
version of the implementation, and if such cases turn out to be common
in general usage, then we could add some mechanism to reserve the
workers and start parallelism only when the leader is able to reserve
the required number of workers.
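
In code terms, what I have in mind is something along these lines (a
sketch only; build_index_serially and build_index_in_parallel are
hypothetical placeholders, not functions from any existing patch):

    LaunchParallelWorkers(pcxt);

    if (pcxt->nworkers_launched == 0)
    {
        /* Could not get any workers: do all the work in the leader. */
        build_index_serially(buildstate);       /* hypothetical */
    }
    else
    {
        /* Got at least one worker: proceed with the parallel build. */
        build_index_in_parallel(buildstate,     /* hypothetical */
                                pcxt->nworkers_launched);
    }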


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
