Re: Parallel tuplesort (for parallel B-Tree index creation) - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Parallel tuplesort (for parallel B-Tree index creation) |
Date | |
Msg-id | CAA4eK1J-=1mzxgGqQHnA-eQ1d-7adCErqd2GmOCc2oAnn6-Qwg@mail.gmail.com Whole thread Raw |
In response to | Re: Parallel tuplesort (for parallel B-Tree index creation) (Peter Geoghegan <pg@heroku.com>) |
Responses |
Re: Parallel tuplesort (for parallel B-Tree index creation)
Re: Parallel tuplesort (for parallel B-Tree index creation) |
List | pgsql-hackers |
On Thu, Oct 13, 2016 at 12:35 AM, Peter Geoghegan <pg@heroku.com> wrote: > On Wed, Oct 12, 2016 at 11:09 AM, Robert Haas <robertmhaas@gmail.com> wrote: > >> On the flip side, what if anything can queries hope to get out of >> parallel sort that they can't get out of Gather Merge? One >> possibility is that a parallel sort might end up being substantially >> faster than Gather Merge-over-non-parallel sort. In that case, we >> obviously should prefer it. > > I must admit that I don't know enough about it to comment just yet. > Offhand, it occurs to me that the Gather Merge sorted input could come > from a number of different types of paths/nodes, whereas adopting what > I've done here could only work more or less equivalently to "Gather > Merge -> Sort -> Seq Scan" -- a special case, really. > >> For example, it's possible that you might want to have all >> workers participate in sorting some data and then divide the result of >> the sort into equal ranges that are again divided among the workers, >> or that you might want all workers to sort and then each worker to >> read a complete copy of the output data. But these don't seem like >> particularly mainstream needs, nor do they necessarily seem like >> problems that parallel sort itself should be trying to solve. > > This project of mine is about parallelizing tuplesort.c, which isn't > really what you want for parallel query -- you shouldn't try to scope > the problem as "make the sort more scalable using parallelism" there. > Rather, you want to scope it at "make the execution of the entire > query more scalable using parallelism", which is really quite a > different thing, which necessarily involves the executor having direct > knowledge of partition boundaries. > Okay, but what is the proof or why do you think second is going to better than first? One thing which strikes as a major difference between your approach and Gather Merge is that in your approach leader has to wait till all the workers have done with their work on sorting whereas with Gather Merge as soon as first one is done, leader starts with merging. I could be wrong here, but if I understood it correctly, then there is a argument that Gather Merge kind of approach can win in cases where some of the workers can produce sorted outputs ahead of others and I am not sure if we can dismiss such cases. +struct Sharedsort +{ .. + * Workers increment workersFinished to indicate having finished. If + * this is equal to state.launched within the leader, leader is ready + * to merge runs. + * + * leaderDone indicates if leader is completely done (i.e., was + * tuplesort_end called against the state through which parallel output + * was consumed?) + */ + int currentWorker; + int workersFinished; .. } By looking at 'workersFinished' usage, it looks like you have devised a new way for leader to know when workers have finished which might be required for this patch. However, have you tried to use or investigate if existing infrastructure which serves same purpose could be used for it? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: