Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) - Mailing list pgsql-hackers

From Rushabh Lathia
Subject Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Date
Msg-id CAGPqQf0xbnk3r-EYWzGtLw0DrCAxSwmT-Ny=cdFky7uefGBZRQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
List pgsql-hackers


On Sat, Jan 6, 2018 at 3:47 AM, Peter Geoghegan <pg@bowt.ie> wrote:
On Tue, Jan 2, 2018 at 8:43 PM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
> I agree that plan_create_index_workers() needs to count the leader as a
> normal worker for the CREATE INDEX.  So what you proposing is - when
> parallel_leader_participation is true launch (return value of
> compute_parallel_worker() - 1)
> workers. true ?

Almost. We need to not subtract one when only one worker is indicated
by compute_parallel_worker(). I also added some new stuff there, to
consider edge cases with the parallel_leader_participation GUC.

>> I'm working on fixing up what you posted. I'm probably not more than a
>> week away from posting a patch that I'm going to mark "ready for
>> committer". I've already made the change above, and once I spend time
>> on trying to break the few small changes needed within buffile.c I'll
>> have taken it as far as I can, most likely.
>>
>
> Okay, once you submit the patch with changes - I will do one round of
> review for the changes.

I've attached my revision. Changes include:

* Changes to plan_create_index_workers() were made along the lines
recently discussed.

* plan_create_index_workers() is now called right before the ambuild
routine is called (nbtree index builds only, of course).

* Significant overhaul of tuplesort.h contract. This had references to
the old approach, and to tqueue.c's tuple descriptor thing that was
since superseded by the typmod registry added for parallel hash join.
These were updated/removed.

* Both tuplesort.c and logtape.c now say that they cannot write to the
writable/last tape, while still acknowledging that it is in fact the
leader tape, and that this restriction is due to a restriction with
BufFiles. They also point out that if the restriction within buffile.c
ever was removed, everything would work fine.

* Added new call to BufFileExportShared() when freezing tape in logtape.c.

* Tweaks to documentation.

* pgindent ran on modified files.

* Polished the stuff that is added to buffile.c. Mostly comments that
clarify its reason for existing. Also added Assert()s.

Note that I added Heikki as an author in the commit message.
Technically, Heikki didn't actually write code for parallel CREATE
INDEX, but he did loads of independently useful work on merging + temp
file I/O that went into Postgres 10 (though this wasn't listed in the
v10 release notes). That work was done in large part to help the
parallel CREATE INDEX patch, and it did in fact help it quite
noticeably, so I think that this is warranted. Remember that with
parallel CREATE INDEX, the leader's merge occurs serially, so anything
that we can do to speed that part up is very helpful.

This revision does seem very close, but I'll hold off on changing the
status of the patch for a few more days, to give you time to give some
feedback.


Thanks Peter for the updated patch.

I gone through the changes and perform the basic testing. Changes
looks good and haven't found any unusual during testing

Thanks,
Rushabh Lathia

pgsql-hackers by date:

Previous
From: Alvaro Hernandez
Date:
Subject: Re: Finalizing logical replication limitations as well as potentialfeatures
Next
From: Kohei KaiGai
Date:
Subject: Re: FP16 Support?