Re: Parallel tuplesort (for parallel B-Tree index creation) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Parallel tuplesort (for parallel B-Tree index creation)
Msg-id CAM3SWZR+jd5nq0L_Kou-1N5BfDFwv32adBfgGvwt0sEvwGjxGw@mail.gmail.com
In response to Re: Parallel tuplesort (for parallel B-Tree index creation)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Parallel tuplesort (for parallel B-Tree index creation)
List pgsql-hackers
On Thu, Sep 22, 2016 at 8:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Sep 22, 2016 at 3:51 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> It'd be good if you could overlap the final merges in the workers with the
>> merge in the leader. ISTM it would be quite straightforward to replace the
>> final tape of each worker with a shared memory queue, so that the leader
>> could start merging and returning tuples as soon as it gets the first tuple
>> from each worker. Instead of having to wait for all the workers to complete
>> first.
>
> If you do that, make sure to have the leader read multiple tuples at a
> time from each worker whenever possible.  It makes a huge difference
> to performance.  See bc7fcab5e36b9597857fa7e3fa6d9ba54aaea167.

That requires some kind of mutual exclusion mechanism, like an LWLock.
It's possible that merging everything lazily is actually the faster
approach, given this, and given the likely bottleneck on I/O at this
stage. It's also certainly simpler to not overlap things. This is
something I've read about before [1], with "eager evaluation" sorting
not necessarily coming out ahead IIRC.
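
To make the trade-off concrete, here is a rough sketch of the kind of
leader-side merge being discussed, written in Python rather than C for
brevity. Everything in it is hypothetical: WorkerQueue stands in for a
per-worker shared memory queue, threading.Lock stands in for an LWLock,
and the queues are pre-filled rather than fed concurrently by real
workers. It only illustrates why reading several tuples per lock
acquisition reduces the locking overhead; it says nothing about whether
overlapping the merges wins overall.

```python
import heapq
import threading
from collections import deque


class WorkerQueue:
    """Hypothetical stand-in for one worker's shared memory queue."""

    def __init__(self, sorted_tuples):
        self._items = deque(sorted_tuples)  # already-sorted worker output
        self._lock = threading.Lock()       # stands in for an LWLock
        self.lock_acquisitions = 0

    def read_batch(self, max_tuples):
        """Take up to max_tuples from the queue under a single lock hold."""
        with self._lock:
            self.lock_acquisitions += 1
            batch = []
            while self._items and len(batch) < max_tuples:
                batch.append(self._items.popleft())
            return batch


def leader_merge(queues, batch_size):
    """K-way merge in the leader, refilling a private buffer per worker."""
    buffers = [deque() for _ in queues]
    heap = []

    def refill(i):
        # Only touch the shared queue when the private buffer is empty.
        if not buffers[i]:
            buffers[i].extend(queues[i].read_batch(batch_size))
        return bool(buffers[i])

    for i in range(len(queues)):
        if refill(i):
            heapq.heappush(heap, (buffers[i].popleft(), i))

    while heap:
        value, i = heapq.heappop(heap)
        yield value
        if refill(i):
            heapq.heappush(heap, (buffers[i].popleft(), i))


def count_lock_acquisitions(runs, batch_size):
    """Merge the given sorted runs; return (merged list, total lock holds)."""
    queues = [WorkerQueue(run) for run in runs]
    merged = list(leader_merge(queues, batch_size))
    return merged, sum(q.lock_acquisitions for q in queues)
```

Calling count_lock_acquisitions() on the same sorted runs with
batch_size=1 and then with a larger batch size gives the same merged
output, but the larger batch takes the lock fewer times.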

[1] http://digitalcommons.ohsu.edu/cgi/viewcontent.cgi?article=1193&context=csetech
-- 
Peter Geoghegan


