Re: Parallel tuplesort (for parallel B-Tree index creation) - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Parallel tuplesort (for parallel B-Tree index creation)
Msg-id b4615f37-70e7-58e4-3e68-6122b02f15c9@iki.fi
In response to Parallel tuplesort (for parallel B-Tree index creation)  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Parallel tuplesort (for parallel B-Tree index creation)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 08/02/2016 01:18 AM, Peter Geoghegan wrote:
> No merging in parallel
> ----------------------
>
> Currently, merging worker *output* runs may only occur in the leader
> process. In other words, we always keep n worker processes busy with
> scanning-and-sorting (and maybe some merging), but then all processes
> but the leader process grind to a halt (note that the leader process
> can participate as a scan-and-sort tuplesort worker, just as it will
> everywhere else, which is why I specified "parallel_workers = 7" but
> talked about 8 workers).
>
> One leader process is kept busy with merging these n output runs on
> the fly, so things will bottleneck on that, which you saw in the
> example above. As already described, workers will sometimes merge in
> parallel, but only their own runs -- never another worker's runs. I
> did attempt to address the leader merge bottleneck by implementing
> cross-worker run merging in workers. I got as far as implementing a
> very rough version of this, but initial results were disappointing,
> and so that was not pursued further than the experimentation stage.
>
> Parallel merging is a possible future improvement that could be added
> to what I've come up with, but I don't think that it will move the
> needle in a really noticeable way.

It'd be good if you could overlap the final merges in the workers with
the merge in the leader. ISTM it would be quite straightforward to
replace the final tape of each worker with a shared memory queue, so
that the leader could start merging and returning tuples as soon as it
gets the first tuple from each worker, instead of having to wait for all
the workers to complete first.
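
To illustrate the overlap, here is a rough sketch in Python rather than
PostgreSQL C. It is not taken from the patch: threads stand in for worker
processes, a bounded queue.Queue stands in for the shared memory queue,
and the names worker, drain and leader_merge are made up for the example.
The point is only that a lazy k-way merge in the leader can return its
first tuple once every worker has produced one.

```python
import heapq
import queue
import threading

_DONE = object()  # sentinel marking the end of one worker's output


def worker(rows, out_q):
    """Hypothetical worker: sort its share, then stream it to the leader.

    sorted() stands in for the worker's own final merge; in a real system
    that merge would itself produce tuples incrementally.
    """
    for row in sorted(rows):
        out_q.put(row)  # blocks while the bounded queue is full
    out_q.put(_DONE)


def drain(q):
    """Yield tuples from one worker's queue until the sentinel arrives."""
    while True:
        item = q.get()
        if item is _DONE:
            return
        yield item


def leader_merge(partitions, queue_size=4):
    """Merge the workers' sorted streams while the workers are still running."""
    queues = [queue.Queue(maxsize=queue_size) for _ in partitions]
    threads = [
        threading.Thread(target=worker, args=(part, q), daemon=True)
        for part, q in zip(partitions, queues)
    ]
    for t in threads:
        t.start()
    # heapq.merge is lazy: it reads one item from each input, then yields
    # the smallest and refills only from the input it just consumed.
    yield from heapq.merge(*(drain(q) for q in queues))
    for t in threads:
        t.join()


if __name__ == "__main__":
    print(list(leader_merge([[5, 1, 9], [4, 8, 2, 7], [6, 3]])))
```

Because the queues are bounded, a worker that runs ahead of the leader
simply blocks in put() until the leader catches up, so memory use stays
fixed per worker.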

- Heikki