Re: Parallel tuplesort (for parallel B-Tree index creation) - Mailing list pgsql-hackers

From	Peter Geoghegan
Subject	Re: Parallel tuplesort (for parallel B-Tree index creation)
Date	November 10, 2016 01:03:23
Msg-id	CAM3SWZSBseLRRPFT8wktML57QAFayt2e1zeaZVJp7bZg6a4gTg@mail.gmail.com
In response to	Re: Parallel tuplesort (for parallel B-Tree index creation) (Peter Geoghegan <pg@heroku.com>)
List	pgsql-hackers

On Wed, Nov 9, 2016 at 4:54 PM, Peter Geoghegan <pg@heroku.com> wrote:
> It's more complicated than that. As I said, I think that Knuth
> basically had it right with his sweet spot of 7. I think that commit
> df700e6b40195d28dc764e0c694ac8cef90d4638 was effective in large part
> because a one-pass merge avoided certain overheads not inherent to
> polyphase merge, like all that memory accounting stuff, extra palloc()
> traffic, etc. The expanded use of per tape buffering we have even in
> multi-pass cases likely makes that much less true for us these days.

Also, logtape.c fragmentation made multiple merge pass cases
experience increased random I/O in a way that was only an accident of
our implementation. We've fixed that now, but that problem must have
added further cost that df700e6b40195d28dc764e0c694ac8cef90d4638
*masked* when it was committed in 2006. (I do think that the problem
with the merge heap maintenance fixed recently in
24598337c8d214ba8dcf354130b72c49636bba69 was the biggest problem that
the 2006 work masked, though.)
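
To make the "how many merge passes" trade-off concrete, here is a rough
sketch of how the pass count depends on merge order. It is not
tuplesort.c code: it models a plain balanced merge rather than
polyphase merge, it ignores I/O and memory accounting entirely, and
merge_passes is a hypothetical helper name used only for illustration.

```python
def merge_passes(num_runs: int, merge_order: int) -> int:
    """Count the passes a balanced merge needs to reduce num_runs to one run.

    Assumption: every pass merges groups of up to merge_order runs into a
    single run each. Polyphase merge distributes runs differently, so this
    is only an approximation of the number of passes it would need.
    """
    if num_runs < 1:
        raise ValueError("num_runs must be at least 1")
    if merge_order < 2:
        raise ValueError("merge_order must be at least 2")

    passes = 0
    while num_runs > 1:
        # Integer ceiling division: each group of merge_order runs
        # becomes one longer run in the next pass.
        num_runs = (num_runs + merge_order - 1) // merge_order
        passes += 1
    return passes
```

Under those assumptions, 1000 initial runs take 4 passes at a merge
order of 7, 2 passes at a merge order of 32, and a single pass at a
merge order of 1000. Fewer passes only win if the per-pass overheads
discussed above do not grow faster than the number of passes shrinks.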

-- 
Peter Geoghegan
