Home > mailing lists

Re: Parallel tuplesort (for parallel B-Tree index creation) - Mailing list pgsql-hackers

From	Peter Geoghegan
Subject	Re: Parallel tuplesort (for parallel B-Tree index creation)
Date	September 10, 2016 18:10:59
Msg-id	CAM3SWZTtaxXn3P0hkDQh3S7WA_-9_3qfhqRhpZfcBCAQd9OhsA@mail.gmail.com Whole thread Raw
In response to	Re: Parallel tuplesort (for parallel B-Tree index creation) (Claudio Freire <klaussfreire@gmail.com>)
List	pgsql-hackers

Tree view

On Fri, Sep 9, 2016 at 5:22 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> Since it is true that doing so would make it impossible to keep the
> asserts about tupindex in tuplesort_heap_root_displace, I guess it
> depends on how useful those asserts are (ie: how likely it is that
> those conditions could be violated, and how damaging it could be if
> they were). If it is decided the refactor is desirable, I'd suggest
> making the common siftup producedure static inline, to allow
> tuplesort_heap_root_displace to inline and specialize it, since it
> will be called with checkIndex=False and that simplifies the resulting
> code considerably.

Right. I want to keep it as a separate function for all these reasons.
I also think that I'll end up further optimizing what I've called
tuplesort_heap_root_displace in the future, to adopt to clustered
input. I'm thinking of something like Timsort's "galloping mode". What
I've come up with here still needs 2 comparisons and a swap per call
for presorted input. There is still a missed opportunity for clustered
or (inverse) correlated input -- we can make merging opportunistically
skip ahead to determine that the root tape's 100th tuple (say) would
still fit in the root position of the merge minheap. So, immediately
return 100 tuples from the root's tape without bothering to compare
them to anything. Do a binary search to find the best candidate
minheap root before the 100th tuple if a guess of 100 doesn't work
out. Adapt to trends. Stuff like that.

-- 
Peter Geoghegan

pgsql-hackers by date:

From: Tom Lane
Date: 10 September 2016, 16:51:12
Subject: Re: Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn

From: Andrew Borodin
Date: 10 September 2016, 18:19:24
Subject: Re: GiST penalty functions [PoC]

Re: Parallel tuplesort (for parallel B-Tree index creation) - Mailing list pgsql-hackers

Previous

Next