Home > mailing lists

Re: Is tuplesort_heap_siftup() a misnomer? - Mailing list pgsql-hackers

From	Peter Geoghegan
Subject	Re: Is tuplesort_heap_siftup() a misnomer?
Date	September 8, 2016 17:19:15
Msg-id	CAM3SWZTEENdg1TvjOLRpDk0sA2tPDc9tNrk6_Oq-GjCvd2f9Yw@mail.gmail.com Whole thread Raw
In response to	Re: Is tuplesort_heap_siftup() a misnomer? (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses	Re: Is tuplesort_heap_siftup() a misnomer? Re: Is tuplesort_heap_siftup() a misnomer?
List	pgsql-hackers

Tree view

On Thu, Sep 8, 2016 at 12:01 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> I still think tuplesort_heap_siftup is a confusing name, although I'm not
> sure that Peter's "compact" is much better. I suggest that we rename it to
> tuplesort_heap_delete_top(). In comments within the function, explain that
> the *loop* corresponds to the "siftup" in Knuth's book.

I'm also fine with that name.

> Interestingly, as Knuth explains the siftup algorithm, it performs a
> "replace the top" operation, rather than "remove the top". But we use it to
> remove the top, by moving the last element to the top and running the
> algorithm after that. Peter's other patch, to change the way we currently
> replace the top node, to do it in one pass rather than as delete+insert, is
> actually Knuth's siftup algorithm.

Knuth must have a strange sense of humor.

> Peter, looking at your "displace" patch in this light, I think
> tuplesort_heap_root_displace() and tuplesort_heap_delete_top() (as I'm
> calling it now), should share a common subroutine. Displace replaces the top
> node with the new node passed as argument, and then calls the subroutine to
> push it down to the right place. Delete_top replaces the top node with the
> last value in the heap, and then calls the subroutine. Or perhaps delete_top
> should remove the last value from the heap, and then call displace() with
> it, to re-insert it.

I can see the value in that, but I'd still rather not. The code reuse
win isn't all that large, and having a separate
tuplesort_heap_root_displace() routine allows us to explain what's
going on for merging, to assert what we're able to assert with tape
numbers vs. heap position, and to lose the HEAPCOMPARE()/checkIndex
cruft that the existing routines need (if only barely) to support
replacement selection. I would be surprised if with a very tight inner
loop like this, HEAPCOMPARE() has an appreciable overhead (maybe it
increases pipeline stalls).

-- 
Peter Geoghegan

pgsql-hackers by date:

From: Claudio Freire
Date: 08 September 2016, 17:18:55
Subject: Re: Parallel tuplesort (for parallel B-Tree index creation)

From: Peter Geoghegan
Date: 08 September 2016, 17:23:22
Subject: Re: Parallel tuplesort (for parallel B-Tree index creation)

Re: Is tuplesort_heap_siftup() a misnomer? - Mailing list pgsql-hackers

Previous

Next