On 06/02/14 18:21, Jeff Janes wrote:
> Did you try sorting already-sorted, reverse
> sorted, or pipe-organ shaped data sets? We will also need to test it on
> strings. I usually use md5(random()::text) to generate strings for such
> purposes, at least for a first pass.
Attached is version 2 of the patch, which fixes the performance on
constant-input.
Summary (low numbers better, vs 858ec11):
Random ints: 83% compares, level on time.
Sorted ints: level compares, 70% time.
Reverse-sorted ints: 10% compares, 15% time (!)
Constant ints: level compares, 75% time
Pipe-organ ints: 80% compares, 107% time
Random text: 83% compares, 106% time
> Another optimization that is possible is to do only one comparison at each
> level (between the two children) all the way down the heap, and then
> compare the parent to the chain of smallest children in reverse order.
> This can substantially reduce the number of comparisons on average,
> because most tuples sift most of the way down the heap (because most of the
> tuples are at the bottom of the heap). Robert submitted a patch to do this
> in the main siftdown routine (which for some reason is
> named tuplesort_heap_siftup, rather than siftdown), and I found it a
> substantial improvement but it never got pushed forward. I think Robert
> was concerned it might make some obscure cases worse rather than better,
> and no one put it through enough serious testing to overcome that concern.
I've done an implementation of this (also attached: siftdown_bounce).
Summary:
Random ints: 72% compares, 104% time.
Sorted ints: 200% compares, 270% time. (ouch)
Reverse-sorted ints: 7% compares, 12% time
Constant ints: 150% compares, 275% time
Pipe-organ ints: 93% compares, 115% time
Random text: 72% compares, 91% time
- I don't like the performance on already-sorted input. Thinking
into this, a sorted array is already a well-formed heap so optimal
behaviour is a single compare per item (lacking the global knowledge).
This "bounce" method will always do a full walk down the tree and
back, hence the poor behaviour. Unless the planner can tell
the sort when sorted input is possible I don't think we dare use
this despite the benefit on random input.
The tests were run with an instrumented variant of each patch, counting
compares and time for just the heapify portion of the inittapes()
routine. Test script (jgh.sql and results spreadsheet) also attached.
--
Cheers,
Jeremy