> That's worth trying out. It might also then be worth trying to push both unordered values -- the big one up / the
smallone down. I've seen other implementations do that, but don't remember where, or what it's called.
It is important that we do not do 2 compares two avoid one copy (assignment to temporary) as you did in your patch
earlierin this thread, cause compares are usually pretty costly (also two compares are then inlined, bloating the
binary).
Assigning a sort tuple to a temporary translates to pretty simple assembly code, so my suggested modification should
outperform.It cuts down the cost of the inner loop by ca. 40% comparing the assembly. And it avoids having to re-read
memoryon each comparison, as the temporary can be held in registers.
> "Unbounded" means no bounds check on the array. That's not possible in its current form, so I think you misunderstood
something.
Sorry for the confusion. I didn't mean unbounded in the "array bound checking" sense, but in the "unrestricted number
ofloops" sense.
> I only remember implementations tracking loop iterations, not swaps. You'd need evidence that this is better.
Well, the idea was to include the presorted check somehow. Stopping after a certain number of iterations is surely more
safethan stopping after a number of swaps, but we would then implicitly also stop our presort check. We could change
thatthough: Count loop iterations and on bailout continue with a pure presort check, but from the last position of the
insertionsort -- not all over again -- by comparing against the maximum value recorded during the insertion sort.
Thoughts?
> An important part not mentioned yet: This might only be worth doing if the previous recursion level performed no
swapsduring partitioning and the current pivot candidates are ordered.
Agreed.
> It's currently 7, but should really be something like 10. A script that repeats tests for, say, 7 through 18 should
showa concave-up shape in the times. The bottom of the bowl should shift to higher values, and that minimum is what
shouldbe compared.
Yeah, as alluded to before, it should be closer to 10 nowadays.
In any case it should be changed at least from 7 to 8, cause then we could at least discard the additional check for n
>7 in the quicksort code path (see /src/include/lib/sort_template.h#L322). Currently we check n < 7 and a few lines
downwe check for n > 7, if we check n < 8 for insertion sort then the second check becomes obsolete.
Benjamin Coutu
http://www.zeyos.com