Re: Insertion Sort Improvements - Mailing list pgsql-hackers

From John Naylor
Subject Re: Insertion Sort Improvements
Date
Msg-id CAFBsxsHPrvpqfNpHHC6R9fTe_us=AJ4_REWdUO9NuHcwPFF1tA@mail.gmail.com
Whole thread Raw
In response to Insertion Sort Improvements  (Benjamin Coutu <ben.coutu@zeyos.com>)
Responses Re: Insertion Sort Improvements
List pgsql-hackers
On Thu, Aug 25, 2022 at 5:55 PM Benjamin Coutu <ben.coutu@zeyos.com> wrote:
>
> Hello,
>
> Inspired by the recent discussions[1][2] around sort improvements, I took a look around the code and noticed the use
ofa somewhat naive version of insertion sort within the broader quicksort code.
 
>
> The current implementation (see sort_template.h) is practically the textbook version of insertion sort:

Right. I think it's worth looking into. When I tested dual-pivot
quicksort, a threshold of 18 seemed best for my inputs, so making
insertion sort more efficient could tilt the balance more in favor of
dual-pivot. (It already seems slightly ahead, but as I mentioned in
the thread you linked, removing branches for null handling would make
it more compelling).

> DO_ASSIGN and DO_COPY macros would have to be declared analogue to DO_SWAP via the template.

I don't think you need these macros, since this optimization is only
convenient if you know the type at compile time. See the attached,
which I had laying around when I was looking at PDQuicksort. I believe
it's similar to what you have in mind. (Ignore the part about
"unguarded", it's irrelevant out of context). Notice that it avoids
unnecessary copies, but has two calls to DO_COMPARE, which is *not*
small for tuples.

> Anyways, there might be some low hanging fruit here. If it turns out to be significantly faster one might even
considerincreasing the insertion sort threshold from < 7 to < 10 sort elements. But that is a whole other discussion
foranother day.
 

I think it belongs around 10 now anyway. I also don't think you'll see
much benefit with your proposal at a threshold of 7 -- I suspect it'd
be more enlightening to test a range of thresholds with and without
the patch, to see how the inflection point shifts. That worked pretty
well when testing dual-pivot.

-- 
John Naylor
EDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: making relfilenodes 56 bits
Next
From: Justin Pryzby
Date:
Subject: Re: Bump MIN_WINNT to 0x0600 (Vista) as minimal runtime in 16~