Re: A qsort template - Mailing list pgsql-hackers

From John Naylor
Subject Re: A qsort template
Date
Msg-id CAFBsxsFVG8-Q-98C__N=brinC6pAFs7kGjAZj5LO77M8VmTn-Q@mail.gmail.com
Whole thread Raw
In response to Re: A qsort template  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: A qsort template
List pgsql-hackers
On Tue, Jun 29, 2021 at 2:56 AM Thomas Munro <thomas.munro@gmail.com> wrote:

> Here's a version that includes a rather hackish test module that you
> might find useful to explore various weird effects.  Testing sorting
> routines is really hard, of course... there's a zillion parameters and
> things you could do in the data and cache effects etc etc.  One of the

That module is incredibly useful!

Yeah, while brushing up on recent findings on sorting, it's clear there's a huge amount of options with different tradeoffs. I did see your tweet last year about the "small sort" threshold that was tested on a VAX machine, but hadn't given it any thought til now. Looking around, I've seen quite a range, always with the caveat of "it depends". A couple interesting variations:

Golang uses 12, with an extra tweak:

// Do ShellSort pass with gap 6
// It could be written in this simplified form cause b-a <= 12
for i := a + 6; i < b; i++ {
    if data.Less(i, i-6) {
        data.Swap(i, i-6)
    }
}
insertionSort(data, a, b)

Andrei Alexandrescu gave a couple talks discussing the small-sort part of quicksort, and demonstrated a ruthlessly-optimized make-heap + unguarded-insertion-sort, using a threshold of 256. He reported a 6% speed-up sorting a million doubles, IIRC:


That might not be workable for us, but it's a fun talk. 

> main things that jumps out pretty clearly though with these simple
> tests is that sorting 6 byte ItemPointerData objects is *really slow*
> compared to more natural object sizes (look at the times and the
> MEMORY values in the scripts).  Another is that specialised sort
> functions are much faster than traditional qsort (being one of the
> goals of this exercise).  Sadly, the 64 bit comparison technique is
> not looking too good in the output of this test.

One of the points of the talk I linked to is "if doing the sensible thing makes things worse, try something silly instead".

Anyway, I'll play around with the scripts and see if something useful pops out.

--
John Naylor
EDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Dump public schema ownership & seclabels
Next
From: Daniel Gustafsson
Date:
Subject: Re: Have I found an interval arithmetic bug?