Home > mailing lists

Re: A qsort template - Mailing list pgsql-hackers

From	John Naylor
Subject	Re: A qsort template
Date	June 29, 2021 16:40:53
Msg-id	CAFBsxsFVG8-Q-98C__N=brinC6pAFs7kGjAZj5LO77M8VmTn-Q@mail.gmail.com Whole thread Raw
In response to	Re: A qsort template (Thomas Munro <thomas.munro@gmail.com>)
Responses	Re: A qsort template
List	pgsql-hackers

Tree view

On Tue, Jun 29, 2021 at 2:56 AM Thomas Munro <thomas.munro@gmail.com> wrote:

> Here's a version that includes a rather hackish test module that you
> might find useful to explore various weird effects. Testing sorting
> routines is really hard, of course... there's a zillion parameters and
> things you could do in the data and cache effects etc etc. One of the

That module is incredibly useful!

Yeah, while brushing up on recent findings on sorting, it's clear there's a huge amount of options with different tradeoffs. I did see your tweet last year about the "small sort" threshold that was tested on a VAX machine, but hadn't given it any thought til now. Looking around, I've seen quite a range, always with the caveat of "it depends". A couple interesting variations:

Golang uses 12, with an extra tweak:

// Do ShellSort pass with gap 6
// It could be written in this simplified form cause b-a <= 12
for i := a + 6; i < b; i++ {
if data.Less(i, i-6) {
data.Swap(i, i-6)
}
}

insertionSort(data, a, b)

Andrei Alexandrescu gave a couple talks discussing the small-sort part of quicksort, and demonstrated a ruthlessly-optimized make-heap + unguarded-insertion-sort, using a threshold of 256. He reported a 6% speed-up sorting a million doubles, IIRC:

video: https://www.youtube.com/watch?v=FJJTYQYB1JQ

slides: https://github.com/CppCon/CppCon2019/blob/master/Presentations/speed_is_found_in_the_minds_of_people/speed_is_found_in_the_minds_of_people__andrei_alexandrescu__cppcon_2019.pdf

That might not be workable for us, but it's a fun talk.

> main things that jumps out pretty clearly though with these simple
> tests is that sorting 6 byte ItemPointerData objects is *really slow*
> compared to more natural object sizes (look at the times and the
> MEMORY values in the scripts). Another is that specialised sort
> functions are much faster than traditional qsort (being one of the
> goals of this exercise). Sadly, the 64 bit comparison technique is
> not looking too good in the output of this test.

One of the points of the talk I linked to is "if doing the sensible thing makes things worse, try something silly instead".

Anyway, I'll play around with the scripts and see if something useful pops out.

--
John Naylor
EDB: http://www.enterprisedb.com

pgsql-hackers by date:

From: Tom Lane
Date: 29 June 2021, 16:32:08
Subject: Re: Dump public schema ownership & seclabels

From: Daniel Gustafsson
Date: 29 June 2021, 16:49:45
Subject: Re: Have I found an interval arithmetic bug?

Re: A qsort template - Mailing list pgsql-hackers

Previous

Next