Home > mailing lists

Re: A qsort template - Mailing list pgsql-hackers

From	Thomas Munro
Subject	Re: A qsort template
Date	July 1, 2021 22:09:39
Msg-id	CA+hUKG+S5SMoG8Z2PHj0bsK70CxVLgqQR1orQJq6Cjgibu26vA@mail.gmail.com Whole thread
In response to	Re: A qsort template (John Naylor <john.naylor@enterprisedb.com>)
Responses	Re: A qsort template
List	pgsql-hackers

Tree view

On Fri, Jul 2, 2021 at 4:39 AM John Naylor <john.naylor@enterprisedb.com> wrote:
> For item pointers, it made sense to try doing math to reduce the number of branches. That made things worse, so let's
trythe opposite: Increase the number of branches so we do less math. In the attached patch (applies on top of your 0012
anda .txt to avoid confusing the CF bot), I test a new comparator with this approach, and also try a wider range of
thresholds.The thresholds don't seem to make any noticeable difference with this data type, but the new comparator
(cmp=idsbelow) gives a nice speedup in this test: 

> NOTICE:  [traditional qsort] order=random, threshold=7, cmp=32, test=0, time=4.964657

> NOTICE:  order=random, threshold=7, cmp=std, test=0, time=2.810627

> NOTICE:  order=random, threshold=7, cmp=ids, test=0, time=1.692711

Oooh.  So, the awkwardness of the 64 maths with unaligned inputs (even
though we obtain all inputs with 16 bit loads) was hurting, and you
realised the same sort of thing might be happening also with the 32
bit version and went the other way.  (It'd be nice to understand
exactly why.)

I tried your 16 bit comparison version on Intel, AMD and Apple CPUs
and the results were all in the same ballpark.  For random input, I
see something like ~1.7x speedup over traditional qsort from
specialising (cmp=std), and ~2.7x from going 16 bit (cmp=ids).  For
increasing and decreasing input, it's ~2x speedup from specialising
and ~4x speedup from going 16 bit.  Beautiful.

One thing I'm wondering about is whether it's worth having stuff to
support future experimentation like ST_SORT_SMALL_THRESHOLD and
ST_COMPARE_RET_TYPE in the tree, or whether we should pare it back to
the minimal changes that definitely produce results.  I think I'd like
to keep those changes: even if it may be some time, possibly an
infinite amount, before we figure out how to tune the thresholds
profitably, giving them names instead of using magic numbers seems
like progress.

The Alexandrescu talk was extremely entertaining, thanks.

pgsql-hackers by date:

From: Daniel Gustafsson
Date: 01 July 2021, 21:40:27
Subject: Re: SSL/TLS instead of SSL in docs

From: Hywel Carver
Date: 01 July 2021, 22:56:37
Subject: Re: Removing unneeded self joins

Re: A qsort template - Mailing list pgsql-hackers

Previous

Next