Re: A qsort template - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: A qsort template
Date
Msg-id CA+hUKGJRbzaAOUtBUcjF5hLtaSHnJUqXmtiaLEoi53zeWSizeA@mail.gmail.com
Whole thread Raw
In response to Re: A qsort template  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: A qsort template
Re: A qsort template
Re: A qsort template
Re: A qsort template
List pgsql-hackers
Hi,

David Rowley privately reported a performance regression when sorting
single ints with a lot of duplicates, in a case that previously hit
qsort_ssup() but now hits qsort_tuple_int32() and then has to call the
tiebreaker comparator.  Note that this comes up only for sorts in a
query, not for eg index builds which always have to tiebreak on item
ptr.  I don't have data right now but that'd likely be due to:

+ * XXX: For now, there is no specialization for cases where datum1 is
+ * authoritative and we don't even need to fall back to a callback at all (that
+ * would be true for types like int4/int8/timestamp/date, but not true for
+ * abbreviations of text or multi-key sorts.  There could be!  Is it worth it?

Upthread we were discussing which variations it'd be worth investing
extra text segment space on to gain speedup and we put those hard
decisions off for future work, but on reflection, we probably should
tackle this particular point to avoid a regression.  I think something
like the attached achieves that (draft, not tested much yet, could
perhaps find a tidier way to code the decision tree).  In short:
variants qsort_tuple_{int32,signed,unsigned}() no longer fall back,
but new variants qsort_tuple_{int32,signed,unsigned}_tiebreak() do.

We should perhaps also reconsider the other XXX comment about finding
a way to skip the retest of column 1 in the tiebreak comparator.
Perhaps you'd just install a different comparetup function, eg
comparetup_index_btree_tail (which would sharing code), so no need to
multiply specialisations for that.

Planning to look at this more closely after I've sorted out some other
problems, but thought I'd post this draft/problem report early in case
John or others have thoughts or would like to run some experiments.

Attachment

pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: Schema variables - new implementation for Postgres 15+1
Next
From: Peter Geoghegan
Date:
Subject: Re: Improving btree performance through specializing by key shape, take 2