Re: Why do we still perform a check for pre-sorted input within qsort variants? - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Why do we still perform a check for pre-sorted input within qsort variants?
Date
Msg-id CA+TgmoY9QuVV5m90U0+VuhJais3yyq885te=5aipojXaXToafQ@mail.gmail.com
Whole thread Raw
In response to Re: Why do we still perform a check for pre-sorted input within qsort variants?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Why do we still perform a check for pre-sorted input within qsort variants?
List pgsql-hackers
On Mon, Feb 25, 2013 at 5:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Geoghegan <peter.geoghegan86@gmail.com> writes:
>> In the past, Robert and I have criticised the fact that our qsort
>> implementation (and the various specialisations thereof) each perform
>> a check for pre-sorted input. This check does not appear in the
>> original NetBSD qsort that we lifted our implementation from, and
>> presumably isn't described by the document 'Qsort routine from Bentley
>> & McIlroy's "Engineering a Sort Function"' that that implementation is
>> based on.
>
> FWIW, I've been suspicious of that pre-sorted check since the day it
> went in.  Bentley was my faculty adviser for awhile in grad school,
> and I know him to be *way* too smart to have missed anything as simple
> as that.  But I didn't have hard evidence on which to object to it
> at the time, and indeed testing seemed to say it was a good idea:
> http://www.postgresql.org/message-id/18732.1142967137@sss.pgh.pa.us
>
> If you can provide a refutation I will gladly see it out of there.

The thing that bothers me about that check (which may be different
than the thing that bothers Peter) is that we do it recursively at
every level.  I have a sneaking suspicion that it only makes sense to
do that check for subpartitions above a certain size.  I'm perfectly
willing to grant that presorted data is a sufficiently important
special case that we ought to check for it once, and I also suspect
it's a good idea to check large partitions so as to speed up the
sorting of almost-ordered data.  But I'm skeptical that it makes sense
to check it at every recursive level.

I did attempt to do some tinkering with this while I was playing with
it, but I didn't come up with anything really compelling.  You can
reduce the number of comparisons on particular workloads by tinkering
with the algorithm, but then somebody else ends up doing more
comparisons, so it's hard to say whether you've really made things
better.  Or at least I found it so.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Why do we still perform a check for pre-sorted input within qsort variants?
Next
From: Robert Haas
Date:
Subject: Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)