Why do we still perform a check for pre-sorted input within qsort variants? - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Why do we still perform a check for pre-sorted input within qsort variants?
Date
Msg-id CAEYLb_Xn4-6f1ofsf2qduf24dDCVHbQidt7JPpdL_RiT1zBJ6A@mail.gmail.com
Whole thread Raw
Responses Re: Why do we still perform a check for pre-sorted input within qsort variants?
List pgsql-hackers
In the past, Robert and I have criticised the fact that our qsort
implementation (and the various specialisations thereof) each perform
a check for pre-sorted input. This check does not appear in the
original NetBSD qsort that we lifted our implementation from, and
presumably isn't described by the document 'Qsort routine from Bentley
& McIlroy's "Engineering a Sort Function"' that that implementation is
based on. Note that the presorted check occurs on each and every
iteration, and not just the first, which was felt to be the really
objectionable point [1]. As noted within qsort_arg.c:
*      Remove ill-considered "swap_cnt" switch to insertion sort,*      in favor of a simple check for presorted
input.

Looking at the archives from around the period that this was committed
(October 2006), there doesn't seem to have been any discussion of this
particular alteration. Our qsort implementation seems to have rather
bad worse-case performance [2] that I do not believe can be accounted
for by the usual ways in which quicksort is known to "go quadratic" -
we already take various measures against those that make them very
unlikely.

It looks like Tom was quite right about the swap_cnt optimisation
(it's ill-considered), because the NetBSD people recognised this fact
in 2009 [3], almost 3 years later. They didn't decide to do something
else instead, though.

I hope to avoid starting an epic thread on this at this point in the
cycle. I'd just like to enquire if it's something that's been given
any thought recently.

[1] http://www.postgresql.org/message-id/CA+TgmoaGCyGKUN1a6BfYoORbanw5pZZUFv-VASWBumD8T5fdEw@mail.gmail.com

[2] http://www.postgresql.org/message-id/CAEYLb_W++UhrcWprzG9TyBVF7Sn-c1s9oLbABvAvPGdeP2DFSQ@mail.gmail.com
(A contrived test-case sees qsort_arg perform 2,247,314 comparisons to
Timsort's 60,351)

[3]
http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/stdlib/qsort.c?rev=1.20&content-type=text/x-cvsweb-markup&only_with_tag=MAIN

-- 
Regards,
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Kevin Hale Boyes
Date:
Subject: Re: [DOCS] Contrib module "xml2" status
Next
From: Craig Ringer
Date:
Subject: Re: Show type in psql SELECT