Thread: qsort, once again

qsort, once again

From
Tom Lane
Date:
I was just looking at the behavior of src/port/qsort.c on the test case
that Jerry Sievers was complaining about in pgsql-admin this morning.
I found out what the real weak spot is: it's got nothing directly to do
with good or bad pivots, it's this code right here:
   if (swap_cnt == 0)   {                            /* Switch to insertion sort */       for (pm = (char *) a + es; pm
<(char *) a + n * es; pm += es)           for (pl = pm; pl > (char *) a && cmp(pl - es, pl) > 0;                pl -=
es)              swap(pl, pl - es);       return;   }
 

In other words, if qsort hits a subfile for which the chosen pivot is a
perfect pivot (no swaps are necessary), it switches to insertion sort.
Which is O(N^2).  In Jerry's test case this happens on a subfile of
736357 elements, and you can say goodnight to that process ....

What I'm thinking is that we ought to have a limit on this, ie not
switch to insertion sort if n is larger than 1000 or so, ie

-    if (swap_cnt == 0)
+    if (swap_cnt == 0 && n < 1000)

I'm wondering what the authors were expecting the insertion sort to
handle exactly.  Does anyone have a copy of the paper that's referenced
in the code comment?

/** Qsort routine from Bentley & McIlroy's "Engineering a Sort Function".*/

I tried looking for this at ACM but they seem not to have it.
        regards, tom lane


Re: qsort, once again

From
"Jonah H. Harris"
Date:
On 3/16/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm wondering what the authors were expecting the insertion sort to
handle exactly.  Does anyone have a copy of the paper that's referenced
in the code comment?

/*
* Qsort routine from Bentley & McIlroy's "Engineering a Sort Function".
*/

Yes, I have it somewhere, let me dig it up for ya. 


--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324

Re: qsort, once again

From
"Dann Corbit"
Date:

I sent him  a copy

 


From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Jonah H. Harris
Sent: Thursday, March 16, 2006 11:43 AM
To: Tom Lane
Cc: pgsql-hackers@postgresql.org; Jerry Sievers
Subject: Re: [HACKERS] qsort, once again

 

On 3/16/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm wondering what the authors were expecting the insertion sort to
handle exactly.  Does anyone have a copy of the paper that's referenced
in the code comment?

/*
* Qsort routine from Bentley & McIlroy's "Engineering a Sort Function".
*/


Yes, I have it somewhere, let me dig it up for ya. 



--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324

Re: qsort, once again

From
"Jonah H. Harris"
Date:
doh!

On 3/16/06, Dann Corbit <DCorbit@connx.com> wrote:

I sent him  a copy

 


From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Jonah H. Harris
Sent: Thursday, March 16, 2006 11:43 AM
To: Tom Lane
Cc: pgsql-hackers@postgresql.org; Jerry Sievers
Subject: Re: [HACKERS] qsort, once again

 

On 3/16/06, Tom Lane < tgl@sss.pgh.pa.us> wrote:

I'm wondering what the authors were expecting the insertion sort to
handle exactly.  Does anyone have a copy of the paper that's referenced
in the code comment?

/*
* Qsort routine from Bentley & McIlroy's "Engineering a Sort Function".
*/


Yes, I have it somewhere, let me dig it up for ya. 



--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324




--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324

Re: qsort, once again

From
Tom Lane
Date:
"Jonah H. Harris" <jonah.harris@gmail.com> writes:
> doh!

> On 3/16/06, Dann Corbit <DCorbit@connx.com> wrote:
>> I sent him  a copy

Got it, thanks guys ...
        regards, tom lane


Re: qsort, once again

From
"Jonah H. Harris"
Date:
On 3/16/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Jonah H. Harris" <jonah.harris@gmail.com> writes:
> doh!

Sorry for the duplicate (again) Tom...  I wish Gmail would auto-reply to all.  grr.  Anyway, this is copied for others and the archives:

There's a program I had used in the past to test qsort implementations on McIlroy's site:

http://www.cs.dartmouth.edu/~doug/aqsort.c

Just thought I'd share it since we seem to be talking sorts a lot :)


--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324

Re: qsort, once again

From
Tom Lane
Date:
"Dann Corbit" <DCorbit@connx.com> writes:
> I sent him  a copy

Thanks.  This is really interesting: the switch to insertion sort on
perfect pivot is simply not there in Bentley & McIlroy's paper.  So
it was added later, and evidently not tested as carefully as it should
have been.  At this point I'm more than half tempted to take it out
entirely.

So we still have a problem of software archaeology: who added the
insertion sort switch to the NetBSD version, and on what grounds?
        regards, tom lane


Re: qsort, once again

From
"Dann Corbit"
Date:
I have seen a lot of dumb "fixes" to sort routines.

In a commercial sort function I saw some time ago, the check for a
condition that causes qsort to go quadratic was removed.  There was a
comment there that said the engineer "didn't see any improvement in
performance".  Of course, the check was not there to improve performance
but to prevent disaster.

Obviously, he failed to check some very important data sets.  If a sort
routine cannot handle (among other things):
1.  Already sorted
2.  Almost sorted
3.  Reverse sorted
4.  Organ pipe ( the inspiration for "Engineering a Sort Function" )
5.  Small number of distinct values

Then it will cause problems in real-life use.

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Thursday, March 16, 2006 12:09 PM
> To: Dann Corbit
> Cc: Jonah H. Harris; pgsql-hackers@postgresql.org; Jerry Sievers
> Subject: Re: [HACKERS] qsort, once again
>
> "Dann Corbit" <DCorbit@connx.com> writes:
> > I sent him  a copy
>
> Thanks.  This is really interesting: the switch to insertion sort on
> perfect pivot is simply not there in Bentley & McIlroy's paper.  So
> it was added later, and evidently not tested as carefully as it should
> have been.  At this point I'm more than half tempted to take it out
> entirely.
>
> So we still have a problem of software archaeology: who added the
> insertion sort switch to the NetBSD version, and on what grounds?
>
>             regards, tom lane


Re: qsort, once again

From
"Jonah H. Harris"
Date:
On 3/16/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
So we still have a problem of software archaeology: who added the
insertion sort switch to the NetBSD version, and on what grounds?


AFAICS, the insertion sort was added in BSD 4.4-lite and was inherited by NetBSD in CVS version 1.1.1.2.

The previous version in NetBSD (before 4.4-lite) also included an insertion sort with the comment:

/*
* Knuth, Vol. 3, page 116, Algorithm Q, step b, argues that a single pass
* of straight insertion sort after partitioning is complete is better than
* sorting each small partition as it is created. This isn't correct in this
* implementation because comparisons require at least one (and often two)
* function calls and are likely to be the dominating expense of the sort.
* Doing a final insertion sort does more comparisons than are necessary
* because it compares the "edges" and medians of the partitions which are
* known to be already sorted.
*
* This is also the reasoning behind selecting a small THRESH value (see
* Knuth, page 122, equation 26), since the quicksort algorithm does less
* comparisons than the insertion sort.
*/


--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324

Re: qsort, once again

From
Tom Lane
Date:
"Jonah H. Harris" <jonah.harris@gmail.com> writes:
> On 3/16/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So we still have a problem of software archaeology: who added the
>> insertion sort switch to the NetBSD version, and on what grounds?

> AFAICS, the insertion sort was added in BSD 4.4-lite and was inherited by
> NetBSD in CVS version 1.1.1.2.

> The previous version in NetBSD (before 4.4-lite) also included an insertion
> sort with the comment:

> /*
>  * Knuth, Vol. 3, page 116, Algorithm Q, step b, argues that a single pass
>  * of straight insertion sort after partitioning is complete is better than
>  * sorting each small partition as it is created.  This isn't correct in th=
> is
>  * implementation because comparisons require at least one (and often two)
>  * function calls and are likely to be the dominating expense of the sort.
>  * Doing a final insertion sort does more comparisons than are necessary
>  * because it compares the "edges" and medians of the partitions which are
>  * known to be already sorted.

I think this is talking about replacing the bottom-level insertion sorts
(the "if n < 7" case at the top of the function) with a single
insertion-sort pass over the whole array at the end of the recursion.
Not really the same consideration.  Bentley & McIlroy specifically
reject that idea for the same reasons mentioned in this comment, but
I see no discussion in the paper of the swap_cnt idea in the BSD code.

I just repeated the random-data test I made here:
http://archives.postgresql.org/pgsql-hackers/2006-02/msg00590.php
using a version of qsort.c with the "if (swap_cnt == 0)" segment
removed entirely (and hence also removing the computation of swap_cnt,
which saves a cycle or two in the inner loops).  CVS tip seems to be
faster at this than what we had on Feb 15, so I also repeated the
test using the unmodified qsort.c.  Here's the results:

100 runs with CVS-tip qsort.c, sorted into ascending runtime order:

Time: 337.520 ms
Time: 338.335 ms
Time: 338.990 ms
Time: 339.116 ms
Time: 339.282 ms
Time: 339.398 ms
Time: 339.457 ms
Time: 339.623 ms
Time: 339.880 ms
Time: 339.929 ms
Time: 340.258 ms
Time: 340.266 ms
Time: 340.431 ms
Time: 341.042 ms
Time: 341.053 ms
Time: 341.224 ms
Time: 341.506 ms
Time: 341.851 ms
Time: 343.298 ms
Time: 344.750 ms
Time: 347.685 ms
Time: 350.444 ms
Time: 354.230 ms
Time: 361.638 ms
Time: 383.529 ms
Time: 386.735 ms
Time: 389.246 ms
Time: 389.675 ms
Time: 395.183 ms
Time: 406.176 ms
Time: 408.327 ms
Time: 444.531 ms
Time: 507.625 ms
Time: 513.625 ms
Time: 527.424 ms
Time: 530.318 ms
Time: 537.642 ms
Time: 549.844 ms
Time: 554.505 ms
Time: 561.502 ms
Time: 590.159 ms
Time: 606.344 ms
Time: 661.434 ms
Time: 686.666 ms
Time: 697.818 ms
Time: 717.949 ms
Time: 786.298 ms
Time: 843.092 ms
Time: 855.292 ms
Time: 869.975 ms
Time: 874.331 ms
Time: 887.904 ms
Time: 1027.981 ms
Time: 1053.258 ms
Time: 1101.546 ms
Time: 1304.076 ms
Time: 1318.657 ms
Time: 1743.875 ms
Time: 1986.987 ms
Time: 2045.737 ms
Time: 2192.382 ms
Time: 2263.969 ms
Time: 2328.861 ms
Time: 2381.596 ms
Time: 2389.563 ms
Time: 2452.888 ms
Time: 2503.476 ms
Time: 2575.836 ms
Time: 2597.095 ms
Time: 2645.894 ms
Time: 2686.030 ms
Time: 2932.877 ms
Time: 3009.602 ms
Time: 3077.843 ms
Time: 3106.678 ms
Time: 3213.136 ms
Time: 3279.999 ms
Time: 3674.831 ms
Time: 3707.489 ms
Time: 3765.390 ms
Time: 3782.216 ms
Time: 3972.455 ms
Time: 3974.928 ms
Time: 4344.905 ms
Time: 4800.817 ms
Time: 4837.485 ms
Time: 4944.734 ms
Time: 5172.095 ms
Time: 5434.390 ms
Time: 5599.830 ms
Time: 5608.934 ms
Time: 5684.670 ms
Time: 5740.895 ms
Time: 6112.534 ms
Time: 7521.372 ms
Time: 7609.963 ms
Time: 7636.002 ms
Time: 8090.015 ms
Time: 8805.475 ms
Time: 9244.463 ms

100 runs with modified qsort.c:

Time: 378.068 ms
Time: 379.056 ms
Time: 379.485 ms
Time: 379.486 ms
Time: 379.612 ms
Time: 379.832 ms
Time: 379.981 ms
Time: 380.643 ms
Time: 380.824 ms
Time: 380.870 ms
Time: 380.920 ms
Time: 381.215 ms
Time: 381.441 ms
Time: 381.475 ms
Time: 381.848 ms
Time: 381.895 ms
Time: 381.969 ms
Time: 381.993 ms
Time: 382.034 ms
Time: 382.202 ms
Time: 382.361 ms
Time: 382.585 ms
Time: 382.692 ms
Time: 382.734 ms
Time: 382.929 ms
Time: 382.985 ms
Time: 383.107 ms
Time: 383.335 ms
Time: 383.505 ms
Time: 383.514 ms
Time: 383.635 ms
Time: 383.674 ms
Time: 383.704 ms
Time: 383.744 ms
Time: 383.821 ms
Time: 383.832 ms
Time: 383.918 ms
Time: 383.922 ms
Time: 384.125 ms
Time: 384.127 ms
Time: 384.394 ms
Time: 384.500 ms
Time: 384.517 ms
Time: 384.517 ms
Time: 384.590 ms
Time: 384.615 ms
Time: 384.620 ms
Time: 384.678 ms
Time: 384.705 ms
Time: 384.715 ms
Time: 384.786 ms
Time: 384.957 ms
Time: 385.386 ms
Time: 385.537 ms
Time: 385.602 ms
Time: 385.685 ms
Time: 385.800 ms
Time: 385.832 ms
Time: 385.845 ms
Time: 385.846 ms
Time: 385.910 ms
Time: 386.094 ms
Time: 386.270 ms
Time: 386.360 ms
Time: 386.420 ms
Time: 386.841 ms
Time: 387.040 ms
Time: 387.121 ms
Time: 387.173 ms
Time: 387.196 ms
Time: 387.310 ms
Time: 387.421 ms
Time: 387.494 ms
Time: 387.548 ms
Time: 387.563 ms
Time: 388.201 ms
Time: 388.270 ms
Time: 389.499 ms
Time: 389.770 ms
Time: 389.941 ms
Time: 390.113 ms
Time: 390.347 ms
Time: 390.846 ms
Time: 392.564 ms
Time: 393.605 ms
Time: 394.892 ms
Time: 395.656 ms
Time: 396.319 ms
Time: 396.585 ms
Time: 412.060 ms
Time: 421.903 ms
Time: 477.489 ms
Time: 516.069 ms
Time: 563.814 ms
Time: 571.577 ms
Time: 584.821 ms
Time: 689.520 ms
Time: 705.267 ms
Time: 779.866 ms
Time: 841.646 ms

So at least on randomized data, the swap_cnt thing is a serious loser.
Need to run some tests on special-case inputs though.  Anyone have a
test suite they like?
        regards, tom lane


Re: qsort, once again

From
"Dann Corbit"
Date:
[snip]
> So at least on randomized data, the swap_cnt thing is a serious loser.
> Need to run some tests on special-case inputs though.  Anyone have a
> test suite they like?
>
>             regards, tom lane

Here is a distribution maker that will create some torture tests for
sorting programs.

Lacks a main() function.

Attachment

Re: qsort, once again

From
"Dann Corbit"
Date:
Actually, if you compile with CREATE_DISTRIBS defined, it does define a
main() function and create sample distributions.

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> owner@postgresql.org] On Behalf Of Dann Corbit
> Sent: Thursday, March 16, 2006 1:28 PM
> To: Tom Lane; Jonah H. Harris
> Cc: pgsql-hackers@postgresql.org; Jerry Sievers
> Subject: Re: [HACKERS] qsort, once again
>
> [snip]
> > So at least on randomized data, the swap_cnt thing is a serious
loser.
> > Need to run some tests on special-case inputs though.  Anyone have a
> > test suite they like?
> >
> >             regards, tom lane
>
> Here is a distribution maker that will create some torture tests for
> sorting programs.
>
> Lacks a main() function.


Re: qsort, once again

From
"Dann Corbit"
Date:
After you run the program with CREATE_DISTRIBS defined, you will have
the following integer distributions of 4 million entries:
03/16/2006  01:34 PM        16,000,000 trig.int
03/16/2006  01:34 PM        16,000,000 camel.int
03/16/2006  01:34 PM        16,000,000 reverse.int
03/16/2006  01:34 PM        16,000,000 random.int
03/16/2006  01:34 PM        16,000,000 ramp.int
03/16/2006  01:34 PM        16,000,000 twenty.int
03/16/2006  01:34 PM        16,000,000 perverse.int
03/16/2006  01:34 PM        16,000,000 ten.int
03/16/2006  01:34 PM        16,000,000 sorted.int
03/16/2006  01:34 PM        16,000,000 five.int
03/16/2006  01:34 PM        16,000,000 two.int
03/16/2006  01:34 PM        16,000,000 constant.int
03/16/2006  01:34 PM        16,000,000 fib.int

And you will have the following double distributions with 4 million
entries:
03/16/2006  01:34 PM        32,000,000 ten.dbl
03/16/2006  01:34 PM        32,000,000 sorted.dbl
03/16/2006  01:34 PM        32,000,000 fib.dbl
03/16/2006  01:34 PM        32,000,000 reverse.dbl
03/16/2006  01:34 PM        32,000,000 random.dbl
03/16/2006  01:34 PM        32,000,000 trig.dbl
03/16/2006  01:34 PM        32,000,000 ramp.dbl
03/16/2006  01:34 PM        32,000,000 constant.dbl
03/16/2006  01:34 PM        32,000,000 twenty.dbl
03/16/2006  01:34 PM        32,000,000 five.dbl
03/16/2006  01:34 PM        32,000,000 two.dbl
03/16/2006  01:34 PM        32,000,000 perverse.dbl
03/16/2006  01:34 PM        32,000,000 camel.dbl

> -----Original Message-----
> From: Dann Corbit
> Sent: Thursday, March 16, 2006 1:31 PM
> To: Dann Corbit; Tom Lane; Jonah H. Harris
> Cc: pgsql-hackers@postgresql.org; Jerry Sievers
> Subject: RE: [HACKERS] qsort, once again
>
> Actually, if you compile with CREATE_DISTRIBS defined, it does define
a
> main() function and create sample distributions.
>
> > -----Original Message-----
> > From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> > owner@postgresql.org] On Behalf Of Dann Corbit
> > Sent: Thursday, March 16, 2006 1:28 PM
> > To: Tom Lane; Jonah H. Harris
> > Cc: pgsql-hackers@postgresql.org; Jerry Sievers
> > Subject: Re: [HACKERS] qsort, once again
> >
> > [snip]
> > > So at least on randomized data, the swap_cnt thing is a serious
loser.
> > > Need to run some tests on special-case inputs though.  Anyone have
a
> > > test suite they like?
> > >
> > >             regards, tom lane
> >
> > Here is a distribution maker that will create some torture tests for
> > sorting programs.
> >
> > Lacks a main() function.


Re: qsort, once again

From
Tom Lane
Date:
>> So at least on randomized data, the swap_cnt thing is a serious loser.
>> Need to run some tests on special-case inputs though.  Anyone have a
>> test suite they like?

> Here is a distribution maker that will create some torture tests for
> sorting programs.

I fleshed out the sort tester that Bentley & McIlroy give pseudocode for
in their paper (attached below in case anyone wants to hack on it).  Not
very surprisingly, it shows the unmodified B&M algorithm as
significantly better than the BSD-lite version:

Our current BSD qsort:
distribution SAWTOOTH: max cratio 12.9259, average 0.870261 over 252 tests
distribution RAND: max cratio 1.07917, average 0.505924 over 252 tests
distribution STAGGER: max cratio 12.9259, average 1.03706 over 252 tests
distribution PLATEAU: max cratio 12.9259, average 0.632514 over 252 tests
distribution SHUFFLE: max cratio 12.9259, average 1.21631 over 252 tests
method COPY: max cratio 3.87533, average 0.666927 over 210 tests
method REVERSE: max cratio 5.6248, average 0.710284 over 210 tests
method FREVERSE: max cratio 12.9259, average 1.58323 over 210 tests
method BREVERSE: max cratio 5.72661, average 1.13674 over 210 tests
method SORT: max cratio 0.758625, average 0.350092 over 210 tests
method DITHER: max cratio 3.13417, average 0.667222 over 210 tests
Overall: average cratio 0.852415 over 1260 tests

without the swap_cnt code:
distribution SAWTOOTH: max cratio 5.6248, average 0.745818 over 252 tests
distribution RAND: max cratio 1.07917, average 0.510097 over 252 tests
distribution STAGGER: max cratio 5.6248, average 1.0494 over 252 tests
distribution PLATEAU: max cratio 3.57655, average 0.411549 over 252 tests
distribution SHUFFLE: max cratio 5.72661, average 1.05988 over 252 tests
method COPY: max cratio 3.87533, average 0.712122 over 210 tests
method REVERSE: max cratio 5.6248, average 0.751011 over 210 tests
method FREVERSE: max cratio 4.80869, average 0.690224 over 210 tests
method BREVERSE: max cratio 5.72661, average 1.13673 over 210 tests
method SORT: max cratio 0.806618, average 0.539829 over 210 tests
method DITHER: max cratio 3.13417, average 0.702174 over 210 tests
Overall: average cratio 0.755348 over 1260 tests

("cratio" is the ratio of the actual number of comparison function calls
to the theoretical expectation of N*lg2(N).)  The insertion sort
switchover is a loser for both average and worst-case measurements.

I tried Dann's distributions too, with N = 100000:

Our current BSD qsort:
dist fib: cratio 0.0694229
dist camel: cratio 0.0903228
dist constant: cratio 0.0602126
dist five: cratio 0.132288
dist ramp: cratio 4.29937
dist random: cratio 1.09286
dist reverse: cratio 0.5663
dist sorted: cratio 0.18062
dist ten: cratio 0.174781
dist twenty: cratio 0.238098
dist two: cratio 0.090365
dist perverse: cratio 0.334503
dist trig: cratio 0.679846
Overall: max cratio 4.29937, average cratio 0.616076 over 13 tests

without the swap_cnt code:
dist fib: cratio 0.0694229
dist camel: cratio 0.0903228
dist constant: cratio 0.0602126
dist five: cratio 0.132288
dist ramp: cratio 4.29937
dist random: cratio 1.09286
dist reverse: cratio 0.89184
dist sorted: cratio 0.884907
dist ten: cratio 0.174781
dist twenty: cratio 0.238098
dist two: cratio 0.090365
dist perverse: cratio 0.334503
dist trig: cratio 0.679846
Overall: max cratio 4.29937, average cratio 0.695293 over 13 tests

In this set of tests the behavior is just about identical, except for
the case of already-sorted input, where the BSD coding runs in O(N)
instead of O(N lg2 N) time.  So that evidently is why some unknown
person put in the special case.

Some further experimentation destroys my original proposal to limit the
size of subfile we'll use the swap_cnt code for: it turns out that that
eliminates the BSD code's advantage for presorted input (at least for
inputs bigger than the limit) without doing anything much in return.

So my feeling is we should just remove the swap_cnt code and return to
the original B&M algorithm.  Being much faster than expected for
presorted input doesn't justify being far slower than expected for
other inputs, IMHO.  In the context of Postgres I doubt that perfectly
sorted input shows up very often anyway.

Comments?

            regards, tom lane


Attachment

Re: qsort, once again

From
Darcy Buskermolen
Date:
On Thursday 16 March 2006 12:09, Tom Lane wrote:
> "Dann Corbit" <DCorbit@connx.com> writes:
> > I sent him  a copy
>
> Thanks.  This is really interesting: the switch to insertion sort on
> perfect pivot is simply not there in Bentley & McIlroy's paper.  So
> it was added later, and evidently not tested as carefully as it should
> have been.  At this point I'm more than half tempted to take it out
> entirely.
>
> So we still have a problem of software archaeology: who added the
> insertion sort switch to the NetBSD version, and on what grounds?

This is when that particular code was pushed in, as to why exactly, you'll 
have to ask mycroft.
http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/stdlib/qsort.c.diff?r1=1.3&r2=1.4&only_with_tag=MAIN


>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

-- 
Darcy Buskermolen
Wavefire Technologies Corp.

http://www.wavefire.com
ph: 250.717.0200
fx: 250.763.1759


Re: qsort, once again

From
Tom Lane
Date:
Darcy Buskermolen <darcy@wavefire.com> writes:
> On Thursday 16 March 2006 12:09, Tom Lane wrote:
>> So we still have a problem of software archaeology: who added the
>> insertion sort switch to the NetBSD version, and on what grounds?

> This is when that particular code was pushed in, as to why exactly, you'll 
> have to ask mycroft.
> http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/stdlib/qsort.c.diff?r1=1.3&r2=1.4&only_with_tag=MAIN

Interesting.  It looks to me like he replaced the former
vaguely-Knuth-based coding with B&M's code, but kept the insertion-
sort-after-no-swap special case that was in the previous code.  I'll
betcha he didn't test to see whether this was actually such a great
idea ...
        regards, tom lane


Re: qsort, once again

From
"Dann Corbit"
Date:
> So my feeling is we should just remove the swap_cnt code and return to
> the original B&M algorithm.  Being much faster than expected for
> presorted input doesn't justify being far slower than expected for
> other inputs, IMHO.  In the context of Postgres I doubt that perfectly
> sorted input shows up very often anyway.
>
> Comments?

Checking for presorted input is O(n).
If the input is random, an average of 3 elements will be tested.
So adding an in-order check of the data should not be too expensive.

I would benchmark several approaches and see which one is best when used
in-place.



Re: qsort, once again

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> owner@postgresql.org] On Behalf Of Dann Corbit
> Sent: Thursday, March 16, 2006 4:42 PM
> To: Tom Lane
> Cc: Jonah H. Harris; pgsql-hackers@postgresql.org; Jerry Sievers
> Subject: Re: [HACKERS] qsort, once again
>
> > So my feeling is we should just remove the swap_cnt code and return
to
> > the original B&M algorithm.  Being much faster than expected for
> > presorted input doesn't justify being far slower than expected for
> > other inputs, IMHO.  In the context of Postgres I doubt that
perfectly
> > sorted input shows up very often anyway.
> >
> > Comments?
>
> Checking for presorted input is O(n).
> If the input is random, an average of 3 elements will be tested.
> So adding an in-order check of the data should not be too expensive.
>
> I would benchmark several approaches and see which one is best when
used
> in-place.

Even if "hunks" of the input are sorted, the test is a very good idea.

Recall that we are sorting recursively and so we divide the data into
chunks.

Consider an example...

Quicksort of a field that contains Sex as 'M' for male, 'F' for female,
or NULL for unknown.

The median selection is going to pick one of 'M', 'F', or NULL.
After pass 1 of qsort we will have two partitions.  One partition will
have all of one type and the other partition will have the other two
types.

An in-order check will tell us that the monotone partition is sorted and
we are done with it.

Imagine also a table that was clustered but for which we have not
updated statistics.  Perhaps it is 98% sorted.  Checking for order in
our partitions is probably a good idea.

I think you could also get a good optimization if you are checking for
partitions and find a big section of the partition is not ordered (even
though the whole thing is not).  If you could perk the ordered size up
the tree, you could just add another partition to the merge list and
sort the unordered part.

In "C Unleashed" I call this idea partition discovery mergesort.


Re: qsort, once again

From
Tom Lane
Date:
"Dann Corbit" <DCorbit@connx.com> writes:
>> So my feeling is we should just remove the swap_cnt code and return
>> to the original B&M algorithm.

> Even if "hunks" of the input are sorted, the test is a very good idea.

Yah know, guys, Bentley and McIlroy are each smarter than any five of
us, and I'm quite certain it occurred to them to try prechecking for
sorted input.  If that case is not in their code then it's probably
because it's a net loss.  Unless you have reason to think that sorted
input is *more* common than other cases for the Postgres environment,
which is certainly a fact not in evidence.

(Bentley was my thesis adviser for awhile before he went to Bell Labs,
so my respect for him is based on direct personal experience.  McIlroy
I only know by reputation, but he's sure got a ton of that.)

> Imagine also a table that was clustered but for which we have not
> updated statistics.  Perhaps it is 98% sorted.  Checking for order in
> our partitions is probably a good idea.

If we are using the sort code rather than the recently-clustered index
for such a case, then we have problems elsewhere.  This scenario is not
a good argument that the sort code needs to be specialized to handle
this case at the expense of other cases; the place to be fixing it is
the planner or the statistics-management code.
        regards, tom lane


Re: qsort, once again

From
"Dann Corbit"
Date:
Well, my point was that it is a snap to implement and test.

It will be better, worse, or the same.

I agree that Bentley is a bloody genius.

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Thursday, March 16, 2006 9:27 PM
> To: Dann Corbit
> Cc: Jonah H. Harris; pgsql-hackers@postgresql.org; Jerry Sievers
> Subject: Re: [HACKERS] qsort, once again
>
> "Dann Corbit" <DCorbit@connx.com> writes:
> >> So my feeling is we should just remove the swap_cnt code and return
> >> to the original B&M algorithm.
>
> > Even if "hunks" of the input are sorted, the test is a very good
idea.
>
> Yah know, guys, Bentley and McIlroy are each smarter than any five of
> us, and I'm quite certain it occurred to them to try prechecking for
> sorted input.  If that case is not in their code then it's probably
> because it's a net loss.  Unless you have reason to think that sorted
> input is *more* common than other cases for the Postgres environment,
> which is certainly a fact not in evidence.
>
> (Bentley was my thesis adviser for awhile before he went to Bell Labs,
> so my respect for him is based on direct personal experience.  McIlroy
> I only know by reputation, but he's sure got a ton of that.)
>
> > Imagine also a table that was clustered but for which we have not
> > updated statistics.  Perhaps it is 98% sorted.  Checking for order
in
> > our partitions is probably a good idea.
>
> If we are using the sort code rather than the recently-clustered index
> for such a case, then we have problems elsewhere.  This scenario is
not
> a good argument that the sort code needs to be specialized to handle
> this case at the expense of other cases; the place to be fixing it is
> the planner or the statistics-management code.
>
>             regards, tom lane


Re: qsort, once again

From
Tom Lane
Date:
"Dann Corbit" <DCorbit@connx.com> writes:
> Well, my point was that it is a snap to implement and test.

Well, having done this, I have to eat my words: it does seem to be a
pretty good idea.

The following test numbers are using Bentley & McIlroy's test framework,
but modified to test only the case N=10000 rather than the four smaller
N values they originally used.  I did that because it exposes quadratic
behavior more obviously, and the variance in N made it harder to compare
comparison ratios for different cases.  I also added a "NEARSORT" test
method, which sorts the input distribution and then exchanges two
elements chosen at random.  I did that because I was concerned that
nearly sorted input would be the worst case for the presorted-input
check, as it would waste the most cycles before failing on such input.

With our existing qsort code, the results look like

distribution SAWTOOTH: max cratio 94.17, min 0.08, average 1.56 over 105 tests
distribution RAND: max cratio 1.06, min 0.08, average 0.51 over 105 tests
distribution STAGGER: max cratio 6.08, min 0.23, average 1.01 over 105 tests
distribution PLATEAU: max cratio 94.17, min 0.08, average 2.12 over 105 tests
distribution SHUFFLE: max cratio 94.17, min 0.23, average 1.92 over 105 tests
method COPY: max cratio 6.08, min 0.08, average 0.72 over 75 tests
method REVERSE: max cratio 5.34, min 0.08, average 0.69 over 75 tests
method FREVERSE: max cratio 94.17, min 0.08, average 5.71 over 75 tests
method BREVERSE: max cratio 3.86, min 0.08, average 1.41 over 75 tests
method SORT: max cratio 0.82, min 0.08, average 0.31 over 75 tests
method NEARSORT: max cratio 0.82, min 0.08, average 0.36 over 75 tests
method DITHER: max cratio 5.52, min 0.18, average 0.77 over 75 tests
Overall: average cratio 1.42 over 525 tests

("cratio" is the ratio of the actual number of comparison function calls
to the theoretical expectation, N log2(N))

That's pretty awful: there are several test cases that make it use
nearly 100 times the expected number of comparisons.

Removing the swap_cnt test to bring it close to B&M's original
recommendations, we get

distribution SAWTOOTH: max cratio 3.85, min 0.08, average 0.70 over 105 tests
distribution RAND: max cratio 1.06, min 0.08, average 0.52 over 105 tests
distribution STAGGER: max cratio 6.08, min 0.58, average 1.12 over 105 tests
distribution PLATEAU: max cratio 3.70, min 0.08, average 0.34 over 105 tests
distribution SHUFFLE: max cratio 3.86, min 0.86, average 1.24 over 105 tests
method COPY: max cratio 6.08, min 0.08, average 0.76 over 75 tests
method REVERSE: max cratio 5.34, min 0.08, average 0.75 over 75 tests
method FREVERSE: max cratio 4.56, min 0.08, average 0.73 over 75 tests
method BREVERSE: max cratio 3.86, min 0.08, average 1.41 over 75 tests
method SORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests
method NEARSORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests
method DITHER: max cratio 3.73, min 0.18, average 0.72 over 75 tests
Overall: average cratio 0.78 over 525 tests

which is a whole lot better as to both average and worst cases.

I then added some code to check for presorted input (just after the
n<7 insertion sort code):

#ifdef CHECK_SORTED
    presorted = 1;
    for (pm = (char *) a + es; pm < (char *) a + n * es; pm += es)
    {
        if (cmp(pm - es, pm) > 0)
        {
            presorted = 0;
            break;
        }
    }
    if (presorted)
        return;
#endif

This gives

distribution SAWTOOTH: max cratio 3.88, min 0.08, average 0.62 over 105 tests
distribution RAND: max cratio 1.06, min 0.08, average 0.46 over 105 tests
distribution STAGGER: max cratio 6.15, min 0.08, average 0.98 over 105 tests
distribution PLATEAU: max cratio 3.79, min 0.08, average 0.31 over 105 tests
distribution SHUFFLE: max cratio 3.91, min 0.08, average 1.09 over 105 tests
method COPY: max cratio 6.15, min 0.08, average 0.72 over 75 tests
method REVERSE: max cratio 5.34, min 0.08, average 0.76 over 75 tests
method FREVERSE: max cratio 4.58, min 0.08, average 0.73 over 75 tests
method BREVERSE: max cratio 3.91, min 0.08, average 1.44 over 75 tests
method SORT: max cratio 0.08, min 0.08, average 0.08 over 75 tests
method NEARSORT: max cratio 0.89, min 0.08, average 0.39 over 75 tests
method DITHER: max cratio 3.73, min 0.18, average 0.72 over 75 tests
Overall: average cratio 0.69 over 525 tests

So the worst case seems only very marginally worse, and there is a
definite improvement in the average case, even for inputs that aren't
entirely sorted.  Importantly, the "near sorted" case that I thought
might send it into quadratic behavior doesn't seem to do that.

So, unless anyone wants to do further testing, I'll go ahead and commit
these changes.

            regards, tom lane

PS: Just as a comparison point, here are the results when testing HPUX's
library qsort:

distribution SAWTOOTH: max cratio 7.00, min 0.08, average 0.76 over 105 tests
distribution RAND: max cratio 1.11, min 0.08, average 0.53 over 105 tests
distribution STAGGER: max cratio 7.05, min 0.58, average 1.24 over 105 tests
distribution PLATEAU: max cratio 7.00, min 0.08, average 0.43 over 105 tests
distribution SHUFFLE: max cratio 7.00, min 0.86, average 1.54 over 105 tests
method COPY: max cratio 6.70, min 0.08, average 0.79 over 75 tests
method REVERSE: max cratio 7.05, min 0.08, average 0.78 over 75 tests
method FREVERSE: max cratio 7.00, min 0.08, average 0.77 over 75 tests
method BREVERSE: max cratio 7.00, min 0.08, average 2.11 over 75 tests
method SORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests
method NEARSORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests
method DITHER: max cratio 4.06, min 0.16, average 0.74 over 75 tests
Overall: average cratio 0.90 over 525 tests

and here are the results using glibc's qsort, which of course isn't
quicksort at all but some kind of merge sort:

distribution SAWTOOTH: max cratio 0.90, min 0.49, average 0.65 over 105 tests
distribution RAND: max cratio 0.91, min 0.49, average 0.76 over 105 tests
distribution STAGGER: max cratio 0.92, min 0.49, average 0.70 over 105 tests
distribution PLATEAU: max cratio 0.84, min 0.49, average 0.54 over 105 tests
distribution SHUFFLE: max cratio 0.64, min 0.49, average 0.52 over 105 tests
method COPY: max cratio 0.92, min 0.49, average 0.66 over 75 tests
method REVERSE: max cratio 0.92, min 0.49, average 0.68 over 75 tests
method FREVERSE: max cratio 0.92, min 0.49, average 0.67 over 75 tests
method BREVERSE: max cratio 0.92, min 0.49, average 0.68 over 75 tests
method SORT: max cratio 0.49, min 0.49, average 0.49 over 75 tests
method NEARSORT: max cratio 0.55, min 0.49, average 0.51 over 75 tests
method DITHER: max cratio 0.92, min 0.50, average 0.74 over 75 tests
Overall: average cratio 0.63 over 525 tests

PPS: final version of test framework attached for the archives.


Attachment

Re: qsort, once again

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> and here are the results using glibc's qsort, which of course isn't
> quicksort at all but some kind of merge sort:
> ...
> Overall: average cratio 0.63 over 525 tests

That looks better both on average and in the worst case. Are the time
constants that much worse that the merge sort still takes longer?

-- 
greg



Re: qsort, once again

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> That looks better both on average and in the worst case. Are the time
> constants that much worse that the merge sort still takes longer?

Keep in mind that this is only counting the number of
comparison-function calls; it's not accounting for any other effects.
In particular, for a large sort operation quicksort might win because of
its more cache-friendly memory access patterns.

The whole question of our qsort vs the system library's qsort probably
needs to be revisited, however, now that we've identified and fixed this
particular performance issue.
        regards, tom lane


Re: qsort, once again

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > That looks better both on average and in the worst case. Are the time
> > constants that much worse that the merge sort still takes longer?
> 
> Keep in mind that this is only counting the number of
> comparison-function calls; it's not accounting for any other effects.
> In particular, for a large sort operation quicksort might win because of
> its more cache-friendly memory access patterns.

My question explicitly recognized that possibility. I'm just a little
skeptical since the comparison function in Postgres is often not some simple
bit of tightly optimized C code, but rather a complex locale sensitive
comparison function or even a bit of SQL expression to evaluate.

Cache effectiveness is may be a minimal factor anyways when the comparison is
executing more than a minimal amount of code. And one extra comparison is
going to cost a lot more too.

-- 
greg



Re: qsort, once again

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> My question explicitly recognized that possibility. I'm just a little
> skeptical since the comparison function in Postgres is often not some simple
> bit of tightly optimized C code, but rather a complex locale sensitive
> comparison function or even a bit of SQL expression to evaluate.

Yeah, I'd guess the same way, but OTOH at least a few people have
reported that our qsort code is consistently faster than glibc's (and
that was before this fix).  See this thread:
http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php

Currently I believe that we only use our qsort on Solaris, not any other
platform, so if you think that glibc's qsort is better then you've
already got your wish.  It seems to need more investigation though.
In particular, I'm thinking that the various adjustments we've made
to the sort support code over the past month probably invalidate any
previous testing of the point, and that we ought to go back and redo
those comparisons.
        regards, tom lane