Thread: qsort, once again
I was just looking at the behavior of src/port/qsort.c on the test case that Jerry Sievers was complaining about in pgsql-admin this morning. I found out what the real weak spot is: it's got nothing directly to do with good or bad pivots, it's this code right here: if (swap_cnt == 0) { /* Switch to insertion sort */ for (pm = (char *) a + es; pm <(char *) a + n * es; pm += es) for (pl = pm; pl > (char *) a && cmp(pl - es, pl) > 0; pl -= es) swap(pl, pl - es); return; } In other words, if qsort hits a subfile for which the chosen pivot is a perfect pivot (no swaps are necessary), it switches to insertion sort. Which is O(N^2). In Jerry's test case this happens on a subfile of 736357 elements, and you can say goodnight to that process .... What I'm thinking is that we ought to have a limit on this, ie not switch to insertion sort if n is larger than 1000 or so, ie - if (swap_cnt == 0) + if (swap_cnt == 0 && n < 1000) I'm wondering what the authors were expecting the insertion sort to handle exactly. Does anyone have a copy of the paper that's referenced in the code comment? /** Qsort routine from Bentley & McIlroy's "Engineering a Sort Function".*/ I tried looking for this at ACM but they seem not to have it. regards, tom lane
I'm wondering what the authors were expecting the insertion sort to
handle exactly. Does anyone have a copy of the paper that's referenced
in the code comment?
/*
* Qsort routine from Bentley & McIlroy's "Engineering a Sort Function".
*/
Yes, I have it somewhere, let me dig it up for ya.
--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324
I sent him a copy
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Jonah H. Harris
Sent: Thursday, March 16, 2006 11:43 AM
To: Tom Lane
Cc: pgsql-hackers@postgresql.org; Jerry Sievers
Subject: Re: [HACKERS] qsort, once again
On 3/16/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm wondering what the authors were expecting the insertion sort to
handle exactly. Does anyone have a copy of the paper that's referenced
in the code comment?
/*
* Qsort routine from Bentley & McIlroy's "Engineering a Sort Function".
*/
Yes, I have it somewhere, let me dig it up for ya.
--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324
I sent him a copy
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Jonah H. Harris
Sent: Thursday, March 16, 2006 11:43 AM
To: Tom Lane
Cc: pgsql-hackers@postgresql.org; Jerry Sievers
Subject: Re: [HACKERS] qsort, once again
On 3/16/06, Tom Lane < tgl@sss.pgh.pa.us> wrote:
I'm wondering what the authors were expecting the insertion sort to
handle exactly. Does anyone have a copy of the paper that's referenced
in the code comment?
/*
* Qsort routine from Bentley & McIlroy's "Engineering a Sort Function".
*/
Yes, I have it somewhere, let me dig it up for ya.
--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324
--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324
"Jonah H. Harris" <jonah.harris@gmail.com> writes: > doh! > On 3/16/06, Dann Corbit <DCorbit@connx.com> wrote: >> I sent him a copy Got it, thanks guys ... regards, tom lane
"Jonah H. Harris" <jonah.harris@gmail.com> writes:
> doh!
Sorry for the duplicate (again) Tom... I wish Gmail would auto-reply to all. grr. Anyway, this is copied for others and the archives:
There's a program I had used in the past to test qsort implementations on McIlroy's site:
http://www.cs.dartmouth.edu/~doug/aqsort.c
Just thought I'd share it since we seem to be talking sorts a lot :)
--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324
"Dann Corbit" <DCorbit@connx.com> writes: > I sent him a copy Thanks. This is really interesting: the switch to insertion sort on perfect pivot is simply not there in Bentley & McIlroy's paper. So it was added later, and evidently not tested as carefully as it should have been. At this point I'm more than half tempted to take it out entirely. So we still have a problem of software archaeology: who added the insertion sort switch to the NetBSD version, and on what grounds? regards, tom lane
I have seen a lot of dumb "fixes" to sort routines. In a commercial sort function I saw some time ago, the check for a condition that causes qsort to go quadratic was removed. There was a comment there that said the engineer "didn't see any improvement in performance". Of course, the check was not there to improve performance but to prevent disaster. Obviously, he failed to check some very important data sets. If a sort routine cannot handle (among other things): 1. Already sorted 2. Almost sorted 3. Reverse sorted 4. Organ pipe ( the inspiration for "Engineering a Sort Function" ) 5. Small number of distinct values Then it will cause problems in real-life use. > -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: Thursday, March 16, 2006 12:09 PM > To: Dann Corbit > Cc: Jonah H. Harris; pgsql-hackers@postgresql.org; Jerry Sievers > Subject: Re: [HACKERS] qsort, once again > > "Dann Corbit" <DCorbit@connx.com> writes: > > I sent him a copy > > Thanks. This is really interesting: the switch to insertion sort on > perfect pivot is simply not there in Bentley & McIlroy's paper. So > it was added later, and evidently not tested as carefully as it should > have been. At this point I'm more than half tempted to take it out > entirely. > > So we still have a problem of software archaeology: who added the > insertion sort switch to the NetBSD version, and on what grounds? > > regards, tom lane
So we still have a problem of software archaeology: who added the
insertion sort switch to the NetBSD version, and on what grounds?
AFAICS, the insertion sort was added in BSD 4.4-lite and was inherited by NetBSD in CVS version 1.1.1.2.
The previous version in NetBSD (before 4.4-lite) also included an insertion sort with the comment:
/*
* Knuth, Vol. 3, page 116, Algorithm Q, step b, argues that a single pass
* of straight insertion sort after partitioning is complete is better than
* sorting each small partition as it is created. This isn't correct in this
* implementation because comparisons require at least one (and often two)
* function calls and are likely to be the dominating expense of the sort.
* Doing a final insertion sort does more comparisons than are necessary
* because it compares the "edges" and medians of the partitions which are
* known to be already sorted.
*
* This is also the reasoning behind selecting a small THRESH value (see
* Knuth, page 122, equation 26), since the quicksort algorithm does less
* comparisons than the insertion sort.
*/
--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324
"Jonah H. Harris" <jonah.harris@gmail.com> writes: > On 3/16/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> So we still have a problem of software archaeology: who added the >> insertion sort switch to the NetBSD version, and on what grounds? > AFAICS, the insertion sort was added in BSD 4.4-lite and was inherited by > NetBSD in CVS version 1.1.1.2. > The previous version in NetBSD (before 4.4-lite) also included an insertion > sort with the comment: > /* > * Knuth, Vol. 3, page 116, Algorithm Q, step b, argues that a single pass > * of straight insertion sort after partitioning is complete is better than > * sorting each small partition as it is created. This isn't correct in th= > is > * implementation because comparisons require at least one (and often two) > * function calls and are likely to be the dominating expense of the sort. > * Doing a final insertion sort does more comparisons than are necessary > * because it compares the "edges" and medians of the partitions which are > * known to be already sorted. I think this is talking about replacing the bottom-level insertion sorts (the "if n < 7" case at the top of the function) with a single insertion-sort pass over the whole array at the end of the recursion. Not really the same consideration. Bentley & McIlroy specifically reject that idea for the same reasons mentioned in this comment, but I see no discussion in the paper of the swap_cnt idea in the BSD code. I just repeated the random-data test I made here: http://archives.postgresql.org/pgsql-hackers/2006-02/msg00590.php using a version of qsort.c with the "if (swap_cnt == 0)" segment removed entirely (and hence also removing the computation of swap_cnt, which saves a cycle or two in the inner loops). CVS tip seems to be faster at this than what we had on Feb 15, so I also repeated the test using the unmodified qsort.c. Here's the results: 100 runs with CVS-tip qsort.c, sorted into ascending runtime order: Time: 337.520 ms Time: 338.335 ms Time: 338.990 ms Time: 339.116 ms Time: 339.282 ms Time: 339.398 ms Time: 339.457 ms Time: 339.623 ms Time: 339.880 ms Time: 339.929 ms Time: 340.258 ms Time: 340.266 ms Time: 340.431 ms Time: 341.042 ms Time: 341.053 ms Time: 341.224 ms Time: 341.506 ms Time: 341.851 ms Time: 343.298 ms Time: 344.750 ms Time: 347.685 ms Time: 350.444 ms Time: 354.230 ms Time: 361.638 ms Time: 383.529 ms Time: 386.735 ms Time: 389.246 ms Time: 389.675 ms Time: 395.183 ms Time: 406.176 ms Time: 408.327 ms Time: 444.531 ms Time: 507.625 ms Time: 513.625 ms Time: 527.424 ms Time: 530.318 ms Time: 537.642 ms Time: 549.844 ms Time: 554.505 ms Time: 561.502 ms Time: 590.159 ms Time: 606.344 ms Time: 661.434 ms Time: 686.666 ms Time: 697.818 ms Time: 717.949 ms Time: 786.298 ms Time: 843.092 ms Time: 855.292 ms Time: 869.975 ms Time: 874.331 ms Time: 887.904 ms Time: 1027.981 ms Time: 1053.258 ms Time: 1101.546 ms Time: 1304.076 ms Time: 1318.657 ms Time: 1743.875 ms Time: 1986.987 ms Time: 2045.737 ms Time: 2192.382 ms Time: 2263.969 ms Time: 2328.861 ms Time: 2381.596 ms Time: 2389.563 ms Time: 2452.888 ms Time: 2503.476 ms Time: 2575.836 ms Time: 2597.095 ms Time: 2645.894 ms Time: 2686.030 ms Time: 2932.877 ms Time: 3009.602 ms Time: 3077.843 ms Time: 3106.678 ms Time: 3213.136 ms Time: 3279.999 ms Time: 3674.831 ms Time: 3707.489 ms Time: 3765.390 ms Time: 3782.216 ms Time: 3972.455 ms Time: 3974.928 ms Time: 4344.905 ms Time: 4800.817 ms Time: 4837.485 ms Time: 4944.734 ms Time: 5172.095 ms Time: 5434.390 ms Time: 5599.830 ms Time: 5608.934 ms Time: 5684.670 ms Time: 5740.895 ms Time: 6112.534 ms Time: 7521.372 ms Time: 7609.963 ms Time: 7636.002 ms Time: 8090.015 ms Time: 8805.475 ms Time: 9244.463 ms 100 runs with modified qsort.c: Time: 378.068 ms Time: 379.056 ms Time: 379.485 ms Time: 379.486 ms Time: 379.612 ms Time: 379.832 ms Time: 379.981 ms Time: 380.643 ms Time: 380.824 ms Time: 380.870 ms Time: 380.920 ms Time: 381.215 ms Time: 381.441 ms Time: 381.475 ms Time: 381.848 ms Time: 381.895 ms Time: 381.969 ms Time: 381.993 ms Time: 382.034 ms Time: 382.202 ms Time: 382.361 ms Time: 382.585 ms Time: 382.692 ms Time: 382.734 ms Time: 382.929 ms Time: 382.985 ms Time: 383.107 ms Time: 383.335 ms Time: 383.505 ms Time: 383.514 ms Time: 383.635 ms Time: 383.674 ms Time: 383.704 ms Time: 383.744 ms Time: 383.821 ms Time: 383.832 ms Time: 383.918 ms Time: 383.922 ms Time: 384.125 ms Time: 384.127 ms Time: 384.394 ms Time: 384.500 ms Time: 384.517 ms Time: 384.517 ms Time: 384.590 ms Time: 384.615 ms Time: 384.620 ms Time: 384.678 ms Time: 384.705 ms Time: 384.715 ms Time: 384.786 ms Time: 384.957 ms Time: 385.386 ms Time: 385.537 ms Time: 385.602 ms Time: 385.685 ms Time: 385.800 ms Time: 385.832 ms Time: 385.845 ms Time: 385.846 ms Time: 385.910 ms Time: 386.094 ms Time: 386.270 ms Time: 386.360 ms Time: 386.420 ms Time: 386.841 ms Time: 387.040 ms Time: 387.121 ms Time: 387.173 ms Time: 387.196 ms Time: 387.310 ms Time: 387.421 ms Time: 387.494 ms Time: 387.548 ms Time: 387.563 ms Time: 388.201 ms Time: 388.270 ms Time: 389.499 ms Time: 389.770 ms Time: 389.941 ms Time: 390.113 ms Time: 390.347 ms Time: 390.846 ms Time: 392.564 ms Time: 393.605 ms Time: 394.892 ms Time: 395.656 ms Time: 396.319 ms Time: 396.585 ms Time: 412.060 ms Time: 421.903 ms Time: 477.489 ms Time: 516.069 ms Time: 563.814 ms Time: 571.577 ms Time: 584.821 ms Time: 689.520 ms Time: 705.267 ms Time: 779.866 ms Time: 841.646 ms So at least on randomized data, the swap_cnt thing is a serious loser. Need to run some tests on special-case inputs though. Anyone have a test suite they like? regards, tom lane
[snip] > So at least on randomized data, the swap_cnt thing is a serious loser. > Need to run some tests on special-case inputs though. Anyone have a > test suite they like? > > regards, tom lane Here is a distribution maker that will create some torture tests for sorting programs. Lacks a main() function.
Attachment
Actually, if you compile with CREATE_DISTRIBS defined, it does define a main() function and create sample distributions. > -----Original Message----- > From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers- > owner@postgresql.org] On Behalf Of Dann Corbit > Sent: Thursday, March 16, 2006 1:28 PM > To: Tom Lane; Jonah H. Harris > Cc: pgsql-hackers@postgresql.org; Jerry Sievers > Subject: Re: [HACKERS] qsort, once again > > [snip] > > So at least on randomized data, the swap_cnt thing is a serious loser. > > Need to run some tests on special-case inputs though. Anyone have a > > test suite they like? > > > > regards, tom lane > > Here is a distribution maker that will create some torture tests for > sorting programs. > > Lacks a main() function.
After you run the program with CREATE_DISTRIBS defined, you will have the following integer distributions of 4 million entries: 03/16/2006 01:34 PM 16,000,000 trig.int 03/16/2006 01:34 PM 16,000,000 camel.int 03/16/2006 01:34 PM 16,000,000 reverse.int 03/16/2006 01:34 PM 16,000,000 random.int 03/16/2006 01:34 PM 16,000,000 ramp.int 03/16/2006 01:34 PM 16,000,000 twenty.int 03/16/2006 01:34 PM 16,000,000 perverse.int 03/16/2006 01:34 PM 16,000,000 ten.int 03/16/2006 01:34 PM 16,000,000 sorted.int 03/16/2006 01:34 PM 16,000,000 five.int 03/16/2006 01:34 PM 16,000,000 two.int 03/16/2006 01:34 PM 16,000,000 constant.int 03/16/2006 01:34 PM 16,000,000 fib.int And you will have the following double distributions with 4 million entries: 03/16/2006 01:34 PM 32,000,000 ten.dbl 03/16/2006 01:34 PM 32,000,000 sorted.dbl 03/16/2006 01:34 PM 32,000,000 fib.dbl 03/16/2006 01:34 PM 32,000,000 reverse.dbl 03/16/2006 01:34 PM 32,000,000 random.dbl 03/16/2006 01:34 PM 32,000,000 trig.dbl 03/16/2006 01:34 PM 32,000,000 ramp.dbl 03/16/2006 01:34 PM 32,000,000 constant.dbl 03/16/2006 01:34 PM 32,000,000 twenty.dbl 03/16/2006 01:34 PM 32,000,000 five.dbl 03/16/2006 01:34 PM 32,000,000 two.dbl 03/16/2006 01:34 PM 32,000,000 perverse.dbl 03/16/2006 01:34 PM 32,000,000 camel.dbl > -----Original Message----- > From: Dann Corbit > Sent: Thursday, March 16, 2006 1:31 PM > To: Dann Corbit; Tom Lane; Jonah H. Harris > Cc: pgsql-hackers@postgresql.org; Jerry Sievers > Subject: RE: [HACKERS] qsort, once again > > Actually, if you compile with CREATE_DISTRIBS defined, it does define a > main() function and create sample distributions. > > > -----Original Message----- > > From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers- > > owner@postgresql.org] On Behalf Of Dann Corbit > > Sent: Thursday, March 16, 2006 1:28 PM > > To: Tom Lane; Jonah H. Harris > > Cc: pgsql-hackers@postgresql.org; Jerry Sievers > > Subject: Re: [HACKERS] qsort, once again > > > > [snip] > > > So at least on randomized data, the swap_cnt thing is a serious loser. > > > Need to run some tests on special-case inputs though. Anyone have a > > > test suite they like? > > > > > > regards, tom lane > > > > Here is a distribution maker that will create some torture tests for > > sorting programs. > > > > Lacks a main() function.
>> So at least on randomized data, the swap_cnt thing is a serious loser. >> Need to run some tests on special-case inputs though. Anyone have a >> test suite they like? > Here is a distribution maker that will create some torture tests for > sorting programs. I fleshed out the sort tester that Bentley & McIlroy give pseudocode for in their paper (attached below in case anyone wants to hack on it). Not very surprisingly, it shows the unmodified B&M algorithm as significantly better than the BSD-lite version: Our current BSD qsort: distribution SAWTOOTH: max cratio 12.9259, average 0.870261 over 252 tests distribution RAND: max cratio 1.07917, average 0.505924 over 252 tests distribution STAGGER: max cratio 12.9259, average 1.03706 over 252 tests distribution PLATEAU: max cratio 12.9259, average 0.632514 over 252 tests distribution SHUFFLE: max cratio 12.9259, average 1.21631 over 252 tests method COPY: max cratio 3.87533, average 0.666927 over 210 tests method REVERSE: max cratio 5.6248, average 0.710284 over 210 tests method FREVERSE: max cratio 12.9259, average 1.58323 over 210 tests method BREVERSE: max cratio 5.72661, average 1.13674 over 210 tests method SORT: max cratio 0.758625, average 0.350092 over 210 tests method DITHER: max cratio 3.13417, average 0.667222 over 210 tests Overall: average cratio 0.852415 over 1260 tests without the swap_cnt code: distribution SAWTOOTH: max cratio 5.6248, average 0.745818 over 252 tests distribution RAND: max cratio 1.07917, average 0.510097 over 252 tests distribution STAGGER: max cratio 5.6248, average 1.0494 over 252 tests distribution PLATEAU: max cratio 3.57655, average 0.411549 over 252 tests distribution SHUFFLE: max cratio 5.72661, average 1.05988 over 252 tests method COPY: max cratio 3.87533, average 0.712122 over 210 tests method REVERSE: max cratio 5.6248, average 0.751011 over 210 tests method FREVERSE: max cratio 4.80869, average 0.690224 over 210 tests method BREVERSE: max cratio 5.72661, average 1.13673 over 210 tests method SORT: max cratio 0.806618, average 0.539829 over 210 tests method DITHER: max cratio 3.13417, average 0.702174 over 210 tests Overall: average cratio 0.755348 over 1260 tests ("cratio" is the ratio of the actual number of comparison function calls to the theoretical expectation of N*lg2(N).) The insertion sort switchover is a loser for both average and worst-case measurements. I tried Dann's distributions too, with N = 100000: Our current BSD qsort: dist fib: cratio 0.0694229 dist camel: cratio 0.0903228 dist constant: cratio 0.0602126 dist five: cratio 0.132288 dist ramp: cratio 4.29937 dist random: cratio 1.09286 dist reverse: cratio 0.5663 dist sorted: cratio 0.18062 dist ten: cratio 0.174781 dist twenty: cratio 0.238098 dist two: cratio 0.090365 dist perverse: cratio 0.334503 dist trig: cratio 0.679846 Overall: max cratio 4.29937, average cratio 0.616076 over 13 tests without the swap_cnt code: dist fib: cratio 0.0694229 dist camel: cratio 0.0903228 dist constant: cratio 0.0602126 dist five: cratio 0.132288 dist ramp: cratio 4.29937 dist random: cratio 1.09286 dist reverse: cratio 0.89184 dist sorted: cratio 0.884907 dist ten: cratio 0.174781 dist twenty: cratio 0.238098 dist two: cratio 0.090365 dist perverse: cratio 0.334503 dist trig: cratio 0.679846 Overall: max cratio 4.29937, average cratio 0.695293 over 13 tests In this set of tests the behavior is just about identical, except for the case of already-sorted input, where the BSD coding runs in O(N) instead of O(N lg2 N) time. So that evidently is why some unknown person put in the special case. Some further experimentation destroys my original proposal to limit the size of subfile we'll use the swap_cnt code for: it turns out that that eliminates the BSD code's advantage for presorted input (at least for inputs bigger than the limit) without doing anything much in return. So my feeling is we should just remove the swap_cnt code and return to the original B&M algorithm. Being much faster than expected for presorted input doesn't justify being far slower than expected for other inputs, IMHO. In the context of Postgres I doubt that perfectly sorted input shows up very often anyway. Comments? regards, tom lane
Attachment
On Thursday 16 March 2006 12:09, Tom Lane wrote: > "Dann Corbit" <DCorbit@connx.com> writes: > > I sent him a copy > > Thanks. This is really interesting: the switch to insertion sort on > perfect pivot is simply not there in Bentley & McIlroy's paper. So > it was added later, and evidently not tested as carefully as it should > have been. At this point I'm more than half tempted to take it out > entirely. > > So we still have a problem of software archaeology: who added the > insertion sort switch to the NetBSD version, and on what grounds? This is when that particular code was pushed in, as to why exactly, you'll have to ask mycroft. http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/stdlib/qsort.c.diff?r1=1.3&r2=1.4&only_with_tag=MAIN > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings -- Darcy Buskermolen Wavefire Technologies Corp. http://www.wavefire.com ph: 250.717.0200 fx: 250.763.1759
Darcy Buskermolen <darcy@wavefire.com> writes: > On Thursday 16 March 2006 12:09, Tom Lane wrote: >> So we still have a problem of software archaeology: who added the >> insertion sort switch to the NetBSD version, and on what grounds? > This is when that particular code was pushed in, as to why exactly, you'll > have to ask mycroft. > http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/stdlib/qsort.c.diff?r1=1.3&r2=1.4&only_with_tag=MAIN Interesting. It looks to me like he replaced the former vaguely-Knuth-based coding with B&M's code, but kept the insertion- sort-after-no-swap special case that was in the previous code. I'll betcha he didn't test to see whether this was actually such a great idea ... regards, tom lane
> So my feeling is we should just remove the swap_cnt code and return to > the original B&M algorithm. Being much faster than expected for > presorted input doesn't justify being far slower than expected for > other inputs, IMHO. In the context of Postgres I doubt that perfectly > sorted input shows up very often anyway. > > Comments? Checking for presorted input is O(n). If the input is random, an average of 3 elements will be tested. So adding an in-order check of the data should not be too expensive. I would benchmark several approaches and see which one is best when used in-place.
> -----Original Message----- > From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers- > owner@postgresql.org] On Behalf Of Dann Corbit > Sent: Thursday, March 16, 2006 4:42 PM > To: Tom Lane > Cc: Jonah H. Harris; pgsql-hackers@postgresql.org; Jerry Sievers > Subject: Re: [HACKERS] qsort, once again > > > So my feeling is we should just remove the swap_cnt code and return to > > the original B&M algorithm. Being much faster than expected for > > presorted input doesn't justify being far slower than expected for > > other inputs, IMHO. In the context of Postgres I doubt that perfectly > > sorted input shows up very often anyway. > > > > Comments? > > Checking for presorted input is O(n). > If the input is random, an average of 3 elements will be tested. > So adding an in-order check of the data should not be too expensive. > > I would benchmark several approaches and see which one is best when used > in-place. Even if "hunks" of the input are sorted, the test is a very good idea. Recall that we are sorting recursively and so we divide the data into chunks. Consider an example... Quicksort of a field that contains Sex as 'M' for male, 'F' for female, or NULL for unknown. The median selection is going to pick one of 'M', 'F', or NULL. After pass 1 of qsort we will have two partitions. One partition will have all of one type and the other partition will have the other two types. An in-order check will tell us that the monotone partition is sorted and we are done with it. Imagine also a table that was clustered but for which we have not updated statistics. Perhaps it is 98% sorted. Checking for order in our partitions is probably a good idea. I think you could also get a good optimization if you are checking for partitions and find a big section of the partition is not ordered (even though the whole thing is not). If you could perk the ordered size up the tree, you could just add another partition to the merge list and sort the unordered part. In "C Unleashed" I call this idea partition discovery mergesort.
"Dann Corbit" <DCorbit@connx.com> writes: >> So my feeling is we should just remove the swap_cnt code and return >> to the original B&M algorithm. > Even if "hunks" of the input are sorted, the test is a very good idea. Yah know, guys, Bentley and McIlroy are each smarter than any five of us, and I'm quite certain it occurred to them to try prechecking for sorted input. If that case is not in their code then it's probably because it's a net loss. Unless you have reason to think that sorted input is *more* common than other cases for the Postgres environment, which is certainly a fact not in evidence. (Bentley was my thesis adviser for awhile before he went to Bell Labs, so my respect for him is based on direct personal experience. McIlroy I only know by reputation, but he's sure got a ton of that.) > Imagine also a table that was clustered but for which we have not > updated statistics. Perhaps it is 98% sorted. Checking for order in > our partitions is probably a good idea. If we are using the sort code rather than the recently-clustered index for such a case, then we have problems elsewhere. This scenario is not a good argument that the sort code needs to be specialized to handle this case at the expense of other cases; the place to be fixing it is the planner or the statistics-management code. regards, tom lane
Well, my point was that it is a snap to implement and test. It will be better, worse, or the same. I agree that Bentley is a bloody genius. > -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: Thursday, March 16, 2006 9:27 PM > To: Dann Corbit > Cc: Jonah H. Harris; pgsql-hackers@postgresql.org; Jerry Sievers > Subject: Re: [HACKERS] qsort, once again > > "Dann Corbit" <DCorbit@connx.com> writes: > >> So my feeling is we should just remove the swap_cnt code and return > >> to the original B&M algorithm. > > > Even if "hunks" of the input are sorted, the test is a very good idea. > > Yah know, guys, Bentley and McIlroy are each smarter than any five of > us, and I'm quite certain it occurred to them to try prechecking for > sorted input. If that case is not in their code then it's probably > because it's a net loss. Unless you have reason to think that sorted > input is *more* common than other cases for the Postgres environment, > which is certainly a fact not in evidence. > > (Bentley was my thesis adviser for awhile before he went to Bell Labs, > so my respect for him is based on direct personal experience. McIlroy > I only know by reputation, but he's sure got a ton of that.) > > > Imagine also a table that was clustered but for which we have not > > updated statistics. Perhaps it is 98% sorted. Checking for order in > > our partitions is probably a good idea. > > If we are using the sort code rather than the recently-clustered index > for such a case, then we have problems elsewhere. This scenario is not > a good argument that the sort code needs to be specialized to handle > this case at the expense of other cases; the place to be fixing it is > the planner or the statistics-management code. > > regards, tom lane
"Dann Corbit" <DCorbit@connx.com> writes: > Well, my point was that it is a snap to implement and test. Well, having done this, I have to eat my words: it does seem to be a pretty good idea. The following test numbers are using Bentley & McIlroy's test framework, but modified to test only the case N=10000 rather than the four smaller N values they originally used. I did that because it exposes quadratic behavior more obviously, and the variance in N made it harder to compare comparison ratios for different cases. I also added a "NEARSORT" test method, which sorts the input distribution and then exchanges two elements chosen at random. I did that because I was concerned that nearly sorted input would be the worst case for the presorted-input check, as it would waste the most cycles before failing on such input. With our existing qsort code, the results look like distribution SAWTOOTH: max cratio 94.17, min 0.08, average 1.56 over 105 tests distribution RAND: max cratio 1.06, min 0.08, average 0.51 over 105 tests distribution STAGGER: max cratio 6.08, min 0.23, average 1.01 over 105 tests distribution PLATEAU: max cratio 94.17, min 0.08, average 2.12 over 105 tests distribution SHUFFLE: max cratio 94.17, min 0.23, average 1.92 over 105 tests method COPY: max cratio 6.08, min 0.08, average 0.72 over 75 tests method REVERSE: max cratio 5.34, min 0.08, average 0.69 over 75 tests method FREVERSE: max cratio 94.17, min 0.08, average 5.71 over 75 tests method BREVERSE: max cratio 3.86, min 0.08, average 1.41 over 75 tests method SORT: max cratio 0.82, min 0.08, average 0.31 over 75 tests method NEARSORT: max cratio 0.82, min 0.08, average 0.36 over 75 tests method DITHER: max cratio 5.52, min 0.18, average 0.77 over 75 tests Overall: average cratio 1.42 over 525 tests ("cratio" is the ratio of the actual number of comparison function calls to the theoretical expectation, N log2(N)) That's pretty awful: there are several test cases that make it use nearly 100 times the expected number of comparisons. Removing the swap_cnt test to bring it close to B&M's original recommendations, we get distribution SAWTOOTH: max cratio 3.85, min 0.08, average 0.70 over 105 tests distribution RAND: max cratio 1.06, min 0.08, average 0.52 over 105 tests distribution STAGGER: max cratio 6.08, min 0.58, average 1.12 over 105 tests distribution PLATEAU: max cratio 3.70, min 0.08, average 0.34 over 105 tests distribution SHUFFLE: max cratio 3.86, min 0.86, average 1.24 over 105 tests method COPY: max cratio 6.08, min 0.08, average 0.76 over 75 tests method REVERSE: max cratio 5.34, min 0.08, average 0.75 over 75 tests method FREVERSE: max cratio 4.56, min 0.08, average 0.73 over 75 tests method BREVERSE: max cratio 3.86, min 0.08, average 1.41 over 75 tests method SORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests method NEARSORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests method DITHER: max cratio 3.73, min 0.18, average 0.72 over 75 tests Overall: average cratio 0.78 over 525 tests which is a whole lot better as to both average and worst cases. I then added some code to check for presorted input (just after the n<7 insertion sort code): #ifdef CHECK_SORTED presorted = 1; for (pm = (char *) a + es; pm < (char *) a + n * es; pm += es) { if (cmp(pm - es, pm) > 0) { presorted = 0; break; } } if (presorted) return; #endif This gives distribution SAWTOOTH: max cratio 3.88, min 0.08, average 0.62 over 105 tests distribution RAND: max cratio 1.06, min 0.08, average 0.46 over 105 tests distribution STAGGER: max cratio 6.15, min 0.08, average 0.98 over 105 tests distribution PLATEAU: max cratio 3.79, min 0.08, average 0.31 over 105 tests distribution SHUFFLE: max cratio 3.91, min 0.08, average 1.09 over 105 tests method COPY: max cratio 6.15, min 0.08, average 0.72 over 75 tests method REVERSE: max cratio 5.34, min 0.08, average 0.76 over 75 tests method FREVERSE: max cratio 4.58, min 0.08, average 0.73 over 75 tests method BREVERSE: max cratio 3.91, min 0.08, average 1.44 over 75 tests method SORT: max cratio 0.08, min 0.08, average 0.08 over 75 tests method NEARSORT: max cratio 0.89, min 0.08, average 0.39 over 75 tests method DITHER: max cratio 3.73, min 0.18, average 0.72 over 75 tests Overall: average cratio 0.69 over 525 tests So the worst case seems only very marginally worse, and there is a definite improvement in the average case, even for inputs that aren't entirely sorted. Importantly, the "near sorted" case that I thought might send it into quadratic behavior doesn't seem to do that. So, unless anyone wants to do further testing, I'll go ahead and commit these changes. regards, tom lane PS: Just as a comparison point, here are the results when testing HPUX's library qsort: distribution SAWTOOTH: max cratio 7.00, min 0.08, average 0.76 over 105 tests distribution RAND: max cratio 1.11, min 0.08, average 0.53 over 105 tests distribution STAGGER: max cratio 7.05, min 0.58, average 1.24 over 105 tests distribution PLATEAU: max cratio 7.00, min 0.08, average 0.43 over 105 tests distribution SHUFFLE: max cratio 7.00, min 0.86, average 1.54 over 105 tests method COPY: max cratio 6.70, min 0.08, average 0.79 over 75 tests method REVERSE: max cratio 7.05, min 0.08, average 0.78 over 75 tests method FREVERSE: max cratio 7.00, min 0.08, average 0.77 over 75 tests method BREVERSE: max cratio 7.00, min 0.08, average 2.11 over 75 tests method SORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests method NEARSORT: max cratio 0.86, min 0.08, average 0.56 over 75 tests method DITHER: max cratio 4.06, min 0.16, average 0.74 over 75 tests Overall: average cratio 0.90 over 525 tests and here are the results using glibc's qsort, which of course isn't quicksort at all but some kind of merge sort: distribution SAWTOOTH: max cratio 0.90, min 0.49, average 0.65 over 105 tests distribution RAND: max cratio 0.91, min 0.49, average 0.76 over 105 tests distribution STAGGER: max cratio 0.92, min 0.49, average 0.70 over 105 tests distribution PLATEAU: max cratio 0.84, min 0.49, average 0.54 over 105 tests distribution SHUFFLE: max cratio 0.64, min 0.49, average 0.52 over 105 tests method COPY: max cratio 0.92, min 0.49, average 0.66 over 75 tests method REVERSE: max cratio 0.92, min 0.49, average 0.68 over 75 tests method FREVERSE: max cratio 0.92, min 0.49, average 0.67 over 75 tests method BREVERSE: max cratio 0.92, min 0.49, average 0.68 over 75 tests method SORT: max cratio 0.49, min 0.49, average 0.49 over 75 tests method NEARSORT: max cratio 0.55, min 0.49, average 0.51 over 75 tests method DITHER: max cratio 0.92, min 0.50, average 0.74 over 75 tests Overall: average cratio 0.63 over 525 tests PPS: final version of test framework attached for the archives.
Attachment
Tom Lane <tgl@sss.pgh.pa.us> writes: > and here are the results using glibc's qsort, which of course isn't > quicksort at all but some kind of merge sort: > ... > Overall: average cratio 0.63 over 525 tests That looks better both on average and in the worst case. Are the time constants that much worse that the merge sort still takes longer? -- greg
Greg Stark <gsstark@mit.edu> writes: > That looks better both on average and in the worst case. Are the time > constants that much worse that the merge sort still takes longer? Keep in mind that this is only counting the number of comparison-function calls; it's not accounting for any other effects. In particular, for a large sort operation quicksort might win because of its more cache-friendly memory access patterns. The whole question of our qsort vs the system library's qsort probably needs to be revisited, however, now that we've identified and fixed this particular performance issue. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > Greg Stark <gsstark@mit.edu> writes: > > That looks better both on average and in the worst case. Are the time > > constants that much worse that the merge sort still takes longer? > > Keep in mind that this is only counting the number of > comparison-function calls; it's not accounting for any other effects. > In particular, for a large sort operation quicksort might win because of > its more cache-friendly memory access patterns. My question explicitly recognized that possibility. I'm just a little skeptical since the comparison function in Postgres is often not some simple bit of tightly optimized C code, but rather a complex locale sensitive comparison function or even a bit of SQL expression to evaluate. Cache effectiveness is may be a minimal factor anyways when the comparison is executing more than a minimal amount of code. And one extra comparison is going to cost a lot more too. -- greg
Greg Stark <gsstark@mit.edu> writes: > My question explicitly recognized that possibility. I'm just a little > skeptical since the comparison function in Postgres is often not some simple > bit of tightly optimized C code, but rather a complex locale sensitive > comparison function or even a bit of SQL expression to evaluate. Yeah, I'd guess the same way, but OTOH at least a few people have reported that our qsort code is consistently faster than glibc's (and that was before this fix). See this thread: http://archives.postgresql.org/pgsql-hackers/2005-12/msg00610.php Currently I believe that we only use our qsort on Solaris, not any other platform, so if you think that glibc's qsort is better then you've already got your wish. It seems to need more investigation though. In particular, I'm thinking that the various adjustments we've made to the sort support code over the past month probably invalidate any previous testing of the point, and that we ought to go back and redo those comparisons. regards, tom lane