Re: qsort, once again - Mailing list pgsql-hackers

From Tom Lane
Subject Re: qsort, once again
Date
Msg-id 21426.1142544039@sss.pgh.pa.us
Whole thread Raw
In response to Re: qsort, once again  ("Jonah H. Harris" <jonah.harris@gmail.com>)
List pgsql-hackers
"Jonah H. Harris" <jonah.harris@gmail.com> writes:
> On 3/16/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So we still have a problem of software archaeology: who added the
>> insertion sort switch to the NetBSD version, and on what grounds?

> AFAICS, the insertion sort was added in BSD 4.4-lite and was inherited by
> NetBSD in CVS version 1.1.1.2.

> The previous version in NetBSD (before 4.4-lite) also included an insertion
> sort with the comment:

> /*
>  * Knuth, Vol. 3, page 116, Algorithm Q, step b, argues that a single pass
>  * of straight insertion sort after partitioning is complete is better than
>  * sorting each small partition as it is created.  This isn't correct in th=
> is
>  * implementation because comparisons require at least one (and often two)
>  * function calls and are likely to be the dominating expense of the sort.
>  * Doing a final insertion sort does more comparisons than are necessary
>  * because it compares the "edges" and medians of the partitions which are
>  * known to be already sorted.

I think this is talking about replacing the bottom-level insertion sorts
(the "if n < 7" case at the top of the function) with a single
insertion-sort pass over the whole array at the end of the recursion.
Not really the same consideration.  Bentley & McIlroy specifically
reject that idea for the same reasons mentioned in this comment, but
I see no discussion in the paper of the swap_cnt idea in the BSD code.

I just repeated the random-data test I made here:
http://archives.postgresql.org/pgsql-hackers/2006-02/msg00590.php
using a version of qsort.c with the "if (swap_cnt == 0)" segment
removed entirely (and hence also removing the computation of swap_cnt,
which saves a cycle or two in the inner loops).  CVS tip seems to be
faster at this than what we had on Feb 15, so I also repeated the
test using the unmodified qsort.c.  Here's the results:

100 runs with CVS-tip qsort.c, sorted into ascending runtime order:

Time: 337.520 ms
Time: 338.335 ms
Time: 338.990 ms
Time: 339.116 ms
Time: 339.282 ms
Time: 339.398 ms
Time: 339.457 ms
Time: 339.623 ms
Time: 339.880 ms
Time: 339.929 ms
Time: 340.258 ms
Time: 340.266 ms
Time: 340.431 ms
Time: 341.042 ms
Time: 341.053 ms
Time: 341.224 ms
Time: 341.506 ms
Time: 341.851 ms
Time: 343.298 ms
Time: 344.750 ms
Time: 347.685 ms
Time: 350.444 ms
Time: 354.230 ms
Time: 361.638 ms
Time: 383.529 ms
Time: 386.735 ms
Time: 389.246 ms
Time: 389.675 ms
Time: 395.183 ms
Time: 406.176 ms
Time: 408.327 ms
Time: 444.531 ms
Time: 507.625 ms
Time: 513.625 ms
Time: 527.424 ms
Time: 530.318 ms
Time: 537.642 ms
Time: 549.844 ms
Time: 554.505 ms
Time: 561.502 ms
Time: 590.159 ms
Time: 606.344 ms
Time: 661.434 ms
Time: 686.666 ms
Time: 697.818 ms
Time: 717.949 ms
Time: 786.298 ms
Time: 843.092 ms
Time: 855.292 ms
Time: 869.975 ms
Time: 874.331 ms
Time: 887.904 ms
Time: 1027.981 ms
Time: 1053.258 ms
Time: 1101.546 ms
Time: 1304.076 ms
Time: 1318.657 ms
Time: 1743.875 ms
Time: 1986.987 ms
Time: 2045.737 ms
Time: 2192.382 ms
Time: 2263.969 ms
Time: 2328.861 ms
Time: 2381.596 ms
Time: 2389.563 ms
Time: 2452.888 ms
Time: 2503.476 ms
Time: 2575.836 ms
Time: 2597.095 ms
Time: 2645.894 ms
Time: 2686.030 ms
Time: 2932.877 ms
Time: 3009.602 ms
Time: 3077.843 ms
Time: 3106.678 ms
Time: 3213.136 ms
Time: 3279.999 ms
Time: 3674.831 ms
Time: 3707.489 ms
Time: 3765.390 ms
Time: 3782.216 ms
Time: 3972.455 ms
Time: 3974.928 ms
Time: 4344.905 ms
Time: 4800.817 ms
Time: 4837.485 ms
Time: 4944.734 ms
Time: 5172.095 ms
Time: 5434.390 ms
Time: 5599.830 ms
Time: 5608.934 ms
Time: 5684.670 ms
Time: 5740.895 ms
Time: 6112.534 ms
Time: 7521.372 ms
Time: 7609.963 ms
Time: 7636.002 ms
Time: 8090.015 ms
Time: 8805.475 ms
Time: 9244.463 ms

100 runs with modified qsort.c:

Time: 378.068 ms
Time: 379.056 ms
Time: 379.485 ms
Time: 379.486 ms
Time: 379.612 ms
Time: 379.832 ms
Time: 379.981 ms
Time: 380.643 ms
Time: 380.824 ms
Time: 380.870 ms
Time: 380.920 ms
Time: 381.215 ms
Time: 381.441 ms
Time: 381.475 ms
Time: 381.848 ms
Time: 381.895 ms
Time: 381.969 ms
Time: 381.993 ms
Time: 382.034 ms
Time: 382.202 ms
Time: 382.361 ms
Time: 382.585 ms
Time: 382.692 ms
Time: 382.734 ms
Time: 382.929 ms
Time: 382.985 ms
Time: 383.107 ms
Time: 383.335 ms
Time: 383.505 ms
Time: 383.514 ms
Time: 383.635 ms
Time: 383.674 ms
Time: 383.704 ms
Time: 383.744 ms
Time: 383.821 ms
Time: 383.832 ms
Time: 383.918 ms
Time: 383.922 ms
Time: 384.125 ms
Time: 384.127 ms
Time: 384.394 ms
Time: 384.500 ms
Time: 384.517 ms
Time: 384.517 ms
Time: 384.590 ms
Time: 384.615 ms
Time: 384.620 ms
Time: 384.678 ms
Time: 384.705 ms
Time: 384.715 ms
Time: 384.786 ms
Time: 384.957 ms
Time: 385.386 ms
Time: 385.537 ms
Time: 385.602 ms
Time: 385.685 ms
Time: 385.800 ms
Time: 385.832 ms
Time: 385.845 ms
Time: 385.846 ms
Time: 385.910 ms
Time: 386.094 ms
Time: 386.270 ms
Time: 386.360 ms
Time: 386.420 ms
Time: 386.841 ms
Time: 387.040 ms
Time: 387.121 ms
Time: 387.173 ms
Time: 387.196 ms
Time: 387.310 ms
Time: 387.421 ms
Time: 387.494 ms
Time: 387.548 ms
Time: 387.563 ms
Time: 388.201 ms
Time: 388.270 ms
Time: 389.499 ms
Time: 389.770 ms
Time: 389.941 ms
Time: 390.113 ms
Time: 390.347 ms
Time: 390.846 ms
Time: 392.564 ms
Time: 393.605 ms
Time: 394.892 ms
Time: 395.656 ms
Time: 396.319 ms
Time: 396.585 ms
Time: 412.060 ms
Time: 421.903 ms
Time: 477.489 ms
Time: 516.069 ms
Time: 563.814 ms
Time: 571.577 ms
Time: 584.821 ms
Time: 689.520 ms
Time: 705.267 ms
Time: 779.866 ms
Time: 841.646 ms

So at least on randomized data, the swap_cnt thing is a serious loser.
Need to run some tests on special-case inputs though.  Anyone have a
test suite they like?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Separate BLCKSZ for data and logging
Next
From: "Dann Corbit"
Date:
Subject: Re: qsort, once again