Thread: Sort performance

Sort performance

From
Gregory Stark
Date:
I'm not sure if this is good news or bad news. Either some kudos are due to
the gang that worked on the external sort performance or something's very
wrong with the qsort implementation in glibc because I'm seeing Postgres's
external sort perform better than qsort.

This is despite Postgres external sorts having to execute filesystem calls
pushing buffers back and forth between user-space and kernel-space, which
seems hard to believe. I feel like something's got to be pretty far wrong with
the qsort call here for this to be possible.

At first I chalked this up to qsort having O(n^2) behaviour occasionally but
a) This is glibc where qsort is actually mergesort which should behave pretty
similarly to Postgres's mergesort and b) the input data is randomized pretty
well so it really ought be a problem even were it qsort.
   Mem     Runs    Time   ----    ----    ----   1MB    18    8.25s   10MB    3    5.6s   100MB    qsort    6.1s

The input is a table with one column, a text field. It contains
/usr/share/dict/words ordered by random() and then repeated a bunch of times.
(Sorry about the imprecision, I set this table up a while ago and don't
remember exactly what I did). a

The machine has plenty of RAM and isn't swapping or running any other
services.


--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


Re: Sort performance

From
Tom Lane
Date:
Gregory Stark <stark@enterprisedb.com> writes:
>     Mem     Runs    Time
>     ----    ----    ----
>     1MB    18    8.25s
>     10MB    3    5.6s
>     100MB    qsort    6.1s

I'm confused what this means exactly?  Are you saying that in the first
two cases, there were 18 and 3 sorted runs generated in the initial
pass, and in the third case we just did the sort in memory using qsort?

How many items are being sorted, exactly?  Since it's text, it probably
also makes a big difference what LC_COLLATE setting you are using.
Non-C sort locale could mean that the strcoll() calls swamp all else.

How long does it take sort(1) to do the same task?
        regards, tom lane