Re: B-Tree support function number 3 (strxfrm() optimization) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: B-Tree support function number 3 (strxfrm() optimization)
Date
Msg-id CAM3SWZRvK3jCpnewxM9OMKW_JvyheaC45u6sFcQQ9E1vt=9sqQ@mail.gmail.com
Whole thread Raw
In response to Re: B-Tree support function number 3 (strxfrm() optimization)  (Peter Geoghegan <pg@heroku.com>)
Responses Re: B-Tree support function number 3 (strxfrm() optimization)
List pgsql-hackers
On Tue, Aug 5, 2014 at 9:32 PM, Peter Geoghegan <pg@heroku.com> wrote:
> I knew that I'd heard that at least once. Apparently some other
> database systems have external sorts that tend to be faster than
> equivalent internal sorts. I'd guess that that is an artifact of their
> having a substandard internal sort, though.

This *almost* applies to patched Postgres if you pick a benchmark that
is very sympathetic to my patch. To my surprise, work_mem = '10MB'
(which results in an external tape sort) is sometimes snapping at the
heels of a work_mem = '5GB' setting (which results in an in-memory
quicksort).

I have a 338 MB table, that consists or a single text column of 8 byte
strings strings, with high cardinality. I ran VACUUM FREEZE, and took
all the usual precautions of that kind. On the test table n_distinct =
-1, and there is no discernible physical/logical correlation.

The external sort case stabilized as follows:

LOG:  duration: 9731.776 ms  statement: select * from (select * from
besttest order by tt offset 10000000) i;
LOG:  duration: 9742.948 ms  statement: select * from (select * from
besttest order by tt offset 10000000) i;
LOG:  duration: 9733.918 ms  statement: select * from (select * from
besttest order by tt offset 10000000) i;

The in-memory case stabilized as follows:

LOG:  duration: 0.059 ms  statement: set work_mem = '5GB';
LOG:  duration: 9665.731 ms  statement: select * from (select * from
besttest order by tt offset 10000000) i;
LOG:  duration: 9602.841 ms  statement: select * from (select * from
besttest order by tt offset 10000000) i;
LOG:  duration: 9609.107 ms  statement: select * from (select * from
besttest order by tt offset 10000000) i;

FWIW, master performed as follows with work_mem = '10MB':

LOG:  duration: 60456.943 ms  statement: select * from (select * from
besttest order by tt offset 10000000) i;
LOG:  duration: 60368.987 ms  statement: select * from (select * from
besttest order by tt offset 10000000) i;
LOG:  duration: 61223.942 ms  statement: select * from (select * from
besttest order by tt offset 10000000) i;

And master did quite a lot better with work_mem = '5GB', in a way that
fits with my prejudices about how quicksort is supposed to perform
relative to tape sort:

LOG:  duration: 0.060 ms  statement: set work_mem = '5GB';
LOG:  duration: 41697.659 ms  statement: select * from (select * from
besttest order by tt offset 10000000) i;
LOG:  duration: 41755.496 ms  statement: select * from (select * from
besttest order by tt offset 10000000) i;
LOG:  duration: 41883.888 ms  statement: select * from (select * from
besttest order by tt offset 10000000) i;

work_mem = '10MB' continues to be the external sort sweet spot for
this hardware, for whatever reason - I can add a few seconds to
execution time by increasing or decreasing the setting a bit. I'm
using an Intel Core i7-3520M CPU @ 2.90GHz, with a /proc/cpuinfo
reported L3 cache size of 4096 KB. I have been very careful to take
into account power saving features throughout all
experimentation/benchmarking of this patch and previous abbreviated
key patches - failing to do so is a good way to end up with complete
garbage when investigating this kind of thing.

Anyway, I'm not sure what this tells us about quicksort and tape sort,
but I think there might be an interesting and more general insight to
be gained here. I'd have thought that tape sort wastes memory
bandwidth by copying to operating system buffers to the extent that
things are slowed down considerably (this is after all a test
performed with lots of memory available, even when work_mem = '10
MB'). And even if that wasn't a significant factor, I'd expect
quicksort to win decisively anyway. Why does this happen?

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: psql: show only failed queries
Next
From: Michael Paquier
Date:
Subject: Re: Fixed redundant i18n strings in json