Re: B-Tree support function number 3 (strxfrm() optimization) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: B-Tree support function number 3 (strxfrm() optimization)
Date
Msg-id CAM3SWZSxEUJJKcxBCHFNUf=8Yx582Rwgk=8T1xSpdBD3VK5Xgg@mail.gmail.com
Whole thread Raw
In response to Re: B-Tree support function number 3 (strxfrm() optimization)  (Peter Geoghegan <pg@heroku.com>)
Responses Re: B-Tree support function number 3 (strxfrm() optimization)
List pgsql-hackers
I've looked at another (admittedly sympathetic) dataset that is
publicly available: the flickr "tags" dataset [1]. I end up with a
single table of tags; it's a large enough table, at 388 MB, but the
tuples are not very wide. There are 7,656,031 tags/tuples.

Predictably enough, this query is very fast when an internal sort is
used on a patched Postgres:

select * from (select * from tags order by tag offset 100000000) ii;

Git master takes about 25 seconds to execute the query. Patched takes
about 6.8 seconds. That seems very good, but this is not really new
information.

However, with work_mem set low enough to get an external sort, the
difference is more interesting. If I set work_mem to 10 MB, then the
query takes about 10.7 seconds to execute with a suitably patched
Postgres. Whereas on master, it consistently takes a full 69 seconds.
That's the largest improvement I've seen so far, for any case.

I must admit that this did surprise me, but then I don't grok tape
sort. What's particularly interesting here is that when work_mem is
cranked up to 512MB, which is a high setting, but still not high
enough to do an internal sort, the difference closes in a bit. Instead
of 41 runs, there are only 2. Patched now takes 16.3 seconds.
Meanwhile, master is somewhat improved, and consistently takes 65
seconds to complete the sort.

This probably has something to do with CPU cache effects. I believe
that all world class external sorting algorithms are cache sensitive.
I'm not sure what the outcome would have been had there not been a
huge amount of memory available for the OS cache to use, which there
was. I think there is probably something to learn about how to improve
tape sort here.

Does anyone recall hearing complaints around higher work_mem settings
regressing performance?

[1]
http://www.isi.edu/integration/people/lerman/load.html?src=http://www.isi.edu/~lerman/downloads/flickr/flickr_taxonomies.html
, bottom of page
-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Append to a GUC parameter ?
Next
From: Alvaro Herrera
Date:
Subject: Re: Minmax indexes