Re: Using quicksort for every external sort run - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Using quicksort for every external sort run
Date
Msg-id 27716590-56b5-baf0-71ee-d27f309c1acc@2ndquadrant.com
In response to Re: Using quicksort for every external sort run  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Using quicksort for every external sort run  (Peter Geoghegan <pg@heroku.com>)
Re: Using quicksort for every external sort run  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

So, let me sum up the current status as I understand it.


1) overall, the patch seems to be a clear performance improvement

There are far more "green" cells than "red" ones in the spreadsheets, 
and the patch often shaves 30-75% off the sort duration. The 
improvements are pretty much across the board, for all data sets 
(low/high/unique cardinality, initial ordering) and data types.


2) it's unlikely we can further improve the remaining regressions

The regressions are limited to low work_mem settings, which we believe 
are not representative (or at least not as representative as the higher 
work_mem values), for two main reasons.

Firstly, if you need to sort a lot of data (e.g. the 10M rows used in 
the benchmark), it's quite reasonable to use a larger work_mem value. 
It'd be a bit backwards to reject a patch that gives you a 2-4x speedup 
with enough memory, on the grounds that it may have a negative impact 
with unreasonably small work_mem values.
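For illustration, raising work_mem for just the session running the big 
sort is trivial; the table, column and value below are made up:

    SET work_mem = '256MB';                     -- arbitrary, per-session
    SELECT * FROM big_table ORDER BY sort_col;  -- hypothetical 10M-row sort
    RESET work_mem;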

Secondly, master is faster only when there's enough CPU cache for the 
replacement selection heap (memtuples), but the benchmark is not 
realistic in this respect, as it ran only one query at a time and thus 
had the whole cache to itself (6MB on the i5, 12MB on the Xeon).

In reality there will be multiple processes running at the same time 
(e.g. backends executing a parallel query), significantly reducing the 
amount of cache available to each process, making replacement selection 
inefficient and thus eliminating the regressions (by making master 
slower too).
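A back-of-the-envelope illustration (the process count is mine, not 
from the benchmark): with 8 backends sharing the Xeon's cache,

    12MB cache / 8 backends = 1.5MB per backend

which is nowhere near enough for a heap sized by the 16MB default 
discussed below.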


3) replacement_sort_mem GUC

I'm not quite sure what the plan with this GUC is. It was useful for 
development, but it seems pretty difficult to tune in practice 
(especially if you don't know the internals, which users generally 
don't).

The current patch puts the new GUC right next to work_mem, which seems 
rather unfortunate - I do expect users to simply mess with it, assuming 
"more is better", which seems like a rather poor idea.

So I think we should either remove the GUC entirely, or move it to the 
developer section next to trace_sort (and remove it from 
postgresql.conf).

I'm also wondering whether the 16MB default is not a bit too much. As 
explained above, that's more cache than we can expect per process, so 
maybe ~2-4MB would be a better default value?
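For completeness, this is what tuning it would look like (the GUC name 
comes from the patch, the value is the ~2-4MB suggestion above):

    -- per-session override
    SET replacement_sort_mem = '4MB';

    -- or scoped to a single transaction
    BEGIN;
    SET LOCAL replacement_sort_mem = '4MB';
    -- ... run the sort-heavy query here ...
    COMMIT;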

Also, now that I'm re-reading the docs for the GUC, I realize it also 
depends on how correlated the input data is - that seems like a rather 
useless criterion for tuning, though, because it varies per sort node, 
so using it for a GUC set in postgresql.conf does not seem very wise. 
Actually, even on a per-query basis it's rather dubious, as it depends 
on how the sort node receives its data (some nodes preserve ordering, 
some don't).

BTW couldn't we tune the value automatically for each sort, using 
pg_stats.correlation for the sort keys when available (increasing 
replacement_sort_mem when the correlation is close to 1.0)? Wouldn't 
that address at least some of the regressions?
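To illustrate which statistic I mean - this is just the planner's 
existing physical-ordering estimate (table and column names are made 
up):

    SELECT tablename, attname, correlation
      FROM pg_stats
     WHERE tablename = 'big_table'  -- hypothetical table
       AND attname = 'sort_col';    -- hypothetical sort key

A correlation close to 1.0 means the physical row order already roughly 
matches the sort key, which is exactly the case where replacement 
selection produces long runs.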


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


