On Mon, Sep 11, 2017 at 8:50 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Sep 11, 2017 at 11:47 AM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> The question is what is the optimal replacement_sort_tuples value? I
>> assume it's the number of tuples that effectively uses CPU caches, at
>> least that's what our docs say. So I think you're right it to 1B rows
>> may break this assumption, and make it perform worse.
>>
>> But perhaps the fact that we're testing with multiple work_mem values,
>> and with smaller data sets (100k or 1M rows) makes this a non-issue?
>
> I am not sure that's the case -- I think that before Peter's changes
> it was pretty easy to find cases where lowering work_mem made sorting
> ordered data go faster.
Before enhancements to merging by both Heikki and myself went into
Postgres 10, there were cases where replacement selection of the first
run did relatively well, and cases where it did badly despite managing
to avoid a merge. The former were cases where work_mem was low, and
the latter were cases where it was high. This was true because the
heap used is cache inefficient. When it was small/cache efficient, and
when there was ordering to exploit, replacement selection could win.
Your recollection there sounds accurate to me.
These days, even a small, cache-efficient heap is generally not good
enough to beat quicksorting, since merging attained the ability to
exploit presortedness itself within commit 24598337c, and memory
utilization for merging got better, too. Very roughly speaking,
merging attained the same advantage that replacement selection had all
along, and replacement selection lost all ability to compete.
-- 
Peter Geoghegan
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers