Re: 9.5: Better memory accounting, towards memory-bounded HashAgg - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: 9.5: Better memory accounting, towards memory-bounded HashAgg
Date
Msg-id CAMkU=1wguPocxMU+2dcLctrVFJgA4hB-EP8=xxB6JhA6CgsccA@mail.gmail.com
Whole thread Raw
In response to Re: 9.5: Better memory accounting, towards memory-bounded HashAgg  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On Sun, Dec 28, 2014 at 12:45 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Sun, Dec 28, 2014 at 12:37 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> Do others have similar numbers? I'm quite surprised at how little
> work_mem seems to matter for these plans (HashJoin might be a different
> story though). I feel like I made a mistake -- can someone please do a
> sanity check on my numbers?

I have seen external sorts that were quicker than internal sorts
before. With my abbreviated key patch, under certain circumstances
external sorts are faster, while presumably the same thing is true of
int4 attribute sorts today. Actually, I saw a 10MB work_mem setting
that was marginally faster than a multi-gigabyte one that fit the
entire sort in memory. It probably has something to do with caching
effects dominating over the expense of more comparisons, since higher
work_mem settings that still resulted in an external sort were slower
than the 10MB setting.

I was surprised by this too, but it has been independently reported by
Jeff Janes.

I don't recall (at the moment) seeing our external sort actually faster than quick-sort, but I've very reliably seen external sorts get faster with less memory than with more.  It is almost certainly a CPU caching issue.  Very large simple binary heaps are horrible on the CPU cache. And for sort-by-reference values, quick sort is also pretty bad.  

With a slow enough data bus between the CPU and main memory, I don't doubt that a 'tapesort' with small work_mem could actually be faster than quicksort with large work_mem.  But I don't recall seeing it myself.  But I'd be surprised that a tapesort as currently implemented would be faster than a quicksort if the tapesort is using just one byte less memory than the quicksort is.

But to Jeff Davis's question, yes, tapesort is not very sensitive to work_mem, and to the extent it is sensitive it is in the other direction of more memory being bad.  Once work_mem is so small that it takes multiple passes over the data to do the merge, then small memory would really be a problem.  But on modern hardware you have to get pretty absurd settings before that happens.

Cheers,

Jeff

pgsql-hackers by date:

Previous
From: Luis Menina
Date:
Subject: Re: pg_ctl {start,restart,reload} bad handling of stdout file descriptor
Next
From: Heikki Linnakangas
Date:
Subject: Re: Documentation of bt_page_items()'s ctid field