Re: Memory-Bounded Hash Aggregation - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Memory-Bounded Hash Aggregation
Date
Msg-id 20190704142417.5lw4qzpog6vpue2b@development
In response to Re: Memory-Bounded Hash Aggregation  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Wed, Jul 03, 2019 at 07:03:06PM -0700, Jeff Davis wrote:
>On Wed, 2019-07-03 at 02:17 +0200, Tomas Vondra wrote:
>> What does "partitioned hash strategy" do? It's probably explained in
>> one of the historical discussions, but I'm not sure which one. I
>> assume it simply hashes the group keys, uses that to partition the
>> data, and then passes each partition to hash aggregate.
>
>Yes. When spilling, it is cheap to partition on the hash value at the
>same time, which dramatically reduces the need to spill multiple times.
>Previous discussions:
>
>
>> Unfortunately the second link does not work :-(
>
>It's supposed to be:
>
>
>https://www.postgresql.org/message-id/CAGTBQpa__-NP7%3DkKwze_enkqw18vodRxKkOmNhxAPzqkruc-8g%40mail.gmail.com
>
>
>> I'm not going to block Approach 1, although I'd really like to see
>> something that helps with array_agg.
>
>I have a WIP patch that I just posted. It doesn't yet work with
>ARRAY_AGG, but I think it can be made to work by evicting the entire
>hash table, serializing the transition states, and then later combining
>them.
>
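
The evict/serialize/combine idea can be modelled like this. A hedged Python sketch, not the WIP patch itself: for an ARRAY_AGG-style aggregate whose transition state is a list, the whole hash table is evicted, each state is serialized to disk-friendly bytes, and partial states for the same key are later deserialized and combined.

```python
# Hypothetical sketch of evict-everything-and-combine-later, for an
# ARRAY_AGG-like aggregate whose transition state is a Python list.
import pickle

def evict_and_serialize(hash_table):
    """Serialize every transition state, then empty the hash table."""
    spilled = [(key, pickle.dumps(state)) for key, state in hash_table.items()]
    hash_table.clear()
    return spilled

def combine_spilled(batches):
    """Deserialize partial states and combine those for the same key."""
    combined = {}
    for batch in batches:
        for key, blob in batch:
            state = pickle.loads(blob)
            # The "combine" step here is list concatenation, matching
            # ARRAY_AGG semantics (order across batches is not guaranteed).
            combined.setdefault(key, []).extend(state)
    return combined

batch1 = evict_and_serialize({"x": [1, 2], "y": [3]})
batch2 = evict_and_serialize({"x": [4]})
final = combine_spilled([batch1, batch2])
```

This mirrors why serial/deserial and combine support is the sticking point: an aggregate without those functions has no safe way to round-trip its state through disk.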
>> Aren't all three approaches a way to "fix" hash aggregate? In any
>> case, it's certainly reasonable to make incremental changes. The
>> question is whether "approach 1" is a sensible step towards some form
>> of "approach 3".
>
>Disk-based hashing certainly seems like a reasonable algorithm on paper
>that has some potential advantages over sorting. It certainly seems
>sensible to me that we explore the disk-based hashing strategy first,
>and then we would at least know what we are missing (if anything) by
>going with the hybrid approach later.
>
>There's also a fair amount of design space to explore in the hybrid
>strategy. That could take a while to converge, especially if we don't
>have anything in place to compare against.
>

Makes sense. I haven't thought much about how the hybrid approach would
be implemented, so I can't quite judge how complicated it would be to
extend "approach 1" later. But if you think it's a sensible first step,
I trust you. And I certainly agree we need something to compare the other
approaches against.


>> > * It means we have a hash table and sort running concurrently, each
>> >   using memory. Andres said this might not be a problem[3], but I'm
>> >   not convinced that the problem is zero. If you use small work_mem
>> >   for the write phase of sorting, you'll end up with a lot of runs
>> >   to merge later and that has some kind of cost.
>> >
>>
>> Why would we need to do both concurrently? I thought we'd empty the
>> hash table before doing the sort, no?
>
>So you are saying we spill the tuples into a tuplestore, then feed the
>tuplestore through a tuplesort? Seems inefficient, but I guess we can.
>
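
The spill-then-sort sequencing can be sketched in a few lines. A rough Python model, not executor code: spilled tuples are appended to an unordered store (the "tuplestore"), and only after the hash table has been emptied are they sorted by group key (the "tuplesort" pass) so equal keys become adjacent for sort-based aggregation.

```python
# Rough model of "spill to a tuplestore, then feed it through a
# tuplesort": appends are cheap and unordered; sorting happens once,
# after the hash phase, so the two never hold memory concurrently.
import itertools

spill_store = []                      # stand-in for a tuplestore

def spill(tup):
    spill_store.append(tup)           # cheap append, no ordering yet

def sorted_group_sums():
    """Sort spilled tuples and aggregate runs of equal keys."""
    ordered = sorted(spill_store, key=lambda t: t[0])   # the "tuplesort" pass
    return {k: sum(v for _, v in grp)
            for k, grp in itertools.groupby(ordered, key=lambda t: t[0])}

for t in [("b", 1), ("a", 2), ("b", 3)]:
    spill(t)
result = sorted_group_sums()
```

The inefficiency Jeff notes is the extra write/read cycle: tuples hit disk once in the store and again in the sort, rather than being written once into sorted runs.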

I think the question is whether we see this as an "emergency fix" (for
cases that are misestimated and could/would fail with OOM at runtime), or
as something meant to make hash agg more widely applicable.

I personally see it as an emergency fix, in which case it's perfectly
fine if it's not 100% efficient, assuming it kicks in only rarely.
Effectively, we're betting on hash agg, and from time to time we lose.

But even if we see it as a general optimization technique, it does not
have to be perfectly efficient, as long as it's properly costed (so the
planner only uses it when appropriate).

If we have a better solution (in terms of efficiency, code complexity,
etc.) then sure - let's use that. But considering this discussion started
back in ~2015 and we still don't have anything, I wouldn't hold my
breath. Let's do something good enough, and maybe improve it later.

>> Do we actually need to handle that case? How many such aggregates are
>> there? I think it's OK to just ignore that case (and keep doing what
>> we do
>> now), and require serial/deserial functions for anything better.
>
>Punting on a few cases is fine with me, if the user has a way to fix
>it.
>

+1 to doing that


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



