Re: Memory-Bounded Hash Aggregation - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Memory-Bounded Hash Aggregation |
Date | |
Msg-id | 20190711155520.zqipi23o3yybx7ym@development Whole thread Raw |
In response to | Re: Memory-Bounded Hash Aggregation (Jeff Davis <pgsql@j-davis.com>) |
Responses |
Re: Memory-Bounded Hash Aggregation
|
List | pgsql-hackers |
On Wed, Jul 03, 2019 at 07:03:06PM -0700, Jeff Davis wrote: >On Wed, 2019-07-03 at 02:17 +0200, Tomas Vondra wrote: >> What does "partitioned hash strategy" do? It's probably explained in >> one >> of the historical discussions, but I'm not sure which one. I assume >> it >> simply hashes the group keys and uses that to partition the data, and >> then >> passing it to hash aggregate. > >Yes. When spilling, it is cheap to partition on the hash value at the >same time, which dramatically reduces the need to spill multiple times. >Previous discussions: > > >> Unfortunately the second link does not work :-( > >It's supposed to be: > > >https://www.postgresql.org/message-id/CAGTBQpa__-NP7%3DkKwze_enkqw18vodRxKkOmNhxAPzqkruc-8g%40mail.gmail.com > > >> I'm not going to block Approach 1, althought I'd really like to see >> something that helps with array_agg. > >I have a WIP patch that I just posted. It doesn't yet work with >ARRAY_AGG, but I think it can be made to work by evicting the entire >hash table, serializing the transition states, and then later combining >them. > >> Aren't all three approaches a way to "fix" hash aggregate? In any >> case, >> it's certainly reasonable to make incremental changes. The question >> is >> whether "approach 1" is sensible step towards some form of "approach >> 3" > >Disk-based hashing certainly seems like a reasonable algorithm on paper >that has some potential advantages over sorting. It certainly seems >sensible to me that we explore the disk-based hashing strategy first, >and then we would at least know what we are missing (if anything) by >going with the hybrid approach later. > >There's also a fair amount of design space to explore in the hybrid >strategy. That could take a while to converge, especially if we don't >have anything in place to compare against. > Makes sense. I haven't thought about how the hybrid approach would be implemented very much, so I can't quite judge how complicated would it be to extend "approach 1" later. But if you think it's a sensible first step, I trust you. And I certainly agree we need something to compare the other approaches against. >> > * It means we have a hash table and sort running concurrently, each >> > using memory. Andres said this might not be a problem[3], but I'm >> > not convinced that the problem is zero. If you use small work_mem >> > for the write phase of sorting, you'll end up with a lot of runs >> > to >> > merge later and that has some kind of cost. >> > >> >> Why would we need to do both concurrently? I thought we'd empty the >> hash >> table before doing the sort, no? > >So you are saying we spill the tuples into a tuplestore, then feed the >tuplestore through a tuplesort? Seems inefficient, but I guess we can. > I think the question is whether we see this as "emergency fix" (for cases that are misestimated and could/would fail with OOM at runtime), or as something that is meant to make "hash agg" more widely applicable. I personally see it as an emergency fix, in which cases it's perfectly fine if it's not 100% efficient, assuming it kicks in only rarely. Effectively, we're betting on hash agg, and from time to time we lose. But even if we see it as a general optimization technique it does not have to be perfectly efficient, as long as it's properly costed (so the planner only uses it when appropriate). If we have a better solution (in terms of efficiency, code complexity, etc.) then sure - let's use that. But considering we've started this discussion in ~2015 and we still don't have anything, I wouldn't hold my breath. Let's do something good enough, and maybe improve it later. >> Do we actually need to handle that case? How many such aggregates are >> there? I think it's OK to just ignore that case (and keep doing what >> we do >> now), and require serial/deserial functions for anything better. > >Punting on a few cases is fine with me, if the user has a way to fix >it. > +1 to doing that regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: