Re: 9.5: Memory-bounded HashAgg - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: 9.5: Memory-bounded HashAgg
Msg-id 53FFA705.5010209@fuzzy.cz
In response to Re: 9.5: Memory-bounded HashAgg  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: 9.5: Memory-bounded HashAgg
List pgsql-hackers
On 26.8.2014 21:38, Jeff Davis wrote:
> On Tue, 2014-08-26 at 12:39 +0300, Heikki Linnakangas wrote:
>> I think this is enough for this commitfest - we have consensus on
>> the design. For the next one, please address those open items, and
>> resubmit.
> 
> Agreed, return with feedback.
> 
> I need to get the accounting patch in first, which needs to address 
> some performance issues, but there's a chance of wrapping those up 
> quickly.

Sounds good to me.

I'd like to coordinate our efforts on this a bit, if you're interested.

I've been working on the hashjoin-like batching approach PoC (because I
proposed it, so it's fair I do the work), and I came to the conclusion
that it's pretty much impossible to implement on top of dynahash. I
ended up replacing it with a hashtable (similar to the one in the
hashjoin patch, unsurprisingly), which supports the batching approach
well, and is more memory efficient and actually faster (I see ~25%
speedup in most cases, although YMMV).

I plan to address this in 4 patches:

(1) replacement of dynahash by the custom hash table (done)

(2) memory accounting (not sure what your plan is; I've used the
    approach I proposed on 23/8 for now, with a few bugfixes/cleanups)

(3) applying your HashWork patch on top of this (I have this mostly
    completed, but need to do more testing over the weekend)

(4) extending this with the batching I proposed, initially only for
    aggregates with states that we can serialize/deserialize easily
    (e.g. types passed by value) - I'd like to hack on this next week

So at this point I have (1) and (2) pretty much ready, (3) is almost
complete and I plan to start hacking on (4). Also, this does not address
the open items listed in your message.
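
To make the batching idea in (4) a bit more concrete, here is a rough Python
sketch of hashjoin-like batching applied to a SUM aggregate. It is not taken
from any of the patches: the function name, the max_groups limit and the
in-memory "spill" lists (standing in for temp files) are all assumptions made
for illustration. The aggregate state is a plain number, i.e. the "passed by
value" case, so a serialized state and a raw input value can be combined the
same way.

```python
from collections import defaultdict


def batched_hash_sum(rows, max_groups):
    """Return {key: sum(values)} using a size-limited table plus spill batches.

    Hypothetical sketch: max_groups is a soft limit on the number of groups
    kept in the in-memory table at once.
    """
    if max_groups < 1:
        raise ValueError("max_groups must be at least 1")

    nbatch = 1                   # number of batches, doubled when the table is full
    spilled = defaultdict(list)  # batch number -> [(key, partial_sum)]
    results = {}
    cur = 0                      # batch currently being aggregated
    source = rows

    while cur < nbatch:
        table = {}
        for key, value in source:
            batch = hash(key) % nbatch
            if batch != cur:
                # Belongs to a later batch: set it aside for now.
                spilled[batch].append((key, value))
                continue
            is_new = key not in table
            table[key] = table.get(key, 0) + value
            if is_new and len(table) > max_groups:
                # Table is over the limit: double the number of batches and
                # move groups that no longer map to the current batch,
                # writing out their partial states.
                nbatch *= 2
                moved = [k for k in table if hash(k) % nbatch != cur]
                for k in moved:
                    spilled[hash(k) % nbatch].append((k, table.pop(k)))
        # Everything left in the table is final for this batch.
        results.update(table)
        cur += 1
        source = spilled.pop(cur, [])

    return results
```

Because the batch number is hash % nbatch and nbatch only doubles, a group can
only move to a later batch, never to an earlier one, so each group is emitted
exactly once. One doubling may not evict anything if the remaining groups all
land in the current batch, so the limit is only approximate in this sketch.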


But I agree this is more complex than the patch you proposed. So if you
choose to pursue your patch, I have no problem with that - I'll then
rebase my changes on top of your patch and submit them separately.


regards
Tomas


