Home > mailing lists

Re: 9.5: Memory-bounded HashAgg - Mailing list pgsql-hackers

From	Jeff Davis
Subject	Re: 9.5: Memory-bounded HashAgg
Date	August 21, 2014 07:49:53
Msg-id	1408607390.2335.247.camel@jeff-desktop Whole thread Raw
In response to	Re: 9.5: Memory-bounded HashAgg (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: 9.5: Memory-bounded HashAgg
List	pgsql-hackers

Tree view

On Wed, 2014-08-20 at 14:32 -0400, Robert Haas wrote:
> Well, I think you're certainly right that a hash table lookup is more
> expensive than modulo on a 32-bit integer; so much is obvious.  But if
> the load factor is not too large, I think that it's not a *lot* more
> expensive, so it could be worth it if it gives us other advantages.
> As I see it, the advantage of Jeff's approach is that it doesn't
> really matter whether our estimates are accurate or not.  We don't
> have to decide at the beginning how many batches to do, and then
> possibly end up using too much or too little memory per batch if we're
> wrong; we can let the amount of memory actually used during execution
> determine the number of batches.  That seems good.  Of course, a hash
> join can increase the number of batches on the fly, but only by
> doubling it, so you might go from 4 batches to 8 when 5 would really
> have been enough.  And a hash join also can't *reduce* the number of
> batches on the fly, which might matter a lot.  Getting the number of
> batches right avoids I/O, which is a lot more expensive than CPU.

My approach uses partition counts that are powers-of-two also, so I
don't think that's a big differentiator. In principle my algorithm could
adapt to other partition counts, but I'm not sure how big of an
advantage there is.

I think the big benefit of my approach is that it doesn't needlessly
evict groups from the hashtable. Consider input like 0, 1, 0, 2, ..., 0,
N. For large N, if you evict group 0, you have to write out about N
tuples; but if you leave it in, you only have to write out about N/2
tuples. The hashjoin approach doesn't give you any control over
eviction, so you only have about 1/P chance of saving the skew group
(where P is the ultimate number of partitions). With my approach, we'd
always keep the skew group in memory (unless we're very unlucky, and the
hash table fills up before we even see the skew value).

Regards,Jeff Davis

pgsql-hackers by date:

From: Heikki Linnakangas
Date: 21 August 2014, 07:42:35
Subject: Re: [TODO] Process pg_hba.conf keywords as case-insensitive

From: Julien Rouhaud
Date: 21 August 2014, 08:18:29
Subject: [TODO] Track number of files ready to be archived in pg_stat_archiver

Re: 9.5: Memory-bounded HashAgg - Mailing list pgsql-hackers

Previous

Next