Re: Macro customizable hashtable / bitmapscan & aggregation perf - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Macro customizable hashtable / bitmapscan & aggregation perf
Date
Msg-id ec2f8c1d-329b-e7b6-1406-f74af789c1f0@2ndquadrant.com
In response to Re: Macro customizable hashtable / bitmapscan & aggregation perf  (Andres Freund <andres@anarazel.de>)
Responses Re: Macro customizable hashtable / bitmapscan & aggregation perf  (Andres Freund <andres@anarazel.de>)
Re: Macro customizable hashtable / bitmapscan & aggregation perf  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On 10/02/2016 02:17 AM, Andres Freund wrote:
> Hi,
> ...
>
>>>> I think a crucial part of the benchmarking will be identifying and
>>>> measuring corner cases, e.g. a lot of rows with the same key, etc.
>>>> Although that probably is not a major issue for the two places
>>>> switched to the new implementation (e.g. execGrouping merges the
>>>> duplicates to a single group, by definition).
>>>
>>> Hm. I don't really see a case where that's going to cause issues - all
>>> the execGrouping.c users only store one key, and either ignore later
>>> ones, or add the data to the initial tuple in the group.  I don't really
>>> see any case where repeated keys should cause an issue for a hash table?
>>>
>>
>> Yep, that's pretty much what I suggested. I was wondering more about the
>> other places where this hash table might be used - I've been thinking about
>> hashjoins, but now that I think of it - that's a fairly specialized and
>> tuned code. In any case, let's not complicate the discussion for now.
>
> My point is that I don't really see any potential usecase where
> that's an issue.
>

OK, I think we agree the two places modified by the submitted patch 
series are fine. Let's leave discussion about places modified by future 
patches for the future ;-)

>
>> A few comments, after quickly skimming through the first two patches:
>>
>> 1) SH_CREATE does this:
>>
>>     /* round up size to the next power of 2, eases lookups */
>>     if (size < 2)
>>         size = 2;
>>     else
>>         size = sh_pow2_int(size);
>>
>> I very much doubt a hash table with 2 buckets is very useful. What about
>> using some reasonable lower boundary - IIRC hashjoins use 1024 buckets,
>> which might be a bit too much, but perhaps 256 would work?
>
> For some cases that'll waste a good bit of space. Consider
> e.g. catcache.c. Callers can (and do) specify more appropriate sizes for
> the individual uses.
>

Hmmm, OK. I have my doubts about those hardcoded constants, but you're 
right, 256 is probably overkill.
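For context, the power-of-2 rounding SH_CREATE relies on can be sketched 
like this (sh_pow2() here is an illustrative stand-in for the patch's 
sh_pow2_int(), not the actual implementation):

```c
#include <stdint.h>

/* Illustrative sketch of rounding a requested size up to the next
 * power of 2, as SH_CREATE does via sh_pow2_int() in the patch. */
static uint64_t
sh_pow2(uint64_t num)
{
	uint64_t	result = 1;

	/* keep doubling until we reach or exceed the requested size */
	while (result < num)
		result <<= 1;

	return result;
}
```

Rounding to a power of 2 lets lookups use a mask (hash & (size - 1)) 
instead of a modulo.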

>
>> 5) SH_RESIZE
>>
>> Do we expect to shrink hash tables?
>
> Not at the moment. Should be doable, but I'm not sure it's worth
> developing and testing - I don't see any usecases in pg atm.
>

OK, sure. Then what about adding this to SH_RESIZE?

Assert(oldsize <= newsize);

I assumed we might shrink the catcache after invalidation, for example 
(but maybe I don't really know how that works).
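A minimal sketch of where that assertion would sit, using a toy 
open-addressing table (illustrative types and names, not the actual 
SH_RESIZE macro body):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy open-addressing table; 0 marks an empty bucket. */
typedef struct
{
	uint32_t	size;		/* bucket count, always a power of 2 */
	uint32_t   *buckets;
} ToyTab;

static void
toy_resize(ToyTab *tab, uint32_t newsize)
{
	uint32_t	oldsize = tab->size;
	uint32_t   *oldbuckets = tab->buckets;

	/* the suggested guard: the table only grows, never shrinks */
	assert(oldsize <= newsize);

	tab->size = newsize;
	tab->buckets = calloc(newsize, sizeof(uint32_t));

	/* reinsert every live entry under the new mask */
	for (uint32_t i = 0; i < oldsize; i++)
	{
		uint32_t	val = oldbuckets[i];
		uint32_t	pos;

		if (val == 0)
			continue;
		pos = val & (newsize - 1);
		while (tab->buckets[pos] != 0)
			pos = (pos + 1) & (newsize - 1);
		tab->buckets[pos] = val;
	}
	free(oldbuckets);
}
```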

>
>> It's not entirely clear why is it guaranteed that there's always
>> an element with optimal position (when load factor < 1)? Adding an
>>  explanation or a link to a paper would be helpful, I guess.
>
> As the load factor always has to be below 1, there always has to be
> an empty bucket. Every bucket after an empty bucket has to be at
> its optimal position.
>

Hmmm, thanks to moving entries backward after a delete? Adding an 
assert() enforcing this seems like a good idea.
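To make the invariant concrete: here is a toy linear-probing table with 
backward-shift deletion and a checker for exactly that property 
(everything below is illustrative, not the patch's code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy linear-probing table (keys > 0, 0 = empty); for simplicity the
 * "hash" of a key is just key & TOYMASK. */
#define TOYSIZE 8
#define TOYMASK (TOYSIZE - 1)

static uint32_t toy[TOYSIZE];

static void
toy_insert(uint32_t key)
{
	uint32_t	pos = key & TOYMASK;

	while (toy[pos] != 0)
		pos = (pos + 1) & TOYMASK;
	toy[pos] = key;
}

/* Delete with backward shifting: pull each displaced successor back
 * one slot, so no hole is left inside a probe cluster. */
static void
toy_delete(uint32_t key)
{
	uint32_t	pos = key & TOYMASK;

	while (toy[pos] != key)
		pos = (pos + 1) & TOYMASK;

	for (;;)
	{
		uint32_t	next = (pos + 1) & TOYMASK;
		uint32_t	nkey = toy[next];

		/* stop at an empty bucket or an entry already in place */
		if (nkey == 0 || (nkey & TOYMASK) == next)
		{
			toy[pos] = 0;
			break;
		}
		toy[pos] = nkey;
		pos = next;
	}
}

/* The claimed invariant: any entry directly following an empty bucket
 * sits at its optimal (hash) position. */
static bool
toy_invariant_holds(void)
{
	for (uint32_t i = 0; i < TOYSIZE; i++)
	{
		uint32_t	next = (i + 1) & TOYMASK;

		if (toy[i] == 0 && toy[next] != 0 &&
			(toy[next] & TOYMASK) != next)
			return false;
	}
	return true;
}
```

A plain "zero the bucket" delete would leave a hole in the middle of the 
cluster and break the invariant; the backward shift is what keeps it.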

>
>> 7) Should hash agg size estimation for the planner consider the fillfactor?
>>
>> I think we should account for fillfactor - we should account for the
>> allocated memory, even if there's free space in the hash table. After all,
>> that's what hashjoins do.
>
> We didn't really for dynahash (as it basically assumed a fillfactor of 1
> all the time), not sure if changing it won't also cause issues.
>

That's a case of glass half-full vs. half-empty, I guess. If we assumed 
load factor 1.0, then I see it as accounting for the load factor (while 
you see it as not accounting for it).

I don't see why this should cause issues - of course, the hash table may 
be a bit larger, exceed work_mem and make the tidbitmap lossy, or switch 
HashAggregate to GroupAggregate. But I don't think that's a major issue, 
as those decisions depend on estimates anyway, so we can't really 
guarantee them.

Also, with load factor 0.8 the table is only ~25% larger, so even if we 
don't include that in work_mem it's a reasonably small error (easily 
smaller than the errors in the pre-9.5 HashJoin accounting, which did 
not include chunk headers etc.).

So I won't fight for this, but I don't see a reason not to account for it.
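For what it's worth, the arithmetic behind that overhead estimate, as a 
trivial sketch (nothing PostgreSQL-specific here):

```c
/* Back-of-envelope: how much larger is the allocated table than the
 * data it holds, for a given fillfactor?  With N entries the table
 * allocates N / fillfactor buckets, so the relative overhead is
 * (1 / fillfactor - 1). */
static double
overhead_pct(double fillfactor)
{
	return (1.0 / fillfactor - 1.0) * 100.0;
}
```

At fillfactor 0.8 this gives 25% overhead; at 0.9 it would be about 11%.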

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


