Re: Explanation for bug #13908: hash joins are badly broken - Mailing list pgsql-hackers

From: Tomas Vondra
Subject: Re: Explanation for bug #13908: hash joins are badly broken
Msg-id: 56B66300.7000405@2ndquadrant.com
In response to: Re: Explanation for bug #13908: hash joins are badly broken (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Explanation for bug #13908: hash joins are badly broken
List: pgsql-hackers

On 02/06/2016 09:55 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On 02/06/2016 06:47 PM, Tom Lane wrote:
>>> I note also that while the idea of ExecHashRemoveNextSkewBucket is
>>> to reduce memory consumed by the skew table to make it available to
>>> the main hash table, in point of fact it's unlikely that the freed
>>> space will be of any use at all, since it will be in tuple-sized
>>> chunks far too small for dense_alloc's requests. So the spaceUsed
>>> bookkeeping being done there is an academic exercise unconnected to
>>> reality, and we need to rethink what the space management plan is.
>
>> I don't follow. Why would these three things (sizes of allocations in
>> skew buckets, chunks in dense allocator and accounting) be related?
>
> Well, what we're trying to do is ensure that the total amount of
> space used by the hashjoin table doesn't exceed spaceAllowed. My
> point is that it's kind of cheating to ignore space
> used-and-then-freed if your usage pattern is such that that space
> isn't likely to be reusable. A freed skew tuple represents space that
> would be reusable for another skew tuple, but is probably *not*
> reusable for the main hash table; so treating that space as
> interchangeable is wrong I think.

Ah, I see. And I agree that treating those areas as equal is wrong.

> I'm not entirely sure where to go with that thought, but maybe the
> answer is that we should just treat the skew table and main table
> storage pools as entirely separate with independent limits. That's
> not what's happening right now, though.

What about using dense allocation for the skew buckets too, but with one 
context per skew bucket instead of a single context for all of them? Then 
when we delete a bucket, we simply destroy its context and free its chunks 
in bulk, just like we do with the current dense allocator.
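
To make that a bit more concrete, here is a rough standalone sketch of 
what I mean: one chunked arena per skew bucket, so removing the bucket 
just walks its chunk list and gives everything back at once. This is 
illustrative only, not actual PostgreSQL code, and the names (SkewArena, 
skew_arena_alloc, ...) are made up:

/*
 * Illustrative sketch only -- not PostgreSQL code.  Each skew bucket gets
 * its own chunked arena, so dropping the bucket frees all of its tuple
 * storage in one go.  SkewArena / skew_arena_* are made-up names.
 */
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ARENA_CHUNK_SIZE (32 * 1024)    /* size of one allocation block */

typedef struct ArenaChunk
{
    struct ArenaChunk *next;            /* next chunk of this bucket */
    size_t      used;                   /* bytes already handed out */
    char        data[];                 /* tuple storage */
} ArenaChunk;

typedef struct SkewArena
{
    ArenaChunk *chunks;                 /* chunk list for one skew bucket */
} SkewArena;

/* allocate space for one tuple from the bucket's own arena */
static void *
skew_arena_alloc(SkewArena *arena, size_t size)
{
    ArenaChunk *chunk = arena->chunks;

    /* assumes tuples are far smaller than ARENA_CHUNK_SIZE */
    if (chunk == NULL || chunk->used + size > ARENA_CHUNK_SIZE)
    {
        chunk = malloc(offsetof(ArenaChunk, data) + ARENA_CHUNK_SIZE);
        if (chunk == NULL)
            exit(1);
        chunk->next = arena->chunks;
        chunk->used = 0;
        arena->chunks = chunk;
    }

    void *ptr = chunk->data + chunk->used;
    chunk->used += size;
    return ptr;
}

/* destroying the arena releases the whole bucket's storage in bulk */
static void
skew_arena_destroy(SkewArena *arena)
{
    ArenaChunk *chunk = arena->chunks;

    while (chunk != NULL)
    {
        ArenaChunk *next = chunk->next;
        free(chunk);
        chunk = next;
    }
    arena->chunks = NULL;
}

int
main(void)
{
    SkewArena   bucket = {NULL};

    /* store a few fake "tuples" in the bucket's private arena */
    for (int i = 0; i < 1000; i++)
        memset(skew_arena_alloc(&bucket, 64), 0, 64);

    /* removing the skew bucket == one bulk free of all its chunks */
    skew_arena_destroy(&bucket);
    puts("bucket removed, its space fully reclaimed");
    return 0;
}

In the real code this would of course be chunks hanging off each skew 
bucket rather than a separate struct, but it shows how the bulk free on 
bucket removal would work.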

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


