Re: Explanation for bug #13908: hash joins are badly broken - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Explanation for bug #13908: hash joins are badly broken
Msg-id 22997.1454792149@sss.pgh.pa.us
In response to Re: Explanation for bug #13908: hash joins are badly broken  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 02/06/2016 06:47 PM, Tom Lane wrote:
>> I note also that while the idea of ExecHashRemoveNextSkewBucket is
>> to reduce memory consumed by the skew table to make it available to
>> the main hash table, in point of fact it's unlikely that the freed
>> space will be of any use at all, since it will be in tuple-sized
>> chunks far too small for dense_alloc's requests. So the spaceUsed
>> bookkeeping being done there is an academic exercise unconnected to
>> reality, and we need to rethink what the space management plan is.

> I don't follow. Why would these three things (sizes of allocations in 
> skew buckets, chunks in dense allocator and accounting) be related?

Well, what we're trying to do is ensure that the total amount of space
used by the hashjoin table doesn't exceed spaceAllowed.  My point is
that it's kind of cheating to ignore space used-and-then-freed if your
usage pattern is such that that space isn't likely to be reusable.
A freed skew tuple represents space that would be reusable for another
skew tuple, but is probably *not* reusable for the main hash table;
so treating that space as interchangeable is wrong, I think.
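
The accounting concern can be made concrete with a toy model. This is only a sketch, not PostgreSQL code: the struct, the function names, and the 32768-byte chunk size are all hypothetical, and it assumes the worst case in which freed tuple-sized fragments can never satisfy a chunk-sized request.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, NOT PostgreSQL code. */
#define TOY_CHUNK_SIZE 32768    /* assumed size of one main-table chunk request */

typedef struct ToyAccounting
{
    size_t space_allowed;       /* budget for the whole hash table */
    size_t space_used;          /* bytes the shared counter treats as live */
    size_t freed_fragments;     /* bytes freed in tuple-sized pieces */
} ToyAccounting;

/*
 * Free one skew tuple. The shared counter drops, but the memory comes
 * back as a small fragment rather than as chunk-sized space.
 */
static void
toy_free_skew_tuple(ToyAccounting *a, size_t tuple_size)
{
    a->space_used -= tuple_size;
    a->freed_fragments += tuple_size;
}

/* What the shared counter claims: is there room for one more chunk? */
static bool
toy_counter_allows_chunk(const ToyAccounting *a)
{
    return a->space_used + TOY_CHUNK_SIZE <= a->space_allowed;
}

/*
 * Footprint after allocating one more chunk, under the assumption that
 * the fragments stay held by the allocator and cannot back the chunk.
 */
static size_t
toy_footprint_after_chunk(const ToyAccounting *a)
{
    return a->space_used + a->freed_fragments + TOY_CHUNK_SIZE;
}
```

In this model the counter can say a new chunk fits while the footprint, counting the unusable fragments, ends up above space_allowed. That gap is the "cheating" described above.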

I'm not entirely sure where to go with that thought, but maybe the
answer is that we should just treat the skew table and main table
storage pools as entirely separate with independent limits.  That's
not what's happening right now, though.
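
As a rough illustration of what "entirely separate with independent limits" could look like, here is a minimal sketch. Every name in it is hypothetical and nothing here is taken from nodeHash.c; it only shows that space released by one pool does not become budget for the other.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch, NOT PostgreSQL code: one budget per storage pool. */
typedef struct ToyPool
{
    size_t allowed;             /* this pool's own limit */
    size_t used;                /* bytes currently charged to this pool */
} ToyPool;

typedef struct ToyHashSpace
{
    ToyPool skew;               /* skew-table tuples */
    ToyPool main_table;         /* main hash table chunks */
} ToyHashSpace;

/* Charge n bytes to a pool; refuse if that pool alone would overflow. */
static bool
toy_pool_reserve(ToyPool *p, size_t n)
{
    if (n > p->allowed - p->used)
        return false;
    p->used += n;
    return true;
}

/* Return n bytes to the pool they came from, and only to that pool. */
static void
toy_pool_release(ToyPool *p, size_t n)
{
    p->used -= (n > p->used) ? p->used : n;
}
```

With this layout, freeing skew tuples makes room for more skew tuples, but a main-table request is checked only against main_table.allowed, so freed skew space is never counted as available to the main table.
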
        regards, tom lane


