Re: a few crazy ideas about hash joins - Mailing list pgsql-hackers

From Robert Haas
Subject Re: a few crazy ideas about hash joins
Date
Msg-id 603c8f070904030503m18ff520m65595dbe3a65fa7d@mail.gmail.com
Whole thread Raw
In response to Re: a few crazy ideas about hash joins  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: a few crazy ideas about hash joins  (Kenneth Marshall <ktm@rice.edu>)
List pgsql-hackers
On Fri, Apr 3, 2009 at 1:48 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Robert Haas wrote:
>>
>> While investigating some performance problems recently I've had cause
>> to think about the way PostgreSQL uses hash joins.  So here are a few
>> thoughts.  Some of these have been brought up before.
>>
>> 1. When the hash is not expected to spill to disk, it preserves the
>> pathkeys of the outer side of the join.  If the optimizer were allowed
>> to assume that, it could produce significantly more efficient query
>> plans in some cases.  The problem is what to do if we start executing
>> the query and find out that we have more stuff to hash than we expect,
>> such that we need multiple batches?  Now the results won't be sorted.
>> I think we could handle this as follows: Don't count on the hash join
>> to preserve pathkeys unless it helps, and only rely on it when it
>> seems as if the hash table will still fit even if it turns out to be,
>> say, three times as big as expected.  But if you are counting on the
>> hash join to preserve pathkeys, then pass that information to the
>> executor.  When the executor is asked to perform a hash join, it will
>> first hash the inner side of the relation.  At that point, we know
>> whether we've succesfully gotten everything into a single batch, or
>> not.  If we have, perform the join normally.  If the worst has
>> happened and we've gone multi-batch, then perform the join and sort
>> the output before returning it.  The performance will suck, but at
>> least you'll get the right answer.
>>
>> Previous in-passing reference to this idea here:
>> http://archives.postgresql.org/pgsql-hackers/2008-09/msg00806.php
>
> Hmm, instead of a sorting the output if the worst happens, a final merge
> step as in a merge sort would be enough.

Yeah - good point.

>> 2. Consider building the hash table lazily.  I often see query planner
>> pick a hash join over a nested inner indexscan because it thinks that
>> it'll save enough time making hash probes rather than index probes to
>> justify the time spent building the hash table up front.  But
>> sometimes the relation that's being hashed has a couple thousand rows,
>> only a tiny fraction of which will ever be retrieved from the hash
>> table.  We can predict when this is going to happen because n_distinct
>> for the outer column will be much less than the size of the inner rel.
>>  In that case, we could consider starting with an empty hash table
>> that effectively acts as a cache.  Each time a value is probed, we
>> look it up in the hash table.  If there's no entry, we use an index
>> scan to find the matching rows and insert them into the hash table.
>> Negative results must also be cached.
>
> Yeah, that would be quite nice. One problem is that our ndistinct estimates
> are not very accurate.

Well, the right solution to that problem is to fix our ndistinct estimates.  :-)

>> 3. Avoid building the exact same hash table twice in the same query.
>> This happens more often you'd think.  For example, a table may have
>> two columns creator_id and last_updater_id which both reference person
>> (id).  If you're considering a hash join between paths A and B, you
>> could conceivably check whether what is essentially a duplicate of B
>> has already been hashed somewhere within path A.  If so, you can reuse
>> that same hash table at zero startup-cost.
>
> That seems like a quite simple thing to do. But would it work for a
> multi-batch hash table?

I think not.

>> 4. As previously discussed, avoid hashing for distinct and then
>> hashing the results for a hash join on the same column with the same
>> operators.
>
> This seems essentially an extension of idea 3.

In theory, yes, but in practice, this case is easy to detect for an
arbitrary inner rel, and the other case is not.

Off to JDCon...

...Robert


pgsql-hackers by date:

Previous
From: shrish purohit
Date:
Subject: Expression based index
Next
From: Greg Stark
Date:
Subject: Re: Abwesend: [GENERAL] string_to_array with empty input