Re: a few crazy ideas about hash joins - Mailing list pgsql-hackers

From Robert Haas
Subject Re: a few crazy ideas about hash joins
Msg-id 603c8f070904031402y5a47cd48w92cd09a59fa0d6f2@mail.gmail.com
In response to Re: a few crazy ideas about hash joins  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: a few crazy ideas about hash joins  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Fri, Apr 3, 2009 at 4:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I don't see why hash_inner_and_outer can't walk the outer path looking
>> for suitable hashes to reuse.  I think the question is how aggressive
>> we want to be in performing that search.
>
> Correct, but you've got the details all wrong.  The real problem is that
> the planner might discard a join path hash(A,B) at level 2 because it
> loses compared to, say, merge(A,B).  But when we get to level three,
> perhaps hash(hash(A,B),C) would've been the best plan due to synergy
> of the two hashes.  We'll never find that out unless we keep the
> "inferior" hash path around.  We can certainly do that; the question
> is what's it going to cost us to allow more paths to survive to the
> next join level.  (And I'm afraid the answer may be "plenty"; it's a
> combinatorial explosion we're looking at here.)

That would be crazy.  I think doing it the way I suggested is correct,
just not guaranteed to catch every case.  The reality is that even if
we took Greg Stark's suggestion of detecting this situation only in
the executor, we'd still get some benefit out of this.  If we take my
intermediate approach, we'll catch more cases where this is a win.
What you're suggesting here would catch every conceivable case, but at
the expense of what I'm sure would be an unacceptable increase in
planning time for very limited benefit.
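
To make the intermediate approach concrete, here is a toy sketch in
Python. It is not planner code: the Path class, its fields, and
find_reusable_hash are hypothetical names invented for illustration,
and real planner paths carry far more state. It only shows the shape
of the search (walk down the outer side of an already-chosen path
looking for a hash on the same key) and why the search misses the case
where level 2 kept merge(A,B) instead of hash(A,B).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Path:
    """Toy plan node (an assumption for illustration, not PostgreSQL's Path)."""
    kind: str                        # "scan", "hash", or "merge"
    label: str                       # human-readable name
    hash_key: Optional[str] = None   # key the hash table is built on, if any
    outer: Optional["Path"] = None
    inner: Optional["Path"] = None


def find_reusable_hash(path: Optional[Path], key: str) -> Optional[Path]:
    """Walk down the outer side of a path and return the first hash join
    keyed on `key`, or None if no such node exists."""
    while path is not None:
        if path.kind == "hash" and path.hash_key == key:
            return path
        path = path.outer
    return None


a = Path("scan", "A")
b = Path("scan", "B")

# Level 2 kept a hash join: the walk finds a hash table to reuse.
hash_ab = Path("hash", "hash(A,B)", hash_key="x", outer=a, inner=b)

# Level 2 kept a merge join instead: there is nothing to find,
# so this is one of the cases the search does not catch.
merge_ab = Path("merge", "merge(A,B)", outer=a, inner=b)

found = find_reusable_hash(hash_ab, "x")
missed = find_reusable_hash(merge_ab, "x")
print(found.label if found else None)
print(missed)
```

The walk is linear in the depth of the outer path, which is the point:
it adds a cheap check on paths that already survived, rather than
keeping extra paths alive at every join level.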

...Robert

