Re: a few crazy ideas about hash joins - Mailing list pgsql-hackers

From Tom Lane
Subject Re: a few crazy ideas about hash joins
Date
Msg-id 22570.1238793056@sss.pgh.pa.us
In response to Re: a few crazy ideas about hash joins  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: a few crazy ideas about hash joins  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Apr 3, 2009 at 4:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Correct, but you've got the details all wrong.  The real problem is that
>> the planner might discard a join path hash(A,B) at level 2 because it
>> loses compared to, say, merge(A,B).  But when we get to level three,
>> perhaps hash(hash(A,B),C) would've been the best plan due to synergy
>> of the two hashes.  We'll never find that out unless we keep the
>> "inferior" hash path around.  We can certainly do that; the question
>> is what's it going to cost us to allow more paths to survive to the
>> next join level.  (And I'm afraid the answer may be "plenty"; it's a
>> combinatorial explosion we're looking at here.)

> That would be crazy.  I think doing it the way I suggested is correct,
> just not guaranteed to catch every case.  The reality is that even if
> we took Greg Stark's suggestion of detecting this situation only in
> the executor, we'd still get some benefit out of this.  If we take my
> intermediate approach, we'll catch more cases where this is a win.
> What you're suggesting here would catch every conceivable case, but at
> the expense of what I'm sure would be an unacceptable increase in
> planning time for very limited benefit.

Maybe, maybe not.  I've seen plenty of plans that have several
mergejoins stacked up on top of each other with no intervening sorts.
There is 0 chance that the planner would have produced that if it
thought that it had to re-sort at each level; something else would have
looked cheaper.  I think that your proposals will end up getting very
little of the possible benefit, because the planner will fail to choose
plan trees in which the optimization can be exploited.
        regards, tom lane
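
The pruning problem described in the quoted text can be illustrated with a toy model. This is only a sketch: the cost numbers, the `SYNERGY_DISCOUNT` constant, and the function names are invented for illustration and have nothing to do with PostgreSQL's actual planner code or cost model. It compares a search that keeps only the cheapest level-2 path with one that lets every level-2 path survive to level 3.

```python
# Toy illustration only: all names and numbers are hypothetical,
# not taken from PostgreSQL's planner.

# Estimated costs of the two candidate join paths for (A,B) at level 2.
LEVEL2 = {"hash(A,B)": 110.0, "merge(A,B)": 100.0}

HASH_TOP_COST = 50.0     # assumed cost of hash-joining C on top of any input
SYNERGY_DISCOUNT = 30.0  # assumed saving when the input is itself a hash join


def level3_cost(inner_name, inner_cost):
    """Cost of hash(<inner>, C) under the toy model."""
    cost = inner_cost + HASH_TOP_COST
    if inner_name.startswith("hash("):
        cost -= SYNERGY_DISCOUNT
    return cost


def best_plan(keep_all):
    """Return (plan, cost) for the cheapest level-3 plan.

    keep_all=False keeps only the cheapest level-2 path;
    keep_all=True lets the "inferior" path survive as well.
    """
    if keep_all:
        candidates = dict(LEVEL2)
    else:
        cheapest = min(LEVEL2, key=LEVEL2.get)
        candidates = {cheapest: LEVEL2[cheapest]}
    plans = {
        f"hash({name},C)": level3_cost(name, cost)
        for name, cost in candidates.items()
    }
    winner = min(plans, key=plans.get)
    return winner, plans[winner]


print(best_plan(keep_all=False))
print(best_plan(keep_all=True))
```

With these made-up numbers, the pruned search settles on a plan built over merge(A,B), while the exhaustive search finds the cheaper hash-over-hash plan. The price is that every additional surviving path multiplies the work at the next level, which is the combinatorial concern raised above.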

