Re: a few crazy ideas about hash joins - Mailing list pgsql-hackers

From Tom Lane
Subject Re: a few crazy ideas about hash joins
Date
Msg-id 21826.1238790565@sss.pgh.pa.us
Whole thread Raw
In response to Re: a few crazy ideas about hash joins  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: a few crazy ideas about hash joins  (Robert Haas <robertmhaas@gmail.com>)
Re: a few crazy ideas about hash joins  (Dimitri Fontaine <dfontaine@hi-media.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> I don't see why hash_inner_and_outer can't walk the outer path looking
> for suitable hashes to reuse.  I think the question is how aggressive
> we want to be in performing that search.

Correct, but you've got the details all wrong.  The real problem is that
the planner might discard a join path hash(A,B) at level 2 because it
loses compared to, say, merge(A,B).  But when we get to level three,
perhaps hash(hash(A,B),C) would've been the best plan due to synergy
of the two hashes.  We'll never find that out unless we keep the
"inferior" hash path around.  We can certainly do that; the question
is what's it going to cost us to allow more paths to survive to the
next join level.  (And I'm afraid the answer may be "plenty"; it's a
combinatorial explosion we're looking at here.)

>> Incidentally a similar optimization is possible for materialize or
>> even sorts.

The planner already goes to great lengths to avoid extra sorts for
multiple levels of mergejoin ... that's what all the "pathkey" thrashing
is about.  It's pretty expensive.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: a few crazy ideas about hash joins
Next
From: Alvaro Herrera
Date:
Subject: Re: reloptions with a "namespace"