Re: Hash vs. HashJoin nodes - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Hash vs. HashJoin nodes
Date
Msg-id 2135.1112245417@sss.pgh.pa.us
In response to Re: Hash vs. HashJoin nodes  (Neil Conway <neilc@samurai.com>)
Responses Re: Hash vs. HashJoin nodes  ("Jim C. Nasby" <decibel@decibel.org>)
List pgsql-hackers
Neil Conway <neilc@samurai.com> writes:
> I think this tweak would be universally better than the existing code.

Yes, but you miss the point: there's a case where the existing code
performs poorly and your tweak doesn't improve it, namely when the
inner query has a high startup cost and the outer query is empty.  If
you pulled from the outer query first, you could avoid the inner
startup cost entirely.
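
The case described above can be illustrated with a minimal sketch. This is not PostgreSQL's executor code; the names hash_join_outer_first, scan_outer, scan_inner, and expensive_inner are hypothetical, and the "startup cost" is only simulated by recording that the inner scan was started. The idea is to fetch one outer row before building the hash table, so an empty outer side never triggers the inner startup.

```python
from itertools import chain


def hash_join_outer_first(scan_outer, scan_inner, outer_key, inner_key):
    """Inner hash join that fetches one outer row before touching the inner side.

    scan_outer and scan_inner are hypothetical zero-argument callables that
    start a scan and return an iterator of rows.
    """
    outer_rows = iter(scan_outer())
    try:
        first = next(outer_rows)
    except StopIteration:
        return  # outer side is empty: the inner scan is never started

    # Outer side is non-empty, so the hash table is needed after all.
    table = {}
    for row in scan_inner():
        table.setdefault(inner_key(row), []).append(row)

    # Probe with the row already fetched, then with the rest of the outer side.
    for outer_row in chain([first], outer_rows):
        for inner_row in table.get(outer_key(outer_row), []):
            yield outer_row, inner_row


inner_starts = []


def expensive_inner():
    inner_starts.append("started")  # stands in for a high startup cost
    return iter([(1, "a"), (2, "b")])


# Empty outer side: the join yields nothing and expensive_inner is never called.
empty = list(hash_join_outer_first(lambda: iter([]), expensive_inner,
                                   lambda r: r[0], lambda r: r[0]))
```

The trade-off in this sketch is that the first outer row has to be held until the hash table is built, which is why it is chained back in front of the remaining outer rows.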

>> This could all get pretty hairy when you consider that it has to still
>> work for left joins too ...

> Right; I was planning to bail and only do this for inner joins.

Well, for outer joins the optimal strategy is simple: pull from the
outer query first.  If it's empty then you needn't touch the inner
query at all.  Otherwise you have to build the hash table.
        regards, tom lane
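
A rough sketch of the left-join strategy described in the message, under the same caveats: left_hash_join, scan_inner, and the key functions are hypothetical names for illustration, not the actual executor interface. For a left join an empty inner side saves nothing, because every outer row must still be emitted null-extended, so the only shortcut is the empty-outer check made before the hash table is built.

```python
_NO_ROW = object()  # sentinel marking an exhausted outer side


def left_hash_join(outer_rows, scan_inner, outer_key, inner_key):
    """Left hash join that pulls from the outer side first."""
    outer_iter = iter(outer_rows)
    outer_row = next(outer_iter, _NO_ROW)
    if outer_row is _NO_ROW:
        return  # empty outer: no output rows, inner side never scanned

    # At least one outer row exists, so the hash table has to be built.
    table = {}
    for row in scan_inner():
        table.setdefault(inner_key(row), []).append(row)

    while outer_row is not _NO_ROW:
        matches = table.get(outer_key(outer_row))
        if matches:
            for inner_row in matches:
                yield outer_row, inner_row
        else:
            yield outer_row, None  # null-extend an unmatched outer row
        outer_row = next(outer_iter, _NO_ROW)
```

Calling it with an empty outer list returns no rows without ever invoking scan_inner; with a non-empty outer list, unmatched outer rows come back paired with None.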

