Re: Hash vs. HashJoin nodes - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Hash vs. HashJoin nodes
Date
Msg-id 1931.1112243843@sss.pgh.pa.us
In response to Hash vs. HashJoin nodes  (Neil Conway <neilc@samurai.com>)
Responses Re: Hash vs. HashJoin nodes  (Christopher Kings-Lynne <chriskl@familyhealth.com.au>)
Re: Hash vs. HashJoin nodes  (Neil Conway <neilc@samurai.com>)
List pgsql-hackers
Neil Conway <neilc@samurai.com> writes:
> ... I'm wondering if there is any value to maintaining the hash
> vs. hash join distinction in the first place.)

One small objection is that we'd lose the ability to separately display
the time spent building the hash table in EXPLAIN ANALYZE output.  It's
probably not super important, but might be a reason to keep two plan
nodes in the tree.

I recall having looked at related ideas (not this one exactly) and being
discouraged by the fact that pulling a tuple from *either* input first
can be a losing strategy, since either input might have a very
high startup cost.  You could possibly ameliorate that by comparing the
estimated startup costs for the two inputs and pulling from the
estimated-cheaper one first.
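
A minimal sketch of that comparison, purely for illustration.  The
PlanInput class, its field names, and choose_first_input() are
hypothetical and do not correspond to any actual PostgreSQL structure or
function; the cost numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class PlanInput:
    """Hypothetical stand-in for one join input; not a PostgreSQL struct."""
    name: str
    startup_cost: float  # estimated cost before the first tuple comes back
    total_cost: float    # estimated cost to return every tuple


def choose_first_input(outer: PlanInput, inner: PlanInput) -> PlanInput:
    """Return the input with the lower estimated startup cost.

    Ties go to the outer input, which is an arbitrary choice for this sketch.
    """
    if inner.startup_cost < outer.startup_cost:
        return inner
    return outer


# Made-up estimates: the outer side must finish a sort before it can
# return anything, while the inner side can return a tuple immediately.
outer = PlanInput("outer (sort)", startup_cost=5000.0, total_cost=5200.0)
inner = PlanInput("inner (seq scan)", startup_cost=0.0, total_cost=35.0)

first = choose_first_input(outer, inner)
print(first.name)
```

This only covers the choice itself.  It relies on the estimates being
reasonably accurate, and it says nothing about what the join should do
after that first pull.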

This could all get pretty hairy when you consider that it has to still
work for left joins too ...
        regards, tom lane

