"Dave Held" <dave.held@arrayservicesgrp.com> writes:
> My completely amateur guess is that the planner is able to use
> Merge Join and Hash Join on your contrived queries because you
> are only trying to join one field to a single value (i.e.:
> operator=). But the BETWEEN clause is what forces the Nested
> Loop. You can see that here:
Yeah --- both merge and hash join are only usable for equality joins.
(Thinking about it, it seems possible that mergejoin could be extended
to work for range joins, but we're certainly far from being able to
do that today.) So the basic alternatives the planner has are nestloops
with either postcode on the outside, or data_main on the outside. The
postcode-on-the-outside case would be plausible with an index on
data_main.range, but Arjen didn't have one. The data_main-on-the-outside
case could only use an index if the index was range-query-capable, which
a 2-column btree index isn't. Given the small size of the postcodes
table it's not real clear that an index probe would be much of a win
anyway over a simple sequential scan.
Comparing the nestloop case to the hash case does make one think that
there's an awful lot of overhead somewhere, though. Two int2
comparisons ought not take very long :-(. Arjen, are you interested
in getting a gprof profile of what the backend is doing in the nestloop
-with-materialize plan? Or if you don't want to mess with it, please
send me the data off-list and I'll run a profile.
regards, tom lane