Ed Loehr <ELOEHR@austin.rr.com> writes:
> 1) The only Pgsql alternative join strategies to nested-loop joins are merge
> join and hash join.
Correct...
> 2) Merge join only makes sense if the data is physically ordered by the join
> keys, and there is almost always a natural entropy away from physical sort
> order.
> Therefore, it generally makes sense to use only hash join.
Not so. A merge join can be built atop either ordered-index-scans of
the inputs, or explicitly sorted input. Postgres' cost estimates are
done for both of these cases; if the optimizer thinks that merge join
is cheapest then it probably is.
> Can I configure psql to use only hash joins?
You could try PGOPTIONS="-fn -fm" to forbid both nestloop and merge
joins, but I wouldn't really recommend it. You'll be taking enough
of a performance hit from not using nestloop when it's cheapest;
disabling mergejoin as well doesn't seem like a good idea. Really
these options are intended as optimizer debugging aids, not as settings
that users should keep in place for long periods of time.
For the record, the other switches in this family are
-fh forbid hashjoin-fs forbid sequential scan-fi forbid indexed scan
Note that -fs/-fi are for individual scans and thus don't compete
with -fn/-fm/-fh for join methods. Also, -fs and -fn are not 100%
lockouts, since the optimizer will use those methods anyway if it
has no other choice (eg, -fs is ineffective if there's no index to
do an indexscan with).
regards, tom lane