David Rowley <dgrowleyml@gmail.com> writes:
> The problem is that out of the 3 methods PostgreSQL uses to join
> tables, only 1 of them supports join conditions with an OR clause.
> Merge Join cannot do this because results can only be ordered one way
> at a time. Hash Join technically could do this, but it would require
> that it built multiple hash tables. Currently, it only builds one
> table. That leaves Nested Loop as the join method to implement joins
> with OR clauses. Unfortunately, nested loops are quadratic and the
> join condition must be evaluated once per each cartesian product row.
We can do better than that if the OR'd conditions are each amenable
to an index scan on one of the tables: then it can be a nestloop with
a bitmap-OR'd inner index scan. I thought the upthread advice to
convert the substr() condition into something that could be indexed
was on-point.
> Tom Lane did start some work [1] to allow the planner to convert some
> queries to use UNION instead of evaluating OR clauses, but, if I
> remember correctly, it didn't handle ORs in join conditions, though
> perhaps having it do that would be a natural phase 2. I don't recall
> why the work stopped.
As I recall, I was having difficulty convincing myself that
de-duplication of results (for cases where the same row satisfies
more than one of the OR'd conditions) would work correctly.
You can't just blindly make it a UNION because that might remove
identical rows that *should* appear more than once in the result.
regards, tom lane