Shaun Thomas <sthomas@optionshouse.com> writes:
> A developer was complaining about a view he created to abstract an added
> column in a left join. ...
> Curious, I whipped up this test case:
> CREATE VIEW v_slow_view AS
> SELECT foo.*, tf.small_label IS NOT NULL AS has_small_label
> FROM foo
> LEFT JOIN tiny_foo tf USING (small_label);
> SELECT bar.*, foo.*, tf.small_label IS NOT NULL AS has_small_label
> FROM bar
> LEFT JOIN foo ON (foo.id = bar.foo_id)
> LEFT JOIN tiny_foo tf USING (small_label)
> WHERE bar.id IN (750, 1750, 2750)
> ORDER BY bar.id;
> SELECT bar.*, sv.*
> FROM bar
> LEFT JOIN v_slow_view sv ON (sv.id = bar.foo_id)
> WHERE bar.id IN (750, 1750, 2750)
> ORDER BY bar.id;
These queries are not actually equivalent. In the first one, it is
impossible for "has_small_label" to read out as NULL: it will either be
true or false. However, in the second one, the IS NOT NULL is evaluated
below the LEFT JOIN to "sv", and therefore it is required that the query
return NULL for "has_small_label" in any row where bar.foo_id lacks a
join partner.
To implement that behavior correctly, we're forced to form the
foo-to-tiny_foo join first, then do the left join with bar (which'll
replace RHS columns by nulls where necessary).
And that means that you get the inefficient plan wherein the
foo-to-tiny_foo join is computed in its entirety.
9.2 does this case better, by virtue of the "parameterized plan" stuff,
which exists specifically to let us use nestloop-with-inner-indexscan
plans even when there are some join order restrictions complicating
matters.
regards, tom lane