Thread: When can joins be avoided?

When can joins be avoided?

From
Stefan Fehrenbach
Date:
Hey all!

I'm interested in the rules that Postgres uses to avoid joins. Are
these documented somewhere? If I had to look at the source code, where
would I start? They don't seem to match my intuition about which joins
could be avoided. Then again, it's quite possible that I'm wrong and
misunderstanding the semantics.

Here are some particular examples I'm interested in.
Set up some tables. Intuition: table a contains some info, table b
contains some (optional) extra info, table c contains some more
(optional) extra info.

create table c (
    c_id int primary key,
    c_foo text );

create table b (
    b_id int primary key,
    c_id int references c(c_id),
    b_bar text );

create table a (
    a_id int primary key,
    b_id int references b(b_id),
    a_baz text );


-- Now some queries (join on b and c in various ways,
-- but only ever select columns from a)

-- This joins on b, as expected.
-- (Because missing rows could reduce cardinality?
--  but making a.b_id NOT NULL doesn't help...)
  explain
   select a_baz
     from a
     join b using (b_id);

-- Making it a LEFT join avoids the join on b, as expected
  explain
   select a_baz
     from a
LEFT join b using (b_id);

-- Joins on b and c. This is very strange to me.
-- Whether or not the join on c results in any rows
-- shouldn't make any difference.
  explain
   select a_baz
     from a
left join (select *
             from b
             join c using (c_id)) bc using (b_id);

-- making the join in the subquery a LEFT join
-- avoids joining entirely (no b or c in the plan)
  explain
   select a_baz
     from a
left join (select *
             from b
        LEFT join c using (c_id)) bc using (b_id);


If anybody knows why Postgres behaves this way, please let me know :)

Cheers,
 Stefan



Re: When can joins be avoided?

From
Tom Lane
Date:
Stefan Fehrenbach <stefan.fehrenbach@gmail.com> writes:
> I'm interested in the rules that Postgres uses to avoid joins. Are
> these documented somewhere? If I had to look at the source code, where
> would I start?

src/backend/optimizer/plan/analyzejoins.c

> They don't seem to match my intuition about which joins
> could be avoided.

I believe only left joins to single tables can be elided ATM.
It's too hard to prove uniqueness of the join key in more-
complicated cases.

            regards, tom lane