Thread: outer joins strangeness

outer joins strangeness

From
Alex Pilosov
Date:
It may be just me, or I am grossly misunderstanding syntax of outer joins,
but I see that plans for my queries are different depending on how I place
join conditions and sometimes even on order of the tables.

Basically, if I mix ANSI-syntax outer joins (a left outer join b on
a.id=b.id) and "where-syntax" joins (from a,b where a.id=b.id) in the same
query, things get strange.

Example:
1:
explain select * from customers c,orders o left outer join adsl_orders ao
on ao.order_id=o.order_id
where c.cust_id=o.cust_id
and c.cust_id=152


Nested Loop  (cost=94.23..577.47 rows=2 width=290) ->  Index Scan using customers_pkey on customers c
(cost=0.00..2.02
rows=1 width=125) ->  Materialize  (cost=501.65..501.65 rows=5904 width=165)       ->  Hash Join  (cost=94.23..501.65
rows=5904width=165)             ->  Seq Scan on orders o  (cost=0.00..131.04 rows=5904
 
width=58)             ->  Hash  (cost=86.18..86.18 rows=3218 width=107)                   ->  Seq Scan on adsl_orders
ao (cost=0.00..86.18
 
rows=3218 width=107)

Query 2:

explain select * from customers c join orders o on c.cust_id=o.cust_id
left outer join adsl_orders ao on ao.order_id=o.order_id
where c.cust_id=152

Nested Loop  (cost=0.00..9.30 rows=2 width=290) ->  Nested Loop  (cost=0.00..5.06 rows=2 width=183)       ->  Index
Scanusing customers_pkey on customers c
 
(cost=0.00..2.02 rows=1 width=125)       ->  Index Scan using orders_idx1 on orders o  (cost=0.00..3.03
rows=1 width=58) ->  Index Scan using adsl_orders_pkey on adsl_orders ao
(cost=0.00..2.02 rows=1 width=107)

To me, both queries seem exactly identical in meaning, and should generate
the same plans. However, in my experience, if I use outer join anywhere in
the query, I must use "JOIN" syntax to join all other tables as well,
otherwise, my query plans are _extremely_ slow.

any hints? Or I am grossly misunderstanding outer join symantics?

-alex



Re: outer joins strangeness

From
Stephan Szabo
Date:
On Sun, 23 Sep 2001, Alex Pilosov wrote:

> It may be just me, or I am grossly misunderstanding syntax of outer joins,
> but I see that plans for my queries are different depending on how I place
> join conditions and sometimes even on order of the tables.
> 
> Example:
> 1:
> explain select * from customers c,orders o left outer join adsl_orders ao
> on ao.order_id=o.order_id
> where c.cust_id=o.cust_id
> and c.cust_id=152
> 
> 
> Nested Loop  (cost=94.23..577.47 rows=2 width=290)
>   ->  Index Scan using customers_pkey on customers c  (cost=0.00..2.02
> rows=1 width=125)
>   ->  Materialize  (cost=501.65..501.65 rows=5904 width=165)
>         ->  Hash Join  (cost=94.23..501.65 rows=5904 width=165)
>               ->  Seq Scan on orders o  (cost=0.00..131.04 rows=5904
> width=58)
>               ->  Hash  (cost=86.18..86.18 rows=3218 width=107)
>                     ->  Seq Scan on adsl_orders ao  (cost=0.00..86.18
> rows=3218 width=107)
> 
> Query 2:
> 
> explain select * from customers c join orders o on c.cust_id=o.cust_id
> left outer join adsl_orders ao on ao.order_id=o.order_id
> where c.cust_id=152
> 
> Nested Loop  (cost=0.00..9.30 rows=2 width=290)
>   ->  Nested Loop  (cost=0.00..5.06 rows=2 width=183)
>         ->  Index Scan using customers_pkey on customers c
> (cost=0.00..2.02 rows=1 width=125)
>         ->  Index Scan using orders_idx1 on orders o  (cost=0.00..3.03
> rows=1 width=58)
>   ->  Index Scan using adsl_orders_pkey on adsl_orders ao
> (cost=0.00..2.02 rows=1 width=107)
> 
> To me, both queries seem exactly identical in meaning, and should generate
> the same plans. However, in my experience, if I use outer join anywhere in
> the query, I must use "JOIN" syntax to join all other tables as well,
> otherwise, my query plans are _extremely_ slow.

Postgres treats join syntax as an explicit definition of what order to
joins in.  So, I'd guess it sees the first as: do the LOJ and then join
that to the separate table.  

And for right outer join (for example), those two queries would not
be equivalent if I read the ordering correctly.  The former syntax
would mean outer first and then the inner, whereas the second would
be inner first then the outer, and that could have different results.



Re: outer joins strangeness

From
Alex Pilosov
Date:
On Sun, 23 Sep 2001, Stephan Szabo wrote:

> On Sun, 23 Sep 2001, Alex Pilosov wrote:
> 
> > It may be just me, or I am grossly misunderstanding syntax of outer joins,
> > but I see that plans for my queries are different depending on how I place
> > join conditions and sometimes even on order of the tables.
> > 
> > Example:
> > 1:
> > explain select * from customers c,orders o left outer join adsl_orders ao
> > on ao.order_id=o.order_id
> > where c.cust_id=o.cust_id
> > and c.cust_id=152
<snip>
> > 
> > explain select * from customers c join orders o on c.cust_id=o.cust_id
> > left outer join adsl_orders ao on ao.order_id=o.order_id
> > where c.cust_id=152

> Postgres treats join syntax as an explicit definition of what order to
> joins in.  So, I'd guess it sees the first as: do the LOJ and then join
> that to the separate table.  
Yeah, I figure that's how it sees it, but that's pretty stupid from
performance reasons :P)

It _should_ realize that left outer join only constricts join order
between two tables in outer join, and joins to all other tables should
still be treated normally.

I'm going to CC this to -hackers, maybe someone will shed a light on the
internals of this. 

> And for right outer join (for example), those two queries would not
> be equivalent if I read the ordering correctly.  The former syntax
> would mean outer first and then the inner, whereas the second would
> be inner first then the outer, and that could have different results.
True. But this is not right outer join, its a left outer join...:)

Postgres should understand that left outer join does not constrict join
order...

-alex



Re: outer joins strangeness

From
Stephan Szabo
Date:
On Mon, 24 Sep 2001, Alex Pilosov wrote:

> On Sun, 23 Sep 2001, Stephan Szabo wrote:
> 
> > On Sun, 23 Sep 2001, Alex Pilosov wrote:
> > 
> > Postgres treats join syntax as an explicit definition of what order to
> > joins in.  So, I'd guess it sees the first as: do the LOJ and then join
> > that to the separate table.  
> Yeah, I figure that's how it sees it, but that's pretty stupid from
> performance reasons :P)
>
> It _should_ realize that left outer join only constricts join order
> between two tables in outer join, and joins to all other tables should
> still be treated normally.
(see below)
> 
> I'm going to CC this to -hackers, maybe someone will shed a light on the
> internals of this. 
> 
> > And for right outer join (for example), those two queries would not
> > be equivalent if I read the ordering correctly.  The former syntax
> > would mean outer first and then the inner, whereas the second would
> > be inner first then the outer, and that could have different results.
> True. But this is not right outer join, its a left outer join...:)
> 
> Postgres should understand that left outer join does not constrict join
> order...

But it can.  If your condition was a joining between the other table
and the right side of the left outer join, you'd have the same condition
as a right outer join and the left side.  The real condition I think
is that you can join a non-explicitly joined table to the <x> side of an
<x> outer join before the outer join but not to the other side.



Re: [HACKERS] outer joins strangeness

From
Tom Lane
Date:
Alex Pilosov <alex@pilosoft.com> writes:
> I'm going to CC this to -hackers, maybe someone will shed a light on the
> internals of this. 

It's not unintentional.  See
http://www.ca.postgresql.org/users-lounge/docs/7.1/postgres/explicit-joins.html
        regards, tom lane