Thread: 'fake' join and performance ?

'fake' join and performance ?

From
"Peter Galbavy"
Date:
OK, I am now confused; postgresql 7.3beta2 on OpenBSD:


photos=# select * from metadata WHERE name = 'Make' and value = 'Canon'
limit 10;

*bang*, 10 values, sub second response.

photos=# select * from metadata m, images i WHERE m.name = 'Make' and
m.value = 'Canon' limit 10;

*yawn* - see you later...

Now, 'images' is a new and currently empty table that I intend to do a join
on later, but I started building a query to test my join'ing skills and
found this;

Explain'ing for both:

photos=# explain select * from metadata WHERE name = 'Make' and value =
'Canon' limit 10;                                        QUERY PLAN
----------------------------------------------------------------------------
----------------Limit  (cost=0.00..27711.98 rows=6 width=92)  ->  Index Scan using metadata_index_2 on metadata
(cost=0.00..31072.94
rows=7 width=92)        Index Cond: (name = 'Make'::text)        Filter: (value = 'Canon'::text)
(4 rows)

photos=# explain select * from metadata m, images i WHERE m.name = 'Make'
and m.value = 'Canon' limit 10;                                            QUERY PLAN
----------------------------------------------------------------------------
------------------------Limit  (cost=0.00..27712.04 rows=6 width=816)  ->  Nested Loop  (cost=0.00..31073.00 rows=7
width=816)       ->  Index Scan using metadata_index_2 on metadata m
 
(cost=0.00..31072.94 rows=7 width=92)              Index Cond: (name = 'Make'::text)              Filter: (value =
'Canon'::text)       ->  Seq Scan on images i  (cost=0.00..0.00 rows=1 width=724)
 
(6 rows)



Er, what's that nested loop. I *know* I have shot myself in the foot
somehow, but my initial reaction was that the optimiser should just make the
'fake' (i.e. unreferenced) reference to another table go away...

peter



Re: 'fake' join and performance ?

From
Stephan Szabo
Date:
On Tue, 22 Oct 2002, Peter Galbavy wrote:

> OK, I am now confused; postgresql 7.3beta2 on OpenBSD:
>
>
> photos=# select * from metadata WHERE name = 'Make' and value = 'Canon'
> limit 10;
>
> *bang*, 10 values, sub second response.
>
> photos=# select * from metadata m, images i WHERE m.name = 'Make' and
> m.value = 'Canon' limit 10;
>
> *yawn* - see you later...
>
> ----------------------------------------------------------------------------
> ------------------------
>  Limit  (cost=0.00..27712.04 rows=6 width=816)
>    ->  Nested Loop  (cost=0.00..31073.00 rows=7 width=816)
>          ->  Index Scan using metadata_index_2 on metadata m
> (cost=0.00..31072.94 rows=7 width=92)
>                Index Cond: (name = 'Make'::text)
>                Filter: (value = 'Canon'::text)
>          ->  Seq Scan on images i  (cost=0.00..0.00 rows=1 width=724)
> (6 rows)
>
>
>
> Er, what's that nested loop. I *know* I have shot myself in the foot
> somehow, but my initial reaction was that the optimiser should just make the
> 'fake' (i.e. unreferenced) reference to another table go away...

It can't do that.  The second query would give multiple copies of each row
in metadata for each row in images.  I'm surprised that it'd be so slow if
images is completely empty though.  What does explain analyze show for the
real times.




Re: 'fake' join and performance ?

From
Tom Lane
Date:
"Peter Galbavy" <peter.galbavy@knowtion.net> writes:
> photos=# select * from metadata m, images i WHERE m.name = 'Make' and
> m.value = 'Canon' limit 10;

> Er, what's that nested loop. I *know* I have shot myself in the foot
> somehow,

Yeah, you didn't restrict the reference to images at all.

> but my initial reaction was that the optimiser should just make the
> 'fake' (i.e. unreferenced) reference to another table go away...

That would be in violation of the SQL spec.  The query is defined to
return each join row from the cross product of the FROM tables that
meets the condition of the WHERE clause.  As you wrote the query, each
metadata row that meets the WHERE clause will be returned exactly as
many times as there are rows in the images table.  There is no such
thing as an "unreferenced" FROM entry as far as SQL is concerned.
        regards, tom lane


Re: 'fake' join and performance ?

From
"Peter Galbavy"
Date:
> That would be in violation of the SQL spec.  The query is defined to
> return each join row from the cross product of the FROM tables that
> meets the condition of the WHERE clause.  As you wrote the query, each
> metadata row that meets the WHERE clause will be returned exactly as
> many times as there are rows in the images table.  There is no such
> thing as an "unreferenced" FROM entry as far as SQL is concerned.

Sounds about right. Thanks for the clarification.

Peter