On Fri, 2003-02-14 at 14:08, Tom Lane wrote:
> Brad Hilton <bhilton@vpop.net> writes:
> > ... If I modify the query slightly:
>
> > --------
> > select 1 from article_categories
> > -->
> > select 1 from articles, article_categories
> > ---------
>
> > the query takes 98 msec.
>
> Yeah, because then the sub-query is a constant (it doesn't depend on the
> current outer row at all) and so it is only evaluated once, not once per
> outer row. Unfortunately, that approach probably gives the wrong
> answers...
Ah, that makes sense. But does it surprise you that when I manipulate
the dataset such that the inner query matches 0 records, the total query
takes so much longer?
Unfortunately, after following the suggestions of several kind posters,
the resulting queries are pretty slow compared to my example which used
'exists.'
The fact that the query takes so long in certain dataset conditions is
surprising me. Watch the following results:
psql> update categories set restrict_views = FALSE;
explain analyze select * from articles where exists (select 1 from article_categories, categories where
article_categories.article_id= articles.id and categories.restrict_views = FALSE and article_categories.category_id =
categories.id) and post_status = 'publish' order by publish_time desc limit 10;Total runtime: 0.69 msec
psql> update categories set restrict_views = TRUE;
explain analyze select * from articles where exists (select 1 from article_categories, categories where
article_categories.article_id= articles.id and categories.restrict_views = FALSE and article_categories.category_id =
categories.id) and post_status = 'publish' order by publish_time desc limit 10;
Total runtime: 27490.84 msec
Is that a surprising result? I would think that the second time things
would be faster because there are no matches to the inner query. In
fact, if I execute the inner query by itself, minus the reference to the
articles table, it executes lightning fast. (0.07 msec)
-Brad