Thread: Big problems with query optimization

Big problems with query optimization

From
pgsql-bugs@postgresql.org
Date:
Dmitriy A. Anipko (anipko@tornado.nsk.ru) reports a bug with a severity of 2
The lower the number the more severe it is.

Short Description
Big problems with query optimization

Long Description
I use a 7.0 release running under SunOS on SPARC.
I found that there is no real way to perform queries to several tables if these tables contain quite a big amount of
data.
Example code shows it.
There are situations when even obvious optimizations are not performed.
Important: there is no such a bug in 6.5.* releases (running under Linux on i686).

Sample Code
Release 7.0
faqts=> explain select faqs.id from faqs;
NOTICE:  QUERY PLAN:
Seq Scan on faqs  (cost=0.00..2.40 rows=40 width=4)
EXPLAIN
faqts=> explain select faqs.id from faqs, faq_keywords;
NOTICE:  QUERY PLAN:
Nested Loop  (cost=0.00..4908.80 rows=161320 width=8)
  ->  Seq Scan on faqs  (cost=0.00..2.40 rows=40 width=4)
  ->  Seq Scan on faq_keywords  (cost=0.00..82.33 rows=4033 width=4)
EXPLAIN
faqts=>  explain select distinct on (faqs.id) faqs.id from faqs, faq_keywords;
NOTICE:  QUERY PLAN:
Unique  (cost=0.00..5319.14 rows=16132 width=8)
  ->  Nested Loop  (cost=0.00..4915.84 rows=161320 width=8)
        ->  Index Scan using faqs_pkey on faqs  (cost=0.00..9.44 rows=40 width=4)
        ->  Seq Scan on faq_keywords  (cost=0.00..82.33 rows=4033 width=4)
EXPLAIN
faqts=>

Compare with release 6.5 reaction:
faqtat=> explain select faqs.id from faqs, faq_keywords;
NOTICE:  QUERY PLAN:
Seq Scan on faqs  (cost=43.00 rows=1000 width=4)
EXPLAIN
faqtat=> explain select faqs.id from faqs;
NOTICE:  QUERY PLAN:
Seq Scan on faqs  (cost=43.00 rows=1000 width=4)
EXPLAIN
faqtat=> explain select distinct on id faqs.id from faqs, faq_keywords;
NOTICE:  QUERY PLAN:
Unique  (cost=43.00 rows=1000 width=4)
  ->  Sort  (cost=43.00 rows=1000 width=4)
        ->  Seq Scan on faqs  (cost=43.00 rows=1000 width=4)

EXPLAIN
faqtat=>



No file was uploaded with this report

Re: Big problems with query optimization

From
Tom Lane
Date:
pgsql-bugs@postgresql.org writes:
> Release 7.0
> faqts=> explain select faqs.id from faqs, faq_keywords;
> NOTICE:  QUERY PLAN:
> Nested Loop  (cost=0.00..4908.80 rows=161320 width=8)
>   ->  Seq Scan on faqs  (cost=0.00..2.40 rows=40 width=4)
>   ->  Seq Scan on faq_keywords  (cost=0.00..82.33 rows=4033 width=4)
> EXPLAIN

> Compare with release 6.5 reaction:
> faqtat=> explain select faqs.id from faqs, faq_keywords;
> NOTICE:  QUERY PLAN:
> Seq Scan on faqs  (cost=43.00 rows=1000 width=4)
> EXPLAIN

7.0 is correct.  6.5 is broken.  Read the SQL standard: "select a.f from
a" is not the same query as "select a.f from a,b".  The latter should
return each a.f value as many times as there are rows in b.

            regards, tom lane