Thread: Optimizer Bug?

Optimizer Bug?

From
"Mitterwald, Holger"
Date:
Hello, I do have a quite suspicious behavior of the Qery-Optimizer (I
think):

If I execute the following statement:
query: explain select k.name, t.lang from kneipe k, typ t where
k.typ=t.typ;
ProcessUtility: explain select k.name, t.lang from kneipe k, typ t where
k.typ=t.typ;
NOTICE:  QUERY PLAN:

Hash Join  (cost=1.20..31.05 rows=413 width=32)
  ->  Seq Scan on kneipe k  (cost=0.00..13.13 rows=413 width=16)
  ->  Hash  (cost=1.16..1.16 rows=16 width=16)
        ->  Seq Scan on typ t  (cost=0.00..1.16 rows=16 width=16)

----------------

Seems ok to me, but if I insert a (huge) table which is nowhere
referenced (here:"preis"), the
following happens:

query: explain select k.name, t.lang from kneipe k, typ t, preis p where
k.typ=t.typ;
ProcessUtility: explain select k.name, t.lang from kneipe k, typ t,
preis p where k.typ=t.typ;
NOTICE:  QUERY PLAN:

Merge Join  (cost=2948.99..3280.15 rows=673190 width=36)
  ->  Sort  (cost=2917.91..2917.91 rows=26080 width=20)
        ->  Nested Loop  (cost=0.00..746.76 rows=26080 width=20)
              ->  Seq Scan on typ t  (cost=0.00..1.16 rows=16 width=16)
              ->  Seq Scan on preis p  (cost=0.00..30.30 rows=1630
width=4)
  ->  Sort  (cost=31.07..31.07 rows=413 width=16)
        ->  Seq Scan on kneipe k  (cost=0.00..13.13 rows=413 width=16)

The query takes up to 100 times longer although the table "preis" is not
used at all.
I guess the Optimizer gets here something REALLY wrong....

Any tips (besides removing "preis" from the select statement...)?

Re: Optimizer Bug?

From
"Mitterwald, Holger"
Date:
Peter Eisentraut wrote:
[..]
>
> Sure it's used. Just because it's not mentioned in the select list doesn't
> mean it will be ignored completely. What you get here is a three-way join,
> which is of course slower than a two way join.

What I do not understand at all: The table "preis" is neither used in
the select
nor in the where part of the statement. The result will not change if
the table will
not be here nor will it change if there are other datasets in it. Why
should a DBMS
do a full table scan, do a third join on a table that is not used at
all?

Greetings,
   Holger
P.S.: I tested it on OpenIngres 2.0, and the effects are the same. Funny
thing...

Re: Optimizer Bug?

From
Tom Lane
Date:
"Mitterwald, Holger" <mittehlg@coi.de> writes:
> Seems ok to me, but if I insert a (huge) table which is nowhere
> referenced (here:"preis"), the
> following happens:

> query: explain select k.name, t.lang from kneipe k, typ t, preis p where
> k.typ=t.typ;
> NOTICE:  QUERY PLAN:

> Merge Join  (cost=2948.99..3280.15 rows=673190 width=36)
-> Sort  (cost=2917.91..2917.91 rows=26080 width=20)
-> Nested Loop  (cost=0.00..746.76 rows=26080 width=20)
-> Seq Scan on typ t  (cost=0.00..1.16 rows=16 width=16)
-> Seq Scan on preis p  (cost=0.00..30.30 rows=1630
> width=4)
-> Sort  (cost=31.07..31.07 rows=413 width=16)
-> Seq Scan on kneipe k  (cost=0.00..13.13 rows=413 width=16)

> The query takes up to 100 times longer although the table "preis" is not
> used at all.
> I guess the Optimizer gets here something REALLY wrong....

The optimizer is doing exactly what it is supposed to do: you told
it to join over three tables, and three tables is what you will get.

6.5 in fact failed to join preis if no reference to it was visible in
the query, but *that* behavior was buggy, not this one.  It's perfectly
clear from the SQL standard that the result of the query must be the
three-way cross-product minus rows that don't meet the WHERE condition.
Without joining preis, we don't produce the expected number of rows.

            regards, tom lane

Re: Optimizer Bug?

From
Peter Eisentraut
Date:
On Tue, 30 May 2000, Mitterwald, Holger wrote:

> If I execute the following statement:
> query: explain select k.name, t.lang from kneipe k, typ t where
> k.typ=t.typ;
> ProcessUtility: explain select k.name, t.lang from kneipe k, typ t where
> k.typ=t.typ;

> Seems ok to me, but if I insert a (huge) table which is nowhere
> referenced (here:"preis"), the
> following happens:
>
> query: explain select k.name, t.lang from kneipe k, typ t, preis p where
> k.typ=t.typ;

> The query takes up to 100 times longer although the table "preis" is not
> used at all.

Sure it's used. Just because it's not mentioned in the select list doesn't
mean it will be ignored completely. What you get here is a three-way join,
which is of course slower than a two way join.


--
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden