Thread: 7.4.xx regression

7.4.xx regression

From
bs
Date:
I have the following tables and indexes

CREATE TABLE cddb
(
    id integer PRIMARY KEY,

    /* artist name */
    name text,

    /* disk title */
    title text,

    /* the type of the artist group, person, orchestra*/
    type smallint,

    /* the creation date */
    created integer
);

CREATE INDEX cddb1 on cddb(title);
CREATE INDEX cddb2 on cddb(name);

CREATE TABLE cddbentry
(
    id integer PRIMARY KEY,

    diskid text,

    entry integer REFERENCES cddb
);

CREATE INDEX cddbentry1 on cddbentry (diskid);
CREATE INDEX cddbentry2 on cddbentry (entry);


I am using the following query:
select cddb.* from cddb,cddbentry where cddbentry.diskid = 'toto' and
cddbentry.entry = cddb.id;


On version 7.3.4 it produces this query plan

Nested Loop  (cost=0.00..41.25 rows=5 width=78)
   ->  Index Scan using cddbentry1 on cddbentry  (cost=0.00..17.07
rows=5 width=4)
         Index Cond: (diskid = 'toto'::text)
   ->  Index Scan using cddb_pkey on cddb  (cost=0.00..4.82 rows=1 width=74)
         Index Cond: ("outer".entry = cddb.id)


I have upgraded to version 7.4.0 (compiling the software and migrating
the database using dump/restore)

Now the following plan is produced

Hash Join  (cost=17.08..42.15 rows=7 width=74)
   Hash Cond: ("outer".id = "inner".entry)
   ->  Seq Scan on cddb  (cost=0.00..20.00 rows=1000 width=74)
   ->  Hash  (cost=17.07..17.07 rows=6 width=4)
         ->  Index Scan using cddbentry1 on cddbentry  (cost=0.00..17.07
rows=6 width=4)
               Index Cond: (diskid = 'toto'::text)

which result in a VERY much slower query as the cddb table has more than
1 million entry....  and there is at most one entry in cddbentry
which matches the diskid !

The workaround is to disable the hash join capability using set
enable_hashjoin to false;  resulting in the same query plan as in 7.3.x.

I think this can classified as a regression bug.


N.B: The cost value may be wrong, because to be able to send you this
email, they have been produced on empty tables; but
I can guarantee you they are the same on the full loaded database.


Hope this help to improve this great product

Bernard SNYERS

Re: 7.4.xx regression

From
Tom Lane
Date:
bs <bs@ionicsoft.com> writes:
> I have upgraded to version 7.4.0 (compiling the software and migrating
> the database using dump/restore)

> Now the following plan is produced

> Hash Join  (cost=17.08..42.15 rows=7 width=74)
>    Hash Cond: ("outer".id = "inner".entry)
>    ->  Seq Scan on cddb  (cost=0.00..20.00 rows=1000 width=74)
>    ->  Hash  (cost=17.07..17.07 rows=6 width=4)
>          ->  Index Scan using cddbentry1 on cddbentry  (cost=0.00..17.07
> rows=6 width=4)
>                Index Cond: (diskid = 'toto'::text)

> which result in a VERY much slower query as the cddb table has more than
> 1 million entry....  and there is at most one entry in cddbentry
> which matches the diskid !

> I think this can classified as a regression bug.

No, it can be classified as a "user forgot to vacuum or analyze"
mistake.  The estimates shown in the query plan look like the defaults
for an empty table.

            regards, tom lane