very poorly optimised query - Mailing list pgsql-bugs

From pgsql-bugs@postgresql.org
Subject very poorly optimised query
Date
Msg-id 200009281908.e8SJ8ib81851@hub.org
Whole thread Raw
List pgsql-bugs
Orion Henry (orion@trustcommerce.com) reports a bug with a severity of 3
The lower the number the more severe it is.

Short Description
very poorly optimised query

Long Description
This is for Postgres 7.0.1 running on x86 linux

While usually impressed with postgres's speed I ran into
a query that ran very very slowly.  The simplified version of
the query is

select count(*) from foo where groupid in (select distinct groupid from foo);

It runs at exactly O(n*n).  Theres were the times I experienced
running it on linux with an Athlon 700.

 1000 rows  0.6 seconds
 5000 rows 14.8 seconds
10000 rows 59.6 seconds

I wrote an equivlent query that uses a temporary table
and all querys run well under a second. See example code.
I also noted that indexing the groupid makes no difference in time.

Sample Code

drop table foo;
drop sequence fooseq;
create table foo ( groupid int4 );
create sequence fooseq;

insert into foo values (nextval('fooseq'::text));
-- repeate this insert 10,00 times

-- this query is O(n*n);
select count(*) from foo where groupid in (select distinct groupid from foo);

-- this query produces the same results at O(n) or O(n*log(n))
-- cant tell too fast
select distinct groupid into temp table tmp from foo;
select count(*) from foo a, tmp b where a.groupid = b.groupid;



No file was uploaded with this report

pgsql-bugs by date:

Previous
From: pgsql-bugs@postgresql.org
Date:
Subject: Inexplicably running wild and freezing with 7.0
Next
From: pgsql-bugs@postgresql.org
Date:
Subject: Subselects lack functionality