Thread: query with subquery abnormally slow?

query with subquery abnormally slow?

From

Oskar Liljeblad

Date:

01 November 1999, 13:46:55

I'm doing a SELECT query with a subquery on a table with 12K rows
but it is very slow (10 seconds+). The query looks like this:
 select *   from items   where package in     (select package        from items        where ...blah...        group by
package)

Where package is an indexed varchar field.

The subquery generates 50 rows, and completes quickly when I run
it alone. Now if I manually replace the subquery with these 50
strings - like
 select *   from items   where package in     ('r1', 'r2', 'r3', .., 'r50')

this new query completes in less than a second. The result
of the complete query is 500 rows.

Is this natural? Is there anything I can do to speed things up?
(Besides doing N+1 queries where N is the number of rows the
subquery would produce.)

Oskar Liljeblad (osk@hem.passagen.se)

Re: [SQL] query with subquery abnormally slow?

From

Oskar Liljeblad

Date:

01 November 1999, 14:04:55

On Mon, Nov 01, 1999 at 07:47:07PM +0100, Oskar Liljeblad wrote:
> I'm doing a SELECT query with a subquery on a table with 12K rows
> but it is very slow (10 seconds+). The query looks like this:

Ok, I should have read the latest FAQ and checked the mail
archive before posting this.

I've read about the EXISTS-workaround, but it seems it can't be
applied to my query because it deals with one table only.
Or can it?

Oskar Liljeblad (osk@hem.passagen.se)

Re: [SQL] query with subquery abnormally slow?

From

"Moray McConnachie"

Date:

01 November 1999, 15:01:56

>   select *
>     from items
>     where package in
>       (select package
>          from items
>          where ...blah...
>          group by package)

Can't see why you don't rewrite this as one query:

select * from items where ... blah ... order by package;
(is it aggregates in the where clause?)

Assuming you do need to do it the way you have done it ,

SELECT * FROM items WHERE NOT EXISTS
(SELECT package FROM items itemscopy WHERE ... blah ... AND
itemscopy.itemid=items.itemid GROUP BY package);

should do it. itemid should be replaced by whatever the primary key of the
items table is. Note that in blah, fields must be referred to as
itemcopy.field1,itemcopy.field2, etc.

Yours,
Moray McConnachie

Re: [SQL] query with subquery abnormally slow?

From

Oskar Liljeblad

Date:

01 November 1999, 15:09:56

On Mon, Nov 01, 1999 at 07:47:07PM +0100, Oskar Liljeblad wrote:
> I'm doing a SELECT query with a subquery on a table with 12K rows
> but it is very slow (10 seconds+). The query looks like this:
> 
>   select *
>     from items
>     where package in
>       (select package
>          from items
>          where ...blah...
>          group by package)

I could do it much faster with this query:
 select distinct i2.package, i1.* from items i1, items i2 where i1.package = i2.package and ...blah...

but it is still at least three times slower than
doing the subquery first, and for each row in the
result another query. Any ideas?

Oskar Liljeblad (osk@hem.passagen.se)

Re: [SQL] query with subquery abnormally slow?

From

Bill Brandt

Date:

01 November 1999, 15:35:56

have you tried to:  
explain <your select statement> 
to see the cost of the parts?

I would try this.  Then I would try a :
create index <indexname> on items (package)
and repeat the explain to see if it changes the cost.

Bill

On Mon, Nov 01, 1999 at 09:10:30PM +0100, Oskar Liljeblad wrote:
>On Mon, Nov 01, 1999 at 07:47:07PM +0100, Oskar Liljeblad wrote:
>> I'm doing a SELECT query with a subquery on a table with 12K rows
>> but it is very slow (10 seconds+). The query looks like this:
>> 
>>   select *
>>     from items
>>     where package in
>>       (select package
>>          from items
>>          where ...blah...
>>          group by package)
>
>I could do it much faster with this query:
>
>  select distinct i2.package, i1.*
>  from items i1, items i2
>  where i1.package = i2.package
>  and ...blah...
>
>but it is still at least three times slower than
>doing the subquery first, and for each row in the
>result another query. Any ideas?
>
>Oskar Liljeblad (osk@hem.passagen.se)
>
>************
>

Bill Brandt                                   
brandtwr@draaw.net                              http://www.draaw.net/

Re: [SQL] query with subquery abnormally slow?

From

Oskar Liljeblad

Date:

01 November 1999, 15:36:57

On Mon, Nov 01, 1999 at 07:59:20PM -0000, Moray McConnachie wrote:
> >   select *
> >     from items
> >     where package in
> >       (select package
> >          from items
> >          where ...blah...
> >          group by package)
> 
> Can't see why you don't rewrite this as one query:
> 
> select * from items where ... blah ... order by package;
> (is it aggregates in the where clause?)

The items table consists of some columns `performer', `title',
and `package'. Each row corresponds to a song (or similar),
and a `package' is the album (or similar) the song is on.
(This design could certainly be improved with multiple tables,
but it wasn't created that way.)

My query is supposed to list all songs on all albums (packages)
which has at least one song by a specific artist (performer).
("Can you show me some stats on the albums which artist X
has participated in?".)

> Assuming you do need to do it the way you have done it ,
...
> should do it. itemid should be replaced by whatever the primary key of the
> items table is. Note that in blah, fields must be referred to as
> itemcopy.field1,itemcopy.field2, etc.

This query works:
select *from items i1where exists  (select package   from items i2   where i2.performer ~ '...'     and i1.package =
i2.package  group by i2.package)

However this was also very slow (>30 seconds). When I replaced
"and i1.package = i2.package" with "and i1.id = i2.id" only some
songs from the matching albums were returned. (The songs which
matched performer ~ '...'.)

Oskar Liljeblad (osk@hem.passagen.se)

Re: [SQL] query with subquery abnormally slow?

From

Tom Lane

Date:

01 November 1999, 17:11:58

Oskar Liljeblad <osk@hem.passagen.se> writes:
> I'm doing a SELECT query with a subquery on a table with 12K rows
> but it is very slow (10 seconds+). The query looks like this:

>   select *
>     from items
>     where package in
>       (select package
>          from items
>          where ...blah...
>          group by package)

Have you considered something like

select i1.* from items i1, items i2 where
i1.package = i2.package and i2.performer ~ '...';

This would only be fast given an index on package, I think,
but you said you had one...
        regards, tom lane

Re: [SQL] query with subquery abnormally slow?

From

Zalman Stern

Date:

02 November 1999, 03:43:08

I inadvertently deleted Oskar's message where he described what he is
trying to do at a higher level, but I don't think I'm completely missing
the point with the following:

The table looks like so:

select * from test;
[
ssmldb=> select * from test;
package               |artist             |song
----------------------+-------------------+-----------------
Surf Comp             |Mermen             |Pulpin
Surf Comp             |Bambi Molesters    |Tremor
Surf Comp             |The Squares        |Squaranoid
Dumb Loud Hollow Twang|Bambi Molesters    |Point Break
Dumb Loud Hollow Twang|Bambi Molesters    |Glider
Songs of the Cows     |Mermen             |Songs of the Cows
Surfmania             |The Aqua Velvets   |Surfmania
Surf Comp 2           |Mermen             |Slo Mo H50
Surf Comp 2           |Los Straightjackets|Caveman
]

select t2.* from test t1, test t2   where t1.package = t2.package and  t1.artist = 'Mermen' and t2.artist != 'Mermen';
[
package    |artist             |song
-----------+-------------------+----------
Surf Comp  |Bambi Molesters    |Tremor
Surf Comp  |The Squares        |Squaranoid
Surf Comp 2|Los Straightjackets|Caveman
(3 rows)
]

The query above shows all songs by a different artist that share an album
with a song by the artist in question. It however omits the songs by the
artist in question. To get those back, you could try:

select distinct t1.* from test t1, test t2   where t1.package = t2.package and  ((t1.artist = 'Mermen' and t2.artist !=
'Mermen')or   (t1.artist != 'Mermen' and t2.artist = 'Mermen'));
 

or use a union clause:
select t1.* from test t1, test t2   where t1.package = t2.package and  (t2.artist = 'Mermen' and t1.artist != 'Mermen')
 union select t3.* from test t3, test t4      where t3.package = t4.package and        (t3.artist = 'Mermen' and
t4.artist!= 'Mermen') ;
 

I don't know how these do for speed as I don't care to create a big table
and indices and all that, but they do not use EXITS and it seems with
suitable indices they should be fairly fast.

-Z-