Thread: speeding up subqueries

speeding up subqueries

From
Phil Glatz
Date:
I'm having difficulties getting a subselect to perform well.  I've used
EXPLAIN to try to understand the problem, but can't see anything
wrong.  I've also created appropriate indexes, but am wondering if there is
something else involved in my particular situation.

Here is my query:

SELECT COUNT(*) FROM quiksearch q
WHERE q.resource_status_id=1
AND q.org_id IN (
SELECT org_id FROM org_resource_type WHERE resource=12
);

Both tables are simple and small (5000 rows in quiksearch, 12000 in
org_resource_type).

q.org_id is an integer

I've tried this with three values for the constant in the inner subquery

n  rows in subquery  execution time
-- ---------------   ---------------
12       301            3 sec
  3      1136          182 sec
16      1129            7 sec

The subqueries themselves all execute in less than one second.

I also tried running the second subquery, saving the values in a list
(1,2,3...), and hard coding that in instead of a subquery, execution time
dropped to three seconds.  i.e.

SELECT COUNT(*)
FROM quiksearch q
WHERE q.resource_status_id=1
AND q.org_id IN (
9,25,512,36,3,167,166,169,170,.........
);

I don't understand what is going on here, since the inner subquery runs
very fast, and the entire query also runs fast if I substitute the list of
returned values instead of a subquery.


Is there a way to make this query run faster?  Are there tricks to
optimizing subqueries?


Re: speeding up subqueries

From
Tom Lane
Date:
Phil Glatz <phil@glatz.com> writes:
> I'm having difficulties getting a subselect to perform well.  I've used
> EXPLAIN to try to understand the problem,

And?

> I don't understand what is going on here, since the inner subquery runs
> very fast, and the entire query also runs fast if I substitute the list of
> returned values instead of a subquery.

Yeah, but the inner query has to be done over for every row of the
outer.

You might try converting to a joinable subselect:

SELECT COUNT(*) FROM
  quiksearch q
join
  (select distinct org_id FROM org_resource_type WHERE resource=12) ss
using (org_id)
WHERE q.resource_status_id=1;

This would probably be a win if the DISTINCT processing is not too
terribly expensive, which'd depend on the number of rows selected from
org_resource_type ... but for a few thousand rows as you illustrated,
it shouldn't be bad.

Experimenting on this with dummy tables, I get a hash join plan, which
looks pretty reasonable.

            regards, tom lane

Re: speeding up subqueries

From
CoL
Date:
Hi,
try to use "exists" instead of "in".

CoL

Phil Glatz wrote:
> I'm having difficulties getting a subselect to perform well.  I've used
> EXPLAIN to try to understand the problem, but can't see anything
> wrong.  I've also created appropriate indexes, but am wondering if there is
> something else involved in my particular situation.
>
> Here is my query:
>
> SELECT COUNT(*) FROM quiksearch q
> WHERE q.resource_status_id=1
> AND q.org_id IN (
> SELECT org_id FROM org_resource_type WHERE resource=12
> );
>
> Both tables are simple and small (5000 rows in quiksearch, 12000 in
> org_resource_type).
>
> q.org_id is an integer
>
> I've tried this with three values for the constant in the inner subquery
>
> n  rows in subquery  execution time
> -- ---------------   ---------------
> 12       301            3 sec
>   3      1136          182 sec
> 16      1129            7 sec
>
> The subqueries themselves all execute in less than one second.
>
> I also tried running the second subquery, saving the values in a list
> (1,2,3...), and hard coding that in instead of a subquery, execution time
> dropped to three seconds.  i.e.
>
> SELECT COUNT(*)
> FROM quiksearch q
> WHERE q.resource_status_id=1
> AND q.org_id IN (
> 9,25,512,36,3,167,166,169,170,.........
> );
>
> I don't understand what is going on here, since the inner subquery runs
> very fast, and the entire query also runs fast if I substitute the list of
> returned values instead of a subquery.
>
>
> Is there a way to make this query run faster?  Are there tricks to
> optimizing subqueries?
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly