Thread: Slow query

Slow query

From
Oleg Lebedev
Date:
Please help me speed up the following query. It used to run in 2-5 sec., but now it takes 2-3 mins!
I ran VACUUM FULL ANALYZE and REINDEX.
SELECT * FROM media m
WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE medianame='Audio')
AND EXISTS
        (SELECT * FROM
                (SELECT objectid AS mediaid
                FROM media
                WHERE activity='347667'
                UNION
                SELECT ism.media AS mediaid
                FROM intsetmedia ism, set s
                WHERE ism.set = s.objectid
                AND s.activity='347667' ) AS a1
        WHERE a1.mediaid = m.objectid
        LIMIT 1)
ORDER BY medianame ASC, status DESC
 
Basically it tries to find all Audios that are either explicitly attached to the given activity, or attached to the given activity via a many-to-many relationship intsetmedia which links records in table Interaction, Set, and Media.
I attached the output of EXPLAIN and schemas and indexes on the tables involved. Most of the fields are not relevant to the query, but I listed them anyways. I discarded trigger information, though.
Thanks for your help.
 
Oleg

*************************************

This email may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Any review, copying, printing, disclosure or other use is prohibited.
We reserve the right to monitor email sent through our network.

*************************************

Attachment

Re: Slow query

From
Josh Berkus
Date:
Oleg,

> Please help me speed up the following query. It used to run in 2-5 sec.,
> but now it takes 2-3 mins!
> I ran VACUUM FULL ANALYZE and REINDEX.
> SELECT * FROM media m
> WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE

This is a repost, isn't it?

--
Josh Berkus
Aglio Database Solutions
San Francisco


Re: Slow query

From
Oleg Lebedev
Date:
No, I don't believe so.
My previous question regarding performance was solved by VACUUM FULL and
REINDEX.
The current one, I believe, is more related to query structure and
planner stats.

-----Original Message-----
From: Josh Berkus [mailto:josh@agliodbs.com]
Sent: Monday, March 24, 2003 11:55 AM
To: Oleg Lebedev; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Slow query


Oleg,

> Please help me speed up the following query. It used to run in 2-5
> sec., but now it takes 2-3 mins! I ran VACUUM FULL ANALYZE and
> REINDEX. SELECT * FROM media m
> WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE

This is a repost, isn't it?

--
Josh Berkus
Aglio Database Solutions
San Francisco


*************************************

This email may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Any review, copying, printing, disclosure or other use is prohibited.
We reserve the right to monitor email sent through our network.

*************************************


Re: Slow query

From
Stephan Szabo
Date:
On Mon, 24 Mar 2003, Oleg Lebedev wrote:

> Please help me speed up the following query. It used to run in 2-5 sec.,
> but now it takes 2-3 mins!

EXPLAIN ANALYZE output would be useful to see where the time is actually
taking place (rather than an estimate thereof).


Re: Slow query

From
Oleg Lebedev
Date:
EXPLAIN ANALYZE plan is shown below.
I also attached it as a file.

One thing that might help is that the query produces 27 rows, which is
much less than predicted 1963.

QUERY PLAN
 Sort (cost=553657.66..553662.57 rows=1963 width=218) (actual
time=133036.73..133036.75 rows=27 loops=1)
  Sort Key: medianame, status
  InitPlan
  -> Seq Scan on mediatype (cost=0.00..1.29 rows=1 width=8) (actual
time=0.12..0.14 rows=1 loops=1)
  Filter: (medianame = 'Audio'::character varying)
  -> Index Scan using media_mtype_index on media m (cost=0.00..553550.28
rows=1963 width=218) (actual  time=5153.36..133036.00 rows=27 loops=1)
  Index Cond: (mediatype = $0)
  Filter: (subplan)
  SubPlan
  -> Limit (cost=138.92..138.93 rows=1 width=24) (actual time=2.92..2.92
rows=0 loops=44876)
  -> Subquery Scan a1 (cost=138.92..138.93 rows=1 width=24) (actual
time=2.92..2.92 rows=0 loops=44876)
  -> Unique (cost=138.92..138.93 rows=1 width=24) (actual
time=2.91..2.91 rows=0 loops=44876)
  -> Sort (cost=138.92..138.93 rows=2 width=24) (actual time=2.91..2.91
rows=0 loops=44876)
  Sort Key: mediaid
  -> Append (cost=0.00..138.91 rows=2 width=24) (actual time=2.80..2.81
rows=0 loops=44876)
  -> Subquery Scan "*SELECT* 1" (cost=0.00..5.11 rows=1 width=8) (actual
time=0.06..0.06 rows=0 loops=44876)
  -> Index Scan using media_pkey on media (cost=0.00..5.11 rows=1
width=8) (actual time=0.05..0.05 rows=0  loops=44876)
  Index Cond: (objectid = $1)
  Filter: (activity = 347667::bigint)
  -> Subquery Scan "*SELECT* 2" (cost=24.25..133.80 rows=1 width=24)
(actual time=2.73..2.73 rows=0  loops=44876)
  -> Hash Join (cost=24.25..133.80 rows=1 width=24) (actual
time=2.72..2.72 rows=0 loops=44876)
  Hash Cond: ("outer"."set" = "inner".objectid)
  -> Index Scan using intsetmedia_media_index on intsetmedia ism
(cost=0.00..109.26 rows=38 width=16)  (actual time=0.04..0.04 rows=1
loops=44876)
  Index Cond: (media = $1)
  -> Hash (cost=24.24..24.24 rows=6 width=8) (actual time=0.14..0.14
rows=0 loops=44876)
  -> Index Scan using set_act_index on "set" s (cost=0.00..24.24 rows=6
width=8) (actual time=0.11..0.13  rows=2 loops=44876)
  Index Cond: (activity = 347667::bigint)
 Total runtime: 133037.49 msec


-----Original Message-----
From: Stephan Szabo [mailto:sszabo@megazone23.bigpanda.com]
Sent: Monday, March 24, 2003 12:04 PM
To: Oleg Lebedev
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Slow query



On Mon, 24 Mar 2003, Oleg Lebedev wrote:

> Please help me speed up the following query. It used to run in 2-5
> sec., but now it takes 2-3 mins!

EXPLAIN ANALYZE output would be useful to see where the time is actually
taking place (rather than an estimate thereof).



*************************************

This email may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Any review, copying, printing, disclosure or other use is prohibited.
We reserve the right to monitor email sent through our network.

*************************************

Attachment

Re: Slow query

From
Josh Berkus
Date:
Stephan,

Hmmm ... I'm a bit confused by the new EXPLAIN output.   Stefan, does Oleg's
output show the time for *one* subplan execution, executed for 44,000 loops,
or does it show the total time?  The former would make more sense given his
query, but I'm just not sure ....


--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: Slow query

From
Oleg Lebedev
Date:
I decided that it might help to list the cardinalities of the pertinent
tables:
Intsetmedia: 90,000 rows
Interaction: 26,000 rows
Set: 7,000 rows
Media: 80,000 rows

-----Original Message-----
From: Josh Berkus [mailto:josh@agliodbs.com]
Sent: Monday, March 24, 2003 12:47 PM
To: Stephan Szabo; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Slow query
Importance: Low


Stephan,

Hmmm ... I'm a bit confused by the new EXPLAIN output.   Stefan, does
Oleg's
output show the time for *one* subplan execution, executed for 44,000
loops,
or does it show the total time?  The former would make more sense given
his
query, but I'm just not sure ....


--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster


*************************************

This email may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Any review, copying, printing, disclosure or other use is prohibited.
We reserve the right to monitor email sent through our network.

*************************************


Re: Slow query

From
Tom Lane
Date:
Oleg Lebedev <oleg.lebedev@waterford.org> writes:
> SELECT * FROM media m
> WHERE m.mediatype =3D (SELECT objectid FROM mediatype WHERE
> medianame=3D'Audio')=20
> AND EXISTS=20
>         (SELECT * FROM=20
>                 (SELECT objectid AS mediaid=20
>                 FROM media=20
>                 WHERE activity=3D'347667'=20
>                 UNION=20
>                 SELECT ism.media AS mediaid=20
>                 FROM intsetmedia ism, set s=20
>                 WHERE ism.set =3D s.objectid=20
>                 AND s.activity=3D'347667' ) AS a1=20
>         WHERE a1.mediaid =3D m.objectid=20
>         LIMIT 1)=20
> ORDER BY medianame ASC, status DESC=20

Well, one observation is that the LIMIT clause is useless and probably
counterproductive; EXISTS takes only one row from the subselect anyway.
Another is that the UNION is doing it the hard way; UNION implies doing
a duplicate-elimination step, which you don't need here.  UNION ALL
would be a little quicker.  But what I would do is split it into two
EXISTS:

SELECT * FROM media m
WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE
medianame='Audio')
AND (   EXISTS(SELECT 1
               FROM media
               WHERE activity='347667'
               AND objectid = m.objectid)
     OR EXISTS(SELECT 1
               FROM intsetmedia ism, set s
               WHERE ism.set = s.objectid
               AND s.activity='347667'
               AND ism.media = m.objectid))
ORDER BY medianame ASC, status DESC

            regards, tom lane


Re: Slow query

From
Oleg Lebedev
Date:
I just ran the query you sent me and attached the output of EXPLAIN
ANALYZE as TOMs_plan.txt
It did not speed up the query significantly.

It always seemed to me that UNION is faster than OR, so I tried your
suggestion to use UNION ALL with the original query without
counter-productive LIMIT 1 in EXISTS clause. This reduced the cost of
the plan by 50%, but slowed down the query. Weird ... The plan is shown
in UNION_ALL_plan.txt

AFAIK, the only change I've done since the time when the query took 3
sec. to run was adding more indexes and increasing the size of data by
about 25%. It sounds kind of stupid, but I remember that adding indexes
sometimes slowed down my queries. I will try to drop all the indexes and
add them back again one by one.

Any other ideas?

Thanks.
Oleg




-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, March 24, 2003 1:48 PM
To: Oleg Lebedev
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Slow query


Oleg Lebedev <oleg.lebedev@waterford.org> writes:
> SELECT * FROM media m
> WHERE m.mediatype =3D (SELECT objectid FROM mediatype WHERE
> medianame=3D'Audio')=20 AND EXISTS=20
>         (SELECT * FROM=20
>                 (SELECT objectid AS mediaid=20
>                 FROM media=20
>                 WHERE activity=3D'347667'=20
>                 UNION=20
>                 SELECT ism.media AS mediaid=20
>                 FROM intsetmedia ism, set s=20
>                 WHERE ism.set =3D s.objectid=20
>                 AND s.activity=3D'347667' ) AS a1=20
>         WHERE a1.mediaid =3D m.objectid=20
>         LIMIT 1)=20
> ORDER BY medianame ASC, status DESC=20

Well, one observation is that the LIMIT clause is useless and probably
counterproductive; EXISTS takes only one row from the subselect anyway.
Another is that the UNION is doing it the hard way; UNION implies doing
a duplicate-elimination step, which you don't need here.  UNION ALL
would be a little quicker.  But what I would do is split it into two
EXISTS:

SELECT * FROM media m
WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE
medianame='Audio')
AND (   EXISTS(SELECT 1
               FROM media
               WHERE activity='347667'
               AND objectid = m.objectid)
     OR EXISTS(SELECT 1
               FROM intsetmedia ism, set s
               WHERE ism.set = s.objectid
               AND s.activity='347667'
               AND ism.media = m.objectid))
ORDER BY medianame ASC, status DESC

            regards, tom lane



*************************************

This email may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Any review, copying, printing, disclosure or other use is prohibited.
We reserve the right to monitor email sent through our network.

*************************************

Attachment

Re: Slow query

From
Tom Lane
Date:
Oleg Lebedev <oleg.lebedev@waterford.org> writes:
> I just ran the query you sent me and attached the output of EXPLAIN
> ANALYZE as TOMs_plan.txt
> It did not speed up the query significantly.

Nope.  I was hoping to see a faster-start plan, but given the number of
rows involved I guess it won't change its mind.  You're going to have to
think about a more intelligent approach, rather than minor tweaks.

One question: since objectid is evidently a primary key, why are you
doing a subselect for the first part?  Wouldn't it give the same result
just to say "m.activity = '347667'" in the top-level WHERE?

As for the second part, I think you'll have to try to rewrite it as a
join with the media table.

            regards, tom lane


Re: Slow query

From
Oleg Lebedev
Date:
You are right. I rewrote the query using JOINs and it increased
performance from 123 sec. to 20msec. I betcha I screwed smth up, but I
list the rewritten query below anyways. I also attached the new plan.
Thank you.

SELECT * FROM media m
JOIN
((SELECT objectid AS mediaid
FROM media
WHERE activity='347667')
UNION
(SELECT ism.media AS mediaid
FROM intsetmedia ism, set s
WHERE ism.set = s.objectid
AND s.activity='347667' )) a1
ON
m.mediatype = (SELECT objectid FROM mediatype WHERE medianame='Audio')
AND m.objectid=mediaid
ORDER BY medianame ASC, status DESC



-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, March 24, 2003 3:09 PM
To: Oleg Lebedev
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Slow query


Oleg Lebedev <oleg.lebedev@waterford.org> writes:
> I just ran the query you sent me and attached the output of EXPLAIN
> ANALYZE as TOMs_plan.txt It did not speed up the query significantly.

Nope.  I was hoping to see a faster-start plan, but given the number of
rows involved I guess it won't change its mind.  You're going to have to
think about a more intelligent approach, rather than minor tweaks.

One question: since objectid is evidently a primary key, why are you
doing a subselect for the first part?  Wouldn't it give the same result
just to say "m.activity = '347667'" in the top-level WHERE?

As for the second part, I think you'll have to try to rewrite it as a
join with the media table.

            regards, tom lane



*************************************

This email may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Any review, copying, printing, disclosure or other use is prohibited.
We reserve the right to monitor email sent through our network.

*************************************

Attachment

Re: Slow query

From
"Nikolaus Dilger"
Date:
Oleg,

My guess is that the query runs slow because by adding
data you exceeded what your database can do in memory
and you need to do some kind of disk sort.

How about rewriting your query without the UNION and
the EXISTS to something like

SELECT * FROM media m
WHERE m.mediatype = (SELECT objectid FROM mediatype
                     WHERE medianame='Audio')
AND ( m.activity='347667'
      OR m.objectid IN (
               SELECT s.objectid
               FROM intsetmedia ism, set s
               WHERE ism.set = s.objectid
               AND s.activity='347667'))
ORDER BY medianame ASC, status DESC

Regards,
Nikolaus Dilger

On Mon, 24 Mar 2003, Oleg Lebedev wrote:


Message



Please help me speed
up the following query. It used to run in 2-5 sec., but
now it takes 2-3
mins!
I ran VACUUM FULL
ANALYZE and REINDEX.
SELECT * FROM media
m
WHERE m.mediatype =
(SELECT objectid FROM mediatype WHERE
medianame='Audio')
AND EXISTS

        (SELECT * FROM

               
(SELECT objectid AS mediaid
               
FROM media
               
WHERE activity='347667'
               
UNION
               
SELECT ism.media AS mediaid
               
FROM intsetmedia ism, set s
               
WHERE ism.set = s.objectid
               
AND s.activity='347667' ) AS a1
        WHERE a1.mediaid = m.objectid

        LIMIT 1)
ORDER BY medianame ASC, status DESC
 
Basically it tries
to find all Audios that are either explicitly attached
to the given activity, or
attached to the given activity via a many-to-many
relationship intsetmedia which
links records in table Interaction, Set, and Media.
I attached the
output of EXPLAIN and schemas and indexes on the tables
involved. Most of the
fields are not relevant to the query, but I listed them
anyways. I discarded
trigger information, though.
Thanks for your
help.
 
Oleg

*************************************

This email may contain privileged or confidential
material intended for the named recipient only.
If you are not the named recipient, delete this message
and all attachments.
Any review, copying, printing, disclosure or other use
is prohibited.
We reserve the right to monitor email sent through our
network.

*************************************