Thread: Slow query
Please help me speed up the following query. It used to run in 2-5 sec., but now it takes 2-3 mins!
I ran VACUUM FULL ANALYZE and REINDEX.
SELECT * FROM media m
WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE medianame='Audio')
AND EXISTS
(SELECT * FROM
(SELECT objectid AS mediaid
FROM media
WHERE activity='347667'
UNION
SELECT ism.media AS mediaid
FROM intsetmedia ism, set s
WHERE ism.set = s.objectid
AND s.activity='347667' ) AS a1
WHERE a1.mediaid = m.objectid
LIMIT 1)
ORDER BY medianame ASC, status DESC
Basically it tries to find all Audios that are either explicitly attached to the given activity, or attached to the given activity via a many-to-many relationship intsetmedia which links records in table Interaction, Set, and Media.
I attached the output of EXPLAIN and schemas and indexes on the tables involved. Most of the fields are not relevant to the query, but I listed them anyways. I discarded trigger information, though.
Thanks for your help.
Oleg
*************************************
This email may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Any review, copying, printing, disclosure or other use is prohibited.
We reserve the right to monitor email sent through our network.
*************************************
Attachment
Oleg, > Please help me speed up the following query. It used to run in 2-5 sec., > but now it takes 2-3 mins! > I ran VACUUM FULL ANALYZE and REINDEX. > SELECT * FROM media m > WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE This is a repost, isn't it? -- Josh Berkus Aglio Database Solutions San Francisco
No, I don't believe so. My previous question regarding performance was solved by VACUUM FULL and REINDEX. The current one, I believe, is more related to query structure and planner stats. -----Original Message----- From: Josh Berkus [mailto:josh@agliodbs.com] Sent: Monday, March 24, 2003 11:55 AM To: Oleg Lebedev; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Slow query Oleg, > Please help me speed up the following query. It used to run in 2-5 > sec., but now it takes 2-3 mins! I ran VACUUM FULL ANALYZE and > REINDEX. SELECT * FROM media m > WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE This is a repost, isn't it? -- Josh Berkus Aglio Database Solutions San Francisco ************************************* This email may contain privileged or confidential material intended for the named recipient only. If you are not the named recipient, delete this message and all attachments. Any review, copying, printing, disclosure or other use is prohibited. We reserve the right to monitor email sent through our network. *************************************
On Mon, 24 Mar 2003, Oleg Lebedev wrote: > Please help me speed up the following query. It used to run in 2-5 sec., > but now it takes 2-3 mins! EXPLAIN ANALYZE output would be useful to see where the time is actually taking place (rather than an estimate thereof).
EXPLAIN ANALYZE plan is shown below. I also attached it as a file. One thing that might help is that the query produces 27 rows, which is much less than predicted 1963. QUERY PLAN Sort (cost=553657.66..553662.57 rows=1963 width=218) (actual time=133036.73..133036.75 rows=27 loops=1) Sort Key: medianame, status InitPlan -> Seq Scan on mediatype (cost=0.00..1.29 rows=1 width=8) (actual time=0.12..0.14 rows=1 loops=1) Filter: (medianame = 'Audio'::character varying) -> Index Scan using media_mtype_index on media m (cost=0.00..553550.28 rows=1963 width=218) (actual time=5153.36..133036.00 rows=27 loops=1) Index Cond: (mediatype = $0) Filter: (subplan) SubPlan -> Limit (cost=138.92..138.93 rows=1 width=24) (actual time=2.92..2.92 rows=0 loops=44876) -> Subquery Scan a1 (cost=138.92..138.93 rows=1 width=24) (actual time=2.92..2.92 rows=0 loops=44876) -> Unique (cost=138.92..138.93 rows=1 width=24) (actual time=2.91..2.91 rows=0 loops=44876) -> Sort (cost=138.92..138.93 rows=2 width=24) (actual time=2.91..2.91 rows=0 loops=44876) Sort Key: mediaid -> Append (cost=0.00..138.91 rows=2 width=24) (actual time=2.80..2.81 rows=0 loops=44876) -> Subquery Scan "*SELECT* 1" (cost=0.00..5.11 rows=1 width=8) (actual time=0.06..0.06 rows=0 loops=44876) -> Index Scan using media_pkey on media (cost=0.00..5.11 rows=1 width=8) (actual time=0.05..0.05 rows=0 loops=44876) Index Cond: (objectid = $1) Filter: (activity = 347667::bigint) -> Subquery Scan "*SELECT* 2" (cost=24.25..133.80 rows=1 width=24) (actual time=2.73..2.73 rows=0 loops=44876) -> Hash Join (cost=24.25..133.80 rows=1 width=24) (actual time=2.72..2.72 rows=0 loops=44876) Hash Cond: ("outer"."set" = "inner".objectid) -> Index Scan using intsetmedia_media_index on intsetmedia ism (cost=0.00..109.26 rows=38 width=16) (actual time=0.04..0.04 rows=1 loops=44876) Index Cond: (media = $1) -> Hash (cost=24.24..24.24 rows=6 width=8) (actual time=0.14..0.14 rows=0 loops=44876) -> Index Scan using set_act_index on "set" s (cost=0.00..24.24 rows=6 width=8) (actual time=0.11..0.13 rows=2 loops=44876) Index Cond: (activity = 347667::bigint) Total runtime: 133037.49 msec -----Original Message----- From: Stephan Szabo [mailto:sszabo@megazone23.bigpanda.com] Sent: Monday, March 24, 2003 12:04 PM To: Oleg Lebedev Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Slow query On Mon, 24 Mar 2003, Oleg Lebedev wrote: > Please help me speed up the following query. It used to run in 2-5 > sec., but now it takes 2-3 mins! EXPLAIN ANALYZE output would be useful to see where the time is actually taking place (rather than an estimate thereof). ************************************* This email may contain privileged or confidential material intended for the named recipient only. If you are not the named recipient, delete this message and all attachments. Any review, copying, printing, disclosure or other use is prohibited. We reserve the right to monitor email sent through our network. *************************************
Attachment
Stephan, Hmmm ... I'm a bit confused by the new EXPLAIN output. Stefan, does Oleg's output show the time for *one* subplan execution, executed for 44,000 loops, or does it show the total time? The former would make more sense given his query, but I'm just not sure .... -- -Josh Berkus Aglio Database Solutions San Francisco
I decided that it might help to list the cardinalities of the pertinent tables: Intsetmedia: 90,000 rows Interaction: 26,000 rows Set: 7,000 rows Media: 80,000 rows -----Original Message----- From: Josh Berkus [mailto:josh@agliodbs.com] Sent: Monday, March 24, 2003 12:47 PM To: Stephan Szabo; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Slow query Importance: Low Stephan, Hmmm ... I'm a bit confused by the new EXPLAIN output. Stefan, does Oleg's output show the time for *one* subplan execution, executed for 44,000 loops, or does it show the total time? The former would make more sense given his query, but I'm just not sure .... -- -Josh Berkus Aglio Database Solutions San Francisco ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster ************************************* This email may contain privileged or confidential material intended for the named recipient only. If you are not the named recipient, delete this message and all attachments. Any review, copying, printing, disclosure or other use is prohibited. We reserve the right to monitor email sent through our network. *************************************
Oleg Lebedev <oleg.lebedev@waterford.org> writes: > SELECT * FROM media m > WHERE m.mediatype =3D (SELECT objectid FROM mediatype WHERE > medianame=3D'Audio')=20 > AND EXISTS=20 > (SELECT * FROM=20 > (SELECT objectid AS mediaid=20 > FROM media=20 > WHERE activity=3D'347667'=20 > UNION=20 > SELECT ism.media AS mediaid=20 > FROM intsetmedia ism, set s=20 > WHERE ism.set =3D s.objectid=20 > AND s.activity=3D'347667' ) AS a1=20 > WHERE a1.mediaid =3D m.objectid=20 > LIMIT 1)=20 > ORDER BY medianame ASC, status DESC=20 Well, one observation is that the LIMIT clause is useless and probably counterproductive; EXISTS takes only one row from the subselect anyway. Another is that the UNION is doing it the hard way; UNION implies doing a duplicate-elimination step, which you don't need here. UNION ALL would be a little quicker. But what I would do is split it into two EXISTS: SELECT * FROM media m WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE medianame='Audio') AND ( EXISTS(SELECT 1 FROM media WHERE activity='347667' AND objectid = m.objectid) OR EXISTS(SELECT 1 FROM intsetmedia ism, set s WHERE ism.set = s.objectid AND s.activity='347667' AND ism.media = m.objectid)) ORDER BY medianame ASC, status DESC regards, tom lane
I just ran the query you sent me and attached the output of EXPLAIN ANALYZE as TOMs_plan.txt It did not speed up the query significantly. It always seemed to me that UNION is faster than OR, so I tried your suggestion to use UNION ALL with the original query without counter-productive LIMIT 1 in EXISTS clause. This reduced the cost of the plan by 50%, but slowed down the query. Weird ... The plan is shown in UNION_ALL_plan.txt AFAIK, the only change I've done since the time when the query took 3 sec. to run was adding more indexes and increasing the size of data by about 25%. It sounds kind of stupid, but I remember that adding indexes sometimes slowed down my queries. I will try to drop all the indexes and add them back again one by one. Any other ideas? Thanks. Oleg -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Monday, March 24, 2003 1:48 PM To: Oleg Lebedev Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Slow query Oleg Lebedev <oleg.lebedev@waterford.org> writes: > SELECT * FROM media m > WHERE m.mediatype =3D (SELECT objectid FROM mediatype WHERE > medianame=3D'Audio')=20 AND EXISTS=20 > (SELECT * FROM=20 > (SELECT objectid AS mediaid=20 > FROM media=20 > WHERE activity=3D'347667'=20 > UNION=20 > SELECT ism.media AS mediaid=20 > FROM intsetmedia ism, set s=20 > WHERE ism.set =3D s.objectid=20 > AND s.activity=3D'347667' ) AS a1=20 > WHERE a1.mediaid =3D m.objectid=20 > LIMIT 1)=20 > ORDER BY medianame ASC, status DESC=20 Well, one observation is that the LIMIT clause is useless and probably counterproductive; EXISTS takes only one row from the subselect anyway. Another is that the UNION is doing it the hard way; UNION implies doing a duplicate-elimination step, which you don't need here. UNION ALL would be a little quicker. But what I would do is split it into two EXISTS: SELECT * FROM media m WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE medianame='Audio') AND ( EXISTS(SELECT 1 FROM media WHERE activity='347667' AND objectid = m.objectid) OR EXISTS(SELECT 1 FROM intsetmedia ism, set s WHERE ism.set = s.objectid AND s.activity='347667' AND ism.media = m.objectid)) ORDER BY medianame ASC, status DESC regards, tom lane ************************************* This email may contain privileged or confidential material intended for the named recipient only. If you are not the named recipient, delete this message and all attachments. Any review, copying, printing, disclosure or other use is prohibited. We reserve the right to monitor email sent through our network. *************************************
Attachment
Oleg Lebedev <oleg.lebedev@waterford.org> writes: > I just ran the query you sent me and attached the output of EXPLAIN > ANALYZE as TOMs_plan.txt > It did not speed up the query significantly. Nope. I was hoping to see a faster-start plan, but given the number of rows involved I guess it won't change its mind. You're going to have to think about a more intelligent approach, rather than minor tweaks. One question: since objectid is evidently a primary key, why are you doing a subselect for the first part? Wouldn't it give the same result just to say "m.activity = '347667'" in the top-level WHERE? As for the second part, I think you'll have to try to rewrite it as a join with the media table. regards, tom lane
You are right. I rewrote the query using JOINs and it increased performance from 123 sec. to 20msec. I betcha I screwed smth up, but I list the rewritten query below anyways. I also attached the new plan. Thank you. SELECT * FROM media m JOIN ((SELECT objectid AS mediaid FROM media WHERE activity='347667') UNION (SELECT ism.media AS mediaid FROM intsetmedia ism, set s WHERE ism.set = s.objectid AND s.activity='347667' )) a1 ON m.mediatype = (SELECT objectid FROM mediatype WHERE medianame='Audio') AND m.objectid=mediaid ORDER BY medianame ASC, status DESC -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Monday, March 24, 2003 3:09 PM To: Oleg Lebedev Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Slow query Oleg Lebedev <oleg.lebedev@waterford.org> writes: > I just ran the query you sent me and attached the output of EXPLAIN > ANALYZE as TOMs_plan.txt It did not speed up the query significantly. Nope. I was hoping to see a faster-start plan, but given the number of rows involved I guess it won't change its mind. You're going to have to think about a more intelligent approach, rather than minor tweaks. One question: since objectid is evidently a primary key, why are you doing a subselect for the first part? Wouldn't it give the same result just to say "m.activity = '347667'" in the top-level WHERE? As for the second part, I think you'll have to try to rewrite it as a join with the media table. regards, tom lane ************************************* This email may contain privileged or confidential material intended for the named recipient only. If you are not the named recipient, delete this message and all attachments. Any review, copying, printing, disclosure or other use is prohibited. We reserve the right to monitor email sent through our network. *************************************
Attachment
Oleg, My guess is that the query runs slow because by adding data you exceeded what your database can do in memory and you need to do some kind of disk sort. How about rewriting your query without the UNION and the EXISTS to something like SELECT * FROM media m WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE medianame='Audio') AND ( m.activity='347667' OR m.objectid IN ( SELECT s.objectid FROM intsetmedia ism, set s WHERE ism.set = s.objectid AND s.activity='347667')) ORDER BY medianame ASC, status DESC Regards, Nikolaus Dilger On Mon, 24 Mar 2003, Oleg Lebedev wrote: Message Please help me speed up the following query. It used to run in 2-5 sec., but now it takes 2-3 mins! I ran VACUUM FULL ANALYZE and REINDEX. SELECT * FROM media m WHERE m.mediatype = (SELECT objectid FROM mediatype WHERE medianame='Audio') AND EXISTS (SELECT * FROM (SELECT objectid AS mediaid FROM media WHERE activity='347667' UNION SELECT ism.media AS mediaid FROM intsetmedia ism, set s WHERE ism.set = s.objectid AND s.activity='347667' ) AS a1 WHERE a1.mediaid = m.objectid LIMIT 1) ORDER BY medianame ASC, status DESC Basically it tries to find all Audios that are either explicitly attached to the given activity, or attached to the given activity via a many-to-many relationship intsetmedia which links records in table Interaction, Set, and Media. I attached the output of EXPLAIN and schemas and indexes on the tables involved. Most of the fields are not relevant to the query, but I listed them anyways. I discarded trigger information, though. Thanks for your help. Oleg ************************************* This email may contain privileged or confidential material intended for the named recipient only. If you are not the named recipient, delete this message and all attachments. Any review, copying, printing, disclosure or other use is prohibited. We reserve the right to monitor email sent through our network. *************************************