Thread: Egroupware infolog query slow (includes query plan)
Hello, I could use some help figuring out how to speed up a query. Below is the SQL and query plan for a common SELECT done by the open source eGroupware project. Now that there are about 16,000 rows in egw_infolog and 32,000 in egw_infolog_extra, the below query takes about 6 minutes to finish! I'm not skilled enough at reading the "Explain Analzyze" output to understand what the primary problem is. Thanks! Mark ########### SELECT DISTINCT main.* ,( SELECT count(*) FROM egw_infolog sub WHERE sub.info_id_parent=main.info_id AND (info_owner=6OR ((','||info_responsible||',' LIKE '%,-2,%' OR ','||info_responsible||',' LIKE '%,-1,%'OR ','||info_responsible||',' LIKE '%,6,%') AND info_access='public') OR info_owner IN (6)OR (info_access='public' AND info_owner IN(6))) ) AS info_anz_subs FROM egw_infolog main LEFT JOIN egw_infolog_extra ON main.info_id=egw_infolog_extra.info_id WHERE ( (info_owner=6 OR ((','||info_responsible||',' LIKE '%,-2,%' OR ','||info_responsible||',' LIKE '%,-1,%'OR ','||info_responsible||',' LIKE '%,6,%') AND info_access='public') OR info_owner IN (6)OR (info_access='public' AND info_owner IN(6))) AND info_status <> 'deleted' ) ORDER BY info_datemodified DESC LIMIT 15 OFFSET 0 Query plan: #### Limit (cost=68624989.18..68624991.31 rows=15 width=1011) (actual time=686260.735..686260.878 rows=15 loops=1) -> Unique (cost=68624989.18..68627288.59 rows=16212 width=1011) (actual time=686260.733..686260.857 rows=15 loops=1) -> Sort (cost=68624989.18..68625068.47 rows=31716 width=1011) (actual time=686260.730..686260.766 rows=29 loops=1) Sort Key: main.info_datemodified, main.info_id, main.info_type, main.info_from, main.info_addr, main.info_subject, main.info_des, main.info_owner, main.info_responsible, main.info_access, main.info_cat, main.info_startdate, main.info_enddate, main.info_id_parent, main.info_planned_time, main.info_used_time, main.info_status, main.info_confirm, main.info_modifier, main.info_link_id, main.info_priority, main.pl_id, main.info_price, main.info_percent, main.info_datecompleted, main.info_location, main.info_custom_from, (subplan) -> Merge Left Join (cost=0.00..68594428.95 rows=31716 width=1011) (actual time=21.358..684226.134 rows=32424 loops=1) Merge Cond: (main.info_id = egw_infolog_extra.info_id) -> Index Scan using egw_infolog_pkey on egw_infolog main (cost=0.00..3025.84 rows=16212 width=1011) (actual time=0.060..135.766 rows=16212 loops=1) Filter: (((info_owner = 6) OR (((((','::text || (info_responsible)::text) || ','::text) ~~ '%,-2,%'::text) OR (((','::text || (info_responsible)::text) || ','::text) ~~ '%,-1, %'::text) OR (((','::text || (info_responsible)::text) || ','::text) ~~ '%,6,%'::text)) AND ((info_access)::text = 'public'::text)) OR (info_owner = 6) OR (((info_access)::text = 'public'::text) AND (info_owner = 6))) AND ((info_status)::text <> 'deleted'::text)) -> Index Scan using egw_infolog_extra_pkeyon egw_infolog_extra (cost=0.00..1546.30 rows=32424 width=4) (actual time=0.025..317.272 rows=32424 loops=1) SubPlan -> Aggregate (cost=2162.60..2162.61rows=1 width=0) (actual time=21.073..21.073 rows=1 loops=32424) -> Seq Scan on egw_infolog sub (cost=0.00..2122.07 rows=16212 width=0) (actual time=21.065..21.065 rows=0 loops=32424) Filter: ((info_id_parent = $0) AND ((info_owner = 6) OR (((((','::text || (info_responsible)::text) || ','::text) ~~ '%,-2,%'::text) OR (((','::text || (info_responsible)::text) || ','::text) ~~ '%,-1,%'::text) OR (((','::text || (info_responsible)::text) || ','::text) ~~ '%,6, %'::text)) AND ((info_access)::text = 'public'::text)) OR (info_owner = 6) OR (((info_access)::text = 'public'::text) AND (info_owner = 6)))) Total runtime: 686278.730 ms
I should have mentioned in the last post that PostgreSQL 8.2.9 is in use. I could upgrade to 8.3.x if that is expected to help performance in this case. Mark On Sun, 2008-07-06 at 16:23 -0400, Mark Stosberg wrote: > Hello, > > I could use some help figuring out how to speed up a query. Below is the > SQL and query plan for a common SELECT done by the open source > eGroupware project. > > Now that there are about 16,000 rows in egw_infolog and 32,000 in > egw_infolog_extra, the below query takes about 6 minutes to finish! > > I'm not skilled enough at reading the "Explain Analzyze" output to > understand what the primary problem is. > > Thanks! > > Mark > > ########### > > SELECT DISTINCT main.* ,( > SELECT count(*) FROM egw_infolog sub WHERE > sub.info_id_parent=main.info_id AND (info_owner=6 OR > ((','||info_responsible||',' LIKE '%,-2,%' OR > ','||info_responsible||',' LIKE '%,-1,%' OR > ','||info_responsible||',' LIKE '%,6,%') AND > info_access='public') OR info_owner IN (6) OR > (info_access='public' > AND info_owner IN(6))) > ) AS info_anz_subs FROM egw_infolog main > LEFT JOIN egw_infolog_extra ON main.info_id=egw_infolog_extra.info_id > WHERE ( > (info_owner=6 OR ((','||info_responsible||',' LIKE '%,-2,%' OR > ','||info_responsible||',' LIKE '%,-1,%' OR > ','||info_responsible||',' LIKE '%,6,%') AND > info_access='public') OR info_owner IN (6) OR > (info_access='public' > AND info_owner IN(6))) AND info_status <> 'deleted' ) > ORDER BY > info_datemodified DESC LIMIT 15 OFFSET 0 > > Query plan: > #### > Limit (cost=68624989.18..68624991.31 rows=15 width=1011) (actual > time=686260.735..686260.878 rows=15 loops=1) > -> Unique (cost=68624989.18..68627288.59 rows=16212 width=1011) > (actual time=686260.733..686260.857 rows=15 loops=1) > -> Sort (cost=68624989.18..68625068.47 rows=31716 width=1011) > (actual time=686260.730..686260.766 rows=29 loops=1) > Sort Key: main.info_datemodified, main.info_id, > main.info_type, main.info_from, main.info_addr, main.info_subject, > main.info_des, main.info_owner, main.info_responsible, main.info_access, > main.info_cat, main.info_startdate, main.info_enddate, > main.info_id_parent, main.info_planned_time, main.info_used_time, > main.info_status, main.info_confirm, main.info_modifier, > main.info_link_id, main.info_priority, main.pl_id, main.info_price, > main.info_percent, main.info_datecompleted, main.info_location, > main.info_custom_from, (subplan) > -> Merge Left Join (cost=0.00..68594428.95 rows=31716 > width=1011) (actual time=21.358..684226.134 rows=32424 loops=1) > Merge Cond: (main.info_id = > egw_infolog_extra.info_id) > -> Index Scan using egw_infolog_pkey on egw_infolog > main (cost=0.00..3025.84 rows=16212 width=1011) (actual > time=0.060..135.766 rows=16212 loops=1) > Filter: (((info_owner = 6) OR (((((','::text > || (info_responsible)::text) || ','::text) ~~ '%,-2,%'::text) OR > (((','::text || (info_responsible)::text) || ','::text) ~~ '%,-1, > %'::text) OR (((','::text || (info_responsible)::text) || ','::text) ~~ > '%,6,%'::text)) AND ((info_access)::text = 'public'::text)) OR > (info_owner = 6) OR (((info_access)::text = 'public'::text) AND > (info_owner = 6))) AND ((info_status)::text <> 'deleted'::text)) > -> Index Scan using egw_infolog_extra_pkey on > egw_infolog_extra (cost=0.00..1546.30 rows=32424 width=4) (actual > time=0.025..317.272 rows=32424 loops=1) > SubPlan > -> Aggregate (cost=2162.60..2162.61 rows=1 > width=0) (actual time=21.073..21.073 rows=1 loops=32424) > -> Seq Scan on egw_infolog sub > (cost=0.00..2122.07 rows=16212 width=0) (actual time=21.065..21.065 > rows=0 loops=32424) > Filter: ((info_id_parent = $0) AND > ((info_owner = 6) OR (((((','::text || (info_responsible)::text) || > ','::text) ~~ '%,-2,%'::text) OR (((','::text || > (info_responsible)::text) || ','::text) ~~ '%,-1,%'::text) OR > (((','::text || (info_responsible)::text) || ','::text) ~~ '%,6, > %'::text)) AND ((info_access)::text = 'public'::text)) OR (info_owner = > 6) OR (((info_access)::text = 'public'::text) AND (info_owner = 6)))) > Total runtime: 686278.730 ms > > > > >
Mark Stosberg <mark@summersault.com> writes: > I'm not skilled enough at reading the "Explain Analzyze" output to > understand what the primary problem is. The problem is the repeated execution of the subquery in the SELECT list; that's taking over 683 of the 686 seconds: > SubPlan > -> Aggregate (cost=2162.60..2162.61 rows=1 > width=0) (actual time=21.073..21.073 rows=1 loops=32424) ^^^^^^ ^^^^^ The current formulation of the query guarantees that you can't do better than a nestloop join with "sub" on the inside, and that nestloop isn't even indexed. See if you can convert it to a regular join instead of a sub-select (probably with GROUP BY instead of DISTINCT). Also, those LIKE conditions are just horrid: slow *and* unreadable. Consider redesigning your data representation. Perhaps converting info_responsible to an int array would be reasonable. regards, tom lane