Re: bad query performance - Mailing list pgsql-sql
From | Luis Sousa |
---|---|
Subject | Re: bad query performance |
Date | |
Msg-id | 3EC3531E.1040805@ualg.pt Whole thread Raw |
In response to | Re: bad query performance (Stephan Szabo <sszabo@megazone23.bigpanda.com>) |
List | pgsql-sql |
Didrik Pinte wrote: > At 18:54 5/14/2003, Luis Sousa wrote: > >> Stephan Szabo wrote: >> >>> On Mon, 12 May 2003, Didrik Pinte wrote: >>> >>> >>> >>>> Here is the query : >>>> ------------------------------------------------------------------------------------ >>>> SELECT profiles_et.username, profiles_et.name, profiles_et.firstname, >>>> profiles_et.email, company_et.name AS company, count(logs_et.dt) AS >>>> cnt, >>>> pro_invoice_addr_at.id >>>> >>>> >>>> FROM >>>> ( >>>> ( >>>> ( >>>> secu.company_et JOIN secu.pro_invoice_addr_at >>>> ON (pro_invoice_addr_at.company = company_et.id) >>>> ) >>>> >>>> JOIN secu.profiles_et ON (pro_invoice_addr_at.profile = >>>> profiles_et.id) >>>> ) >>>> JOIN secu.logs_et ON (logs_et.invaddr = pro_invoice_addr_at.id) >>>> ) >>>> >>>> >>>> GROUP BY profiles_et.username, profiles_et.name, >>>> profiles_et.firstname, >>>> profiles_et.email, company_et.name, pro_invoice_addr_at.id >>>> >>>> ORDER BY count(logs_et.dt) DESC; >>>> ------------------------------------------------------------------------------------ >>>> >>>> >>>> >>>> The secu_company, secu_prov_invoice_addr_at and secu_profiles >>>> contains only >>>> informations about the users. >>>> >>>> The query is executing in 8,6 minutes.... Indexes are defined on >>>> all the >>>> field used in the different joins. >>>> >>> >>> Have you recently analyzed the tables involved? >>> I'd guess that the two outer sorts and the sort on logs_et are probably >>> the killers, but can you send explain analyze output? Looking at the >>> estimated row numbers and width, it seems to me that with 16Mb of >>> sort_mem >>> it's going to have to sort on disk (although you don't have enough >>> memory >>> to up it far enough probably if the estimates are right) >>> >>> >>> >>> >>>> Here is the query plan : >>>> >>>> ------------------------------------------------------------------------------------ >>>> Sort (cost=895649.54..896073.23 rows=169474 width=145) >>>> Sort Key: count(logs_et.dt) >>>> -> Aggregate (cost=831240.24..865135.10 rows=169474 width=145) >>>> -> Group (cost=831240.24..860898.24 rows=1694743 width=145) >>>> -> Sort (cost=831240.24..835477.10 rows=1694743 width=145) >>>> Sort Key: profiles_et.username, profiles_et.name, >>>> profiles_et.firstname, profiles_et.email, company_et.name, >>>> pro_invoice_addr_at.id >>>> -> Merge Join (cost=274406.73..304066.75 >>>> rows=1694743 width=145) >>>> Merge Cond: ("outer".id = "inner".invaddr) >>>> -> Sort (cost=143.81..144.92 rows=446 width=126) >>>> Sort Key: pro_invoice_addr_at.id >>>> -> Merge Join (cost=90.27..124.18 >>>> rows=446 width=126) >>>> Merge Cond: ("outer".id = >>>> "inner".profile) >>>> -> Index Scan using profiles_pk on >>>> profiles_et (cost=0.00..24.98 rows=449 width=66) >>>> -> Sort (cost=90.27..91.39 >>>> rows=446 width=60) >>>> Sort Key: >>>> pro_invoice_addr_at.profile >>>> -> Merge >>>> Join (cost=37.82..70.65 rows=446 width=60) >>>> Merge Cond: >>>> ("outer".company = "inner".id) >>>> -> Index Scan using >>>> invaddr_at_company_idx on pro_invoice_addr_at (cost=0.00..24.68 >>>> rows=446 >>>> width=33) >>>> -> Sort >>>> (cost=37.82..38.55 rows=291 width=27) >>>> Sort Key: >>>> company_et.id >>>> -> Seq Scan on >>>> company_et (cost=0.00..25.91 rows=291 width=27) >>>> -> Sort (cost=274262.92..278499.78 >>>> rows=1694743 width=19) >>>> Sort Key: logs_et.invaddr >>>> -> Seq Scan on >>>> logs_et (cost=0.00..55404.43 rows=1694743 width=19) >>>> ------------------------------------------------------------------------------------ >>>> >>>> The computer is a Pentium III 850 Mhz with 256 mb RAM (Sort memory for >>>> postres is 16 mb, and 64 mb of shared memory). >>>> >>> >>> >>> >>> ---------------------------(end of >>> broadcast)--------------------------- >>> TIP 3: if posting/reading through Usenet, please send an appropriate >>> subscribe-nomail command to majordomo@postgresql.org so that your >>> message can get through to the mailing list cleanly >>> >>> >>> >> -> Seq Scan on >> company_et (cost=0.00..25.91 rows=291 width=27) >> -> Sort (cost=274262.92..278499.78 >> rows=1694743 width=19) >> Sort Key: logs_et.invaddr >> >> >> Do you have any indexes defined for logs_et.invaddr and for >> pro_invoice_addr_at.id ? >> >> Luis Sousa > > > > pro_invoice_add_at.id had not index but was a primary key. I've > created an index on it. Here is the explain analyse for the query : > > Sort (cost=895649.54..896073.23 rows=169474 width=145) (actual > time=553423.57..553423.76 rows=259 loops=1) > Sort Key: count(logs_et.dt) > -> Aggregate (cost=831240.24..865135.10 rows=169474 width=145) (actual > time=463611.22..553421.28 rows=259 loops=1) > -> Group (cost=831240.24..860898.24 rows=1694743 width=145) (actual > time=463610.39..550481.54 rows=1698665 loops=1) > -> Sort (cost=831240.24..835477.10 rows=1694743 width=145) (actual > time=463610.36..468962.97 rows=1698665 loops=1) > Sort Key: profiles_et.username, profiles_et.name, > profiles_et.firstname, profiles_et.email, company_et.name, > pro_invoice_addr_at.id > -> Merge Join (cost=274406.73..304066.75 rows=1694743 width=145) > (actual time=76531.90..97594.64 rows=1698665 loops=1) > Merge Cond: ("outer".id = "inner".invaddr) > -> Sort (cost=143.81..144.92 rows=446 width=126) (actual > time=147.27..147.67 rows=446 loops=1) > Sort Key: pro_invoice_addr_at.id > -> Merge Join (cost=90.27..124.18 rows=446 width=126) (actual > time=105.69..143.63 rows=446 loops=1) > Merge Cond: ("outer".id = "inner".profile) > -> Index Scan using profiles_pk on profiles_et (cost=0.00..24.98 > rows=449 width=66) (actual time=50.53..79.70 rows=449 loops=1) > -> Sort (cost=90.27..91.39 rows=446 width=60) (actual > time=55.13..55.54 rows=446 loops=1) > Sort Key: pro_invoice_addr_at.profile > -> Merge Join (cost=37.82..70.65 rows=446 width=60) (actual > time=40.11..49.69 rows=446 loops=1) > Merge Cond: ("outer".company = "inner".id) > -> Index Scan using invaddr_at_company_idx on pro_invoice_addr_at > (cost=0.00..24.68 rows=446 width=33) (actual time=9.91..13.54 rows=446 > loops=1) > -> Sort (cost=37.82..38.55 rows=291 width=27) (actual > time=30.11..30.49 rows=447 loops=1) > Sort Key: company_et.id > -> Seq Scan on company_et (cost=0.00..25.91 rows=291 width=27) (actual > time=7.24..27.73 rows=291 loops=1) > -> Sort (cost=274262.92..278499.78 rows=1694743 width=19) (actual > time=76384.57..80077.80 rows=1698665 loops=1) > Sort Key: logs_et.invaddr > -> Seq Scan on logs_et (cost=0.00..55404.43 rows=1694743 width=19) > (actual time=11.47..13274.68 rows=1698665 loops=1) > Total runtime: 553673.02 msec > > ,"**** DISCLAIMER **** > This e-mail and any attachments thereto may contain information > which is confidential and/or protected by intellectual property > rights and are intended for the sole use of the recipient(s) named above. > Any use of the information contained herein (including, but not > limited to, > total or partial reproduction, communication or distribution in any form) > by persons other than the designated recipient(s) is prohibited. > If you have received this e-mail in error, please notify the sender > and delete the material from any computer. > Thank you for your cooperation. > > > You can make some experiences trying to change the sequence of JOIN, like this (without parenthesis the sequence is different) SELECT profiles_et.username, profiles_et.name, profiles_et.firstname, profiles_et.email, company_et.name AS company, count(logs_et.dt) AS cnt, pro_invoice_addr_at.id FROM secu.company_et JOIN secu.pro_invoice_addr_at ON (pro_invoice_addr_at.company = company_et.id) JOIN secu.profiles_et ON (pro_invoice_addr_at.profile = profiles_et.id) JOIN secu.logs_et ON (logs_et.invaddr = pro_invoice_addr_at.id) GROUP BY profiles_et.username, profiles_et.name, profiles_et.firstname, profiles_et.email, company_et.name, pro_invoice_addr_at.id ORDER BY count(logs_et.dt) DESC; In the first JOIN, put the table with less records, and continue that order. Luis Sousa