Query Optimization - Hash Join estimate off? - Mailing list pgsql-performance
From | mark.lubratt@indeq.com |
---|---|
Subject | Query Optimization - Hash Join estimate off? |
Date | |
Msg-id | 20050305183835.17457.qmail@webmail02.mesa1.secureserver.net Whole thread Raw |
List | pgsql-performance |
<div>Hello!</div><div> </div><div>First off, I'm a real newbie at trying to read the output of explain analyze.</div><div> </div><div>Ihave several similar queries in my application that I've got incorporated into views. Whenthey run sub 300ms, the users don't seem to mind. However, one of them (query is below along with some relevant tableinformation) is running about 800ms and my users are starting to grumble.</div><div> </div><div>I ran explain analyzeon it (explain analyze results are below). I noticed that the biggest chunk of time is being taken by a Hash Joinnear the top of the output (I'm still not sure what the indentation means and what the order means). If I look at theestimate, it is comparable to several other hash join estimates in the query; however, the actual cost in time is significantlyhigher than those other hash joins. Is this significant?</div><div> </div><div>I tried optimizing accordingto "SQL Tuning" by Tow, but this actually seemed to slow things down. It also seemed that the query optimizer inPostgreSQL reordered things on its own according to its own plan anyway. Is this correct?</div><div> </div><div>I'd appreciateany help I can get to try to get this query below 300ms.</div><div> </div><div>Thanks!</div><div>Mark</div><div> </div><div>Theplatform is a dual 2.2GHz Xeon 1.2GB RAM withmirrored drives (raid 1) running Win2000 Pro. I run "vacuum analyze" every night. The postgresql.conf is basicallystandard except that I've opened it up to listen to the external network. Other changes:</div><div> </div><div>max_connections= 100</div><div>shared_buffers = 10000</div><div> </div><div>query (the person_id= 1 in the where clause is changed on a case by case basis - depending upon who's running the query):</div><div> </div><div>explainanalyze<br /> SELECT DISTINCT c.job_id, g.person_id, c.job_no, b.deadline, c.name, bid_date(c.job_id)AS bid_date, c.miscq, c.city, c.st, j.name AS eng, c.s_team AS salesteam, <br /> CASE<br /> WHEN c.file_loc = 0 THEN 'No Bid'::character varying<br /> WHEN c.file_loc = -1 THEN 'Bid Board'::charactervarying<br /> WHEN c.file_loc = -2 THEN 'Lost Job'::character varying<br /> WHEN c.file_loc= -3 THEN 'See Job Notes'::character varying<br /> WHEN c.file_loc < -3 OR c.file_loc IS NULL THEN''::character varying<br /> WHEN h.initials IS NOT NULL THEN h.initials<br /> ELSE 'Unknown person'::charactervarying<br /> END AS file_loc, COALESCE(c.city::text || COALESCE(', '::text || c.st::text, ''::text),COALESCE(c.st, ''::character varying)::text) AS "location", c.file_loc AS file_loc_id<br /> FROM status a<br/> LEFT JOIN status_list b ON a.status_id = b.status_id AND b.active<br /> LEFT JOIN job c ON c.job_id = b.job_id<br/> LEFT JOIN builder_list d ON c.job_id = d.job_id AND (d.won_heat OR d.won_vent OR d.won_tc OR c.heat AND d.bid_heatAND d.won_heat IS NULL OR c.vent AND d.bid_vent AND d.won_vent IS NULL OR c.tc AND d.bid_tc AND d.won_tc IS NULL)AND d.role = 'C'::bpchar<br /> LEFT JOIN company e ON d.company_id = e.company_id<br /> LEFT JOIN call_list f ONe.company_id = f.company_id<br /> LEFT JOIN person g ON f.person_id = g.person_id OR "position"(c.s_team::text, g.initials::text)> 0<br /> LEFT JOIN person h ON c.file_loc = h.person_id<br /> LEFT JOIN builder_list i ON c.job_id= i.job_id AND i.role = 'E'::bpchar<br /> LEFT JOIN company j ON i.company_id = j.company_id<br /> WHERE a.name::text= 'Awaiting Award'::character varying::text and g.person_id = 1<br /> ORDER BY c.job_id, g.person_id, c.job_no,b.deadline, c.name, bid_date(c.job_id), c.miscq, c.city, COALESCE(c.city::text || COALESCE(', '::text || c.st::text,''::text), COALESCE(c.st, ''::character varying)::text), c.st, <br />CASE<br /> WHEN c.file_loc = 0 THEN 'NoBid'::character varying<br /> WHEN c.file_loc = -1 THEN 'Bid Board'::character varying<br /> WHEN c.file_loc = -2THEN 'Lost Job'::character varying<br /> WHEN c.file_loc = -3 THEN 'See Job Notes'::character varying<br /> WHENc.file_loc < -3 OR c.file_loc IS NULL THEN ''::character varying<br /> WHEN h.initials IS NOT NULL THEN h.initials<br/> ELSE 'Unknown person'::character varying<br />END, j.name, c.s_team, c.file_loc;<br /></div><div>Tables:</div><div>status- 14 rows</div><div>status_list - 6566 rows</div><div>job - 2210 rows</div><div>builder_list -9670 rows</div><div>company - 1249 rows</div><div>call_list - 4731 rows</div><div>person - 27rows</div><div> </div><div>Primary keys:</div><div>any field with a "_id" suffix is a primary key; and thus is implicitlyindexed.</div><div> </div><div>Other indexes:</div><div>status_list(job_id) btree</div><div>status_list(status_id)btree</div><div>job(file_loc) btree</div><div>builder_list(company_id) btree</div><div>call_list(company_id)btree</div><div>call_list(person_id) btree</div><div>call_list(company_id) btree</div><div>person(company_id)btree</div><div> </div><div>explain analyze:</div><div>Unique (cost=1798.47..1809.38 rows=291width=114) (actual time=766.000..781.000 rows=566 loops=1)<br /> -> Sort (cost=1798.47..1799.19 rows=291 width=114)(actual time=766.000..766.000 rows=1473 loops=1)<br /> Sort Key: c.job_id, g.person_id, c.job_no, b.deadline,c.name, bid_date(c.job_id), c.miscq, c.city, COALESCE(((c.city)::text || COALESCE((', '::text || (c.st)::text),''::text)), (COALESCE(c.st, ''::character varying))::text), c.st, CASE WHEN (c.fi (..)<br /> -> Hash Left Join (cost=1750.81..1786.56 rows=291 width=114) (actual time=453.000..750.000 rows=1473 loops=1)<br /> Hash Cond: ("outer".company_id = "inner".company_id)<br /> -> Merge Left Join (cost=1707.20..1722.53rows=291 width=95) (actual time=437.000..484.000 rows=1473 loops=1)<br /> MergeCond: ("outer".job_id = "inner".job_id)<br /> -> Sort (cost=1382.44..1383.17 rows=291 width=91)(actual time=406.000..406.000 rows=1473 loops=1)<br /> Sort Key: c.job_id<br /> -> Hash Left Join (cost=1137.28..1370.53 rows=291 width=91) (actual time=234.000..390.000rows=1473 loops=1)<br /> Hash Cond: ("outer".file_loc = "inner".person_id)<br/> -> Nested Loop (cost=1135.94..1365.27 rows=291 width=84) (actualtime=234.000..390.000 rows=1473 loops=1)<br /> Join Filter: (("inner".person_id= "outer".person_id) OR ("position"(("inner".s_team)::text, ("outer".initials)::text) > 0))<br /> -> Seq Scan on person g (cost=0.00..1.34 rows=1 width=11) (actual time=0.000..0.000rows=1 loops=1)<br /> Filter: (person_id = 1)<br /> -> Merge Right Join (cost=1135.94..1349.74 rows=811 width=84) (actual time=234.000..297.000rows=7490 loops=1)<br /> Merge Cond: ("outer".company_id= "inner".company_id)<br /> -> Index Scan using idx_company_id_call_liston call_list f (cost=0.00..189.80 rows=4731 width=8) (actual time=0.000..15.000 rows=4731 loops=1)<br/> -> Sort (cost=1135.94..1136.48 rows=214 width=84) (actual time=234.000..234.000rows=7490 loops=1)<br /> Sort Key: e.company_id<br/> -> Merge Right Join (cost=1004.19..1127.66 rows=214width=84) (actual time=203.000..219.000 rows=1569 loops=1)<br /> Merge Cond: ("outer".company_id = "inner".company_id)<br /> -> Index Scan using company_pkey on company e (cost=0.00..117.13rows=1249 width=4) (actual time=0.000..0.000 rows=1249 loops=1)<br /> -> Sort (cost=1004.19..1004.73 rows=214 width=84) (actual time=203.000..203.000rows=1569 loops=1)<br /> Sort Key: d.company_id<br/> -> Hash Left Join (cost=633.74..995.91rows=214 width=84) (actual time=156.000..187.000 rows=1569 loops=1)<br /> Hash Cond: ("outer".job_id = "inner".job_id)<br /> Join Filter: ("inner".won_heat OR "inner".won_vent OR"inner".won_tc OR ("outer".heat AND "inner".bid_heat AND ("inner".won_heat IS NULL)) OR ("outer".vent AND "inner".bid_ventAND ("inner (..)<br /> -> Merge LeftJoin (cost=368.17..381.60 rows=159 width=83) (actual time=78.000..93.000 rows=695 loops=1)<br /> Merge Cond: ("outer".job_id = "inner".job_id)<br/> -> Sort (cost=168.31..168.71rows=159 width=8) (actual time=31.000..31.000 rows=695 loops=1)<br /> Sort Key: b.job_id<br /> -> Nested Loop Left Join (cost=0.00..162.50rows=159 width=8) (actual time=0.000..31.000 rows=695 loops=1)<br /> Join Filter: ("outer".status_id ="inner".status_id)<br /> -> SeqScan on status a (cost=0.00..1.18 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)<br /> Filter: ((name)::text = 'AwaitingAward'::text)<br /> -> Seq Scan on status_list b (cost=0.00..133.66 rows=2213 width=12) (actual time=0.000..15.000 rows=2210 loops=1)<br/> Filter: active<br/> -> Sort (cost=199.86..205.39 rows=2210width=79) (actual time=47.000..47.000 rows=2194 loops=1)<br /> Sort Key: c.job_id<br /> -> Seq Scan on job c (cost=0.00..77.10rows=2210 width=79) (actual time=0.000..31.000 rows=2210 loops=1)<br /> -> Hash (cost=202.88..202.88 rows=7475 width=14)(actual time=78.000..78.000 rows=0 loops=1)<br /> -> Seq Scan on builder_list d (cost=0.00..202.88rows=7475 width=14) (actual time=0.000..15.000 rows=7517 loops=1)<br /> Filter: (role = 'C'::bpchar)<br /> -> Hash (cost=1.27..1.27 rows=27 width=11) (actual time=0.000..0.000 rows=0 loops=1)<br/> -> Seq Scan on person h (cost=0.00..1.27 rows=27 width=11) (actualtime=0.000..0.000 rows=27 loops=1)<br /> -> Sort (cost=324.76..330.25 rows=2196 width=8) (actualtime=31.000..31.000 rows=3044 loops=1)<br /> Sort Key: i.job_id<br /> -> Seq Scan on builder_list i (cost=0.00..202.88 rows=2196 width=8) (actual time=0.000..31.000rows=2153 loops=1)<br /> Filter: (role = 'E'::bpchar)<br /> -> Hash (cost=40.49..40.49 rows=1249 width=27) (actual time=16.000..16.000 rows=0 loops=1)</div><div> -> Seq Scan on company j (cost=0.00..40.49 rows=1249 width=27) (actual time=0.000..0.000rows=1249 loops=1)<br />Total runtime: 781.000 ms<br /></div>
pgsql-performance by date: