Thread: index usage for query
Hi, I have a question about index usage in PostgreSQL 7.2.1 on Solaris. I have three tables like this: port: element text portno int primary key: element, portno index: element port_s: element text portno int sname text pri int primary key: element, portno, sname index: element index: element, portno port_s_p: element text portno int sname text pname text value text primary key: element, portno, sname, pname index: element, portno, sname At first, I did the query like this: SELECT po.portno,po.element,s.sname,pri,p.pname,value FROM port po, port_s s LEFT OUTER JOIN port_s_p p USING (element, portno, sname) WHERE po.element=s.element AND po.portno=s.portno AND po.element LIKE 'lab-el1' ORDER BY po.element,po.portno,pri,s.sname; And got this query plan using EXPLAIN: Sort (cost=43690.55..43690.55 rows=26 width=157) -> Merge Join (cost=41757.20..43689.93 rows=26 width=157) -> IndexScan using idx_p_element on port po (cost=0.00..72.65 rows=22 width=67) -> Materialize (cost=42682.50..42682.50rows=370111 width=90) -> Merge Join (cost=41757.20..42682.50 rows=370111 width=90) -> Sort (cost=41755.93..41755.93 rows=370111 width=42) -> Seq Scanon port_s s (cost=0.00..7525.11 rows=370111 width=42) -> Sort (cost=1.27..1.27 rows=10 width=48) -> Seq Scan on port_s_p p (cost=0.00..1.10 rows=10 width=48) EXPLAIN What I don't understand is why the index port_s(element,portno) is not used here. If I changed the query to: SELECT po.portno,po.element,s.sname,pri,p.pname,value FROM port po INNER JOIN port_s s USING (element, portno) LEFT OUTER JOIN port_s_p p USING (element, portno, sname) WHERE po.element LIKE 'lab-el1' ORDER BY po.element,po.portno,pri,s.sname; I.e. using INNER JOIN instead of the WHERE case to join port and port_s. This query gave this plan: NOTICE: QUERY PLAN: Sort (cost=239.17..239.17 rows=26 width=157) -> Merge Join (cost=1.27..238.55 rows=26 width=157) -> Nested Loop (cost=0.00..237.19 rows=26 width=109) -> Index Scan using idx_p_element on port po (cost=0.00..72.65 rows=22width=67) -> Index Scan using idx_p_s_element_portno on port_s s (cost=0.00..7.38 rows=1 width=42) -> Sort (cost=1.27..1.27 rows=10 width=48) -> Seq Scan on port_s_p p (cost=0.00..1.10 rows=10 width=48) EXPLAIN Can someone explain why the index is used in the second query, but not in the first? Greetings, Tomas
On Tuesday 19 Nov 2002 9:10 am, Tomas Berndtsson wrote: > Hi, > > I have a question about index usage in PostgreSQL 7.2.1 on Solaris. > > At first, I did the query like this: > > SELECT po.portno,po.element,s.sname,pri,p.pname,value > FROM port po, port_s s LEFT OUTER JOIN port_s_p p > USING (element, portno, sname) WHERE po.element=s.element > AND po.portno=s.portno AND po.element LIKE 'lab-el1' > ORDER BY po.element,po.portno,pri,s.sname; > > And got this query plan using EXPLAIN: > > Sort (cost=43690.55..43690.55 rows=26 width=157) > -> Merge Join (cost=41757.20..43689.93 rows=26 width=157) > -> Index Scan using idx_p_element on port po (cost=0.00..72.65 > rows=22 width=67) -> Materialize (cost=42682.50..42682.50 rows=370111 > width=90) -> Merge Join (cost=41757.20..42682.50 rows=370111 width=90) -> > Sort (cost=41755.93..41755.93 rows=370111 width=42) -> Seq Scan on > port_s s (cost=0.00..7525.11 rows=370111 width=42) -> Sort > (cost=1.27..1.27 rows=10 width=48) > -> Seq Scan on port_s_p p (cost=0.00..1.10 > rows=10 width=48) > What I don't understand is why the index port_s(element,portno) > is not used here. Well - the estimate of 370111 rows shows where we have our problem. An explicit JOIN overrides the planner's normal order of operation. Looking at the plan, I think what's happening is that OUTER JOIN is being done first to "post_s s". Then it does your implicit join. > If I changed the query to: > > SELECT po.portno,po.element,s.sname,pri,p.pname,value > FROM port po INNER JOIN port_s s USING (element, portno) LEFT > OUTER JOIN port_s_p p USING (element, portno, sname) > WHERE po.element LIKE 'lab-el1' > ORDER BY po.element,po.portno,pri,s.sname; > > I.e. using INNER JOIN instead of the WHERE case to join port and > port_s. This query gave this plan: > > NOTICE: QUERY PLAN: > > Sort (cost=239.17..239.17 rows=26 width=157) > -> Merge Join (cost=1.27..238.55 rows=26 width=157) > -> Nested Loop (cost=0.00..237.19 rows=26 width=109) > -> Index Scan using idx_p_element on port po > (cost=0.00..72.65 rows=22 width=67) -> Index Scan using > idx_p_s_element_portno on port_s s (cost=0.00..7.38 rows=1 width=42) -> > Sort (cost=1.27..1.27 rows=10 width=48) > -> Seq Scan on port_s_p p (cost=0.00..1.10 rows=10 width=48) Here you only have one join in two parts "port po" - "port_s s" and then "port_s_p p". It's the overriding that's hitting you here. If you've got time could you rewrite it so that all the joins are implicit and let us know what that does to the plan? -- Richard Huxton