Thread: Question about LEFT JOIN and query plan
Hello, I have I query which behave strangely (according to me). According to the first plan PG makes absolutely unnecessary seq scan on tables "invoices" and "domeini" and etc. I thing they should be access only if there are rows from the where. Why the left join executes first? Then I rewrite the query and move left joins to sub queries and the result was great speed up. But I thing it is more correctly to write the query with left joins. At least the sub queries have similar parts which are now accessed twice. So I will appreciate any suggestions how it is correct to write this query and why the left join plan is so wrong. SELECT version(); version ---------------------------------------------------------------------------------------------------------- PostgreSQL 8.4.4 on amd64-portbld-freebsd8.1, compiled by GCC cc (GCC) 4.2.1 20070719 [FreeBSD], 64-bit (1 row) Best regards, Kaloyan Iliev ===============================ORIGINAL QUERY============================== explain analyze SELECT DD.debtid, ADD.amount as saldo, DOM.fqdn ||DT.descr as domain_fqdn, S.descr_bg as service_descr_bg, ADD.pno, ADD.amount, M.name_bg as measure_name_bg, AC.ino, I.idate FROM debts_desc DD LEFT JOIN domeini DOM ON (DD.domain_id = DOM.id) LEFT JOIN domain_type DT ON (DOM.domain_type_id = DT.id) LEFT JOIN acc_debts ADC ON (DD.debtid = ADC.debtid AND ADC.credit) LEFT JOIN acc_clients AC ON (AC.transact_no = ADC.transact_no AND NOT AC.credit) LEFT JOIN invoices I ON (AC.ino = I.ino AND I.istatus = 0), acc_debts ADD, services S, measures M, proforms P WHERE DD.debtid = ADD.debtid AND DD.measure_id = M.measure_id AND DD.active AND NOT DD.paid AND DD.has_proform AND NOT DD.storned AND ADD.pno = P.pno AND NOT ADD.credit AND P.person1_id = 287294 AND DD.serviceid = S.serviceid; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=37503.47..47243.77 rows=1 width=110) (actual time=1522.796..1522.796 rows=0 loops=1) Join Filter: (dd.measure_id = m.measure_id) -> Nested Loop (cost=37503.47..47242.45 rows=1 width=106) (actual time=1522.794..1522.794 rows=0 loops=1) Join Filter: (dd.serviceid = s.serviceid) -> Hash Join (cost=37503.47..47239.46 rows=1 width=79) (actual time=1522.791..1522.791 rows=0 loops=1) Hash Cond: (dd.debtid = add.debtid) -> Hash Left Join (cost=37475.95..47122.76 rows=23782 width=67) (actual time=1370.668..1521.629 rows=1037 loops=1) Hash Cond: (dom.domain_type_id = dt.id) -> Hash Left Join (cost=37474.12..46793.92 rows=23782 width=66) (actual time=1370.563..1519.302 rows=1037 loops=1) Hash Cond: (dd.domain_id = dom.id) -> Hash Left Join (cost=23487.71..30402.02 rows=23782 width=54) (actual time=556.587..636.320 rows=1037 loops=1) Hash Cond: (ac.ino = i.ino) -> Hash Left Join (cost=8410.66..14259.11 rows=23782 width=50) (actual time=318.180..387.026 rows=1037 loops=1) Hash Cond: (adc.transact_no = ac.transact_no) -> Hash Left Join (cost=4973.98..9903.69 rows=23782 width=50) (actual time=175.979..234.068 rows=1037 loops=1) Hash Cond: (dd.debtid = adc.debtid) -> Seq Scan on debts_desc dd (cost=0.00..2866.52 rows=23782 width=46) (actual time=0.481..45.085 rows=1037 loops=1) Filter: (active AND (NOT paid) AND has_proform AND (NOT storned)) -> Hash (cost=3942.08..3942.08 rows=62872 width=8) (actual time=175.410..175.410 rows=63157 loops=1) -> Seq Scan on acc_debts adc (cost=0.00..3942.08 rows=62872 width=8) (actual time=0.097..102.172 rows=63157 loops=1) Filter: credit -> Hash (cost=2536.53..2536.53 rows=54812 width=8) (actual time=142.169..142.169 rows=54559 loops=1) -> Seq Scan on acc_clients ac (cost=0.00..2536.53 rows=54812 width=8) (actual time=0.019..78.736 rows=54559 loops=1) Filter: (NOT credit) -> Hash (cost=14181.02..14181.02 rows=54562 width=8) (actual time=238.380..238.380 rows=54559 loops=1) -> Seq Scan on invoices i (cost=0.00..14181.02 rows=54562 width=8) (actual time=0.029..170.761 rows=54559 loops=1) Filter: (istatus = 0) -> Hash (cost=8669.96..8669.96 rows=305796 width=16) (actual time=813.940..813.940 rows=305796 loops=1) -> Seq Scan on domeini dom (cost=0.00..8669.96 rows=305796 width=16) (actual time=0.015..419.684 rows=305796 loops=1) -> Hash (cost=1.37..1.37 rows=37 width=9) (actual time=0.087..0.087 rows=37 loops=1) -> Seq Scan on domain_type dt (cost=0.00..1.37 rows=37 width=9) (actual time=0.003..0.040 rows=37 loops=1) -> Hash (cost=27.45..27.45 rows=5 width=16) (actual time=0.078..0.078 rows=1 loops=1) -> Nested Loop (cost=0.00..27.45 rows=5 width=16) (actual time=0.067..0.073 rows=1 loops=1) -> Index Scan using proforms_person1_id_idx on proforms p (cost=0.00..10.62 rows=2 width=4) (actual time=0.045..0.046 rows=1 loops=1) Index Cond: (person1_id = 287294) -> Index Scan using acc_debts_pno_idx on acc_debts add (cost=0.00..8.38 rows=3 width=16) (actual time=0.017..0.019 rows=1 loops=1) Index Cond: (add.pno = p.pno) Filter: (NOT add.credit) -> Seq Scan on services s (cost=0.00..2.44 rows=44 width=31) (never executed) -> Seq Scan on measures m (cost=0.00..1.14 rows=14 width=8) (never executed) Total runtime: 1523.525 ms (41 rows) ==================================================AFTER REWRITE============================================ explain analyze SELECT DD.debtid, ADD.amount as saldo, (SELECT DOM.fqdn ||DT.descr FROM domeini DOM, domain_type DT WHERE DOM.domain_type_id = DT.id AND DD.domain_id = DOM.id) as domain_fqdn, S.descr_bg as service_descr_bg, ADD.pno, ADD.amount, M.name_bg as measure_name_bg, (SELECT AC.ino FROM acc_debts ACD, acc_clients AC WHERE ACD.debtid = ADD.debtid AND ACD.credit AND AC.transact_no = ACD.transact_no AND NOT AC.credit) as ino, (SELECT I.idate FROM acc_debts ACD, acc_clients AC, invoices I WHERE ACD.debtid = ADD.debtid AND ACD.credit AND AC.transact_no = ACD.transact_no AND NOT AC.credit AND AC.ino = I.ino AND I.istatus = 0) as idate FROM debts_desc DD, acc_debts ADD, services S, measures M, proforms P WHERE DD.debtid = ADD.debtid AND DD.measure_id = M.measure_id AND DD.active AND NOT DD.paid AND DD.has_proform AND NOT DD.storned AND ADD.pno = P.pno AND NOT ADD.credit AND P.person1_id = 287294 AND DD.serviceid = S.serviceid; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..77.90 rows=1 width=93) (actual time=0.047..0.047 rows=0 loops=1) -> Nested Loop (cost=0.00..32.96 rows=1 width=66) (actual time=0.045..0.045 rows=0 loops=1) -> Nested Loop (cost=0.00..32.68 rows=1 width=62) (actual time=0.043..0.043 rows=0 loops=1) -> Nested Loop (cost=0.00..27.45 rows=5 width=16) (actual time=0.026..0.031 rows=1 loops=1) -> Index Scan using proforms_person1_id_idx on proforms p (cost=0.00..10.62 rows=2 width=4) (actual time=0.013..0.014 rows=1 loops=1) Index Cond: (person1_id = 287294) -> Index Scan using acc_debts_pno_idx on acc_debts add (cost=0.00..8.38 rows=3 width=16) (actual time=0.007..0.008 rows=1 loops=1) Index Cond: (add.pno = p.pno) Filter: (NOT add.credit) -> Index Scan using debts_desc_pkey on debts_desc dd (cost=0.00..1.03 rows=1 width=46) (actual time=0.007..0.007 rows=0 loops=1) Index Cond: (dd.debtid = add.debtid) Filter: (dd.active AND (NOT dd.paid) AND dd.has_proform AND (NOT dd.storned)) -> Index Scan using measures_pkey on measures m (cost=0.00..0.27 rows=1 width=8) (never executed) Index Cond: (m.measure_id = dd.measure_id) -> Index Scan using services_pkey on services s (cost=0.00..0.27 rows=1 width=31) (never executed) Index Cond: (s.serviceid = dd.serviceid) SubPlan 1 -> Hash Join (cost=8.31..9.84 rows=1 width=13) (never executed) Hash Cond: (dt.id = dom.domain_type_id) -> Seq Scan on domain_type dt (cost=0.00..1.37 rows=37 width=9) (never executed) -> Hash (cost=8.30..8.30 rows=1 width=12) (never executed) -> Index Scan using domeini_pkey on domeini dom (cost=0.00..8.30 rows=1 width=12) (never executed) Index Cond: ($0 = id) SubPlan 2 -> Nested Loop (cost=0.00..16.63 rows=1 width=4) (never executed) -> Index Scan using acc_debts_debtid_idx on acc_debts acd (cost=0.00..8.33 rows=1 width=4) (never executed) Index Cond: (debtid = $1) Filter: credit -> Index Scan using acc_clients_transact_no_uidx on acc_clients ac (cost=0.00..8.28 rows=1 width=8) (never executed) Index Cond: (ac.transact_no = acd.transact_no) Filter: (NOT ac.credit) SubPlan 3 -> Nested Loop (cost=0.00..18.19 rows=1 width=4) (never executed) -> Nested Loop (cost=0.00..16.63 rows=1 width=4) (never executed) -> Index Scan using acc_debts_debtid_idx on acc_debts acd (cost=0.00..8.33 rows=1 width=4) (never executed) Index Cond: (debtid = $1) Filter: credit -> Index Scan using acc_clients_transact_no_uidx on acc_clients ac (cost=0.00..8.28 rows=1 width=8) (never executed) Index Cond: (ac.transact_no = acd.transact_no) Filter: (NOT ac.credit) -> Index Scan using invoices_ino_uidx on invoices i (cost=0.00..1.55 rows=1 width=8) (never executed) Index Cond: (i.ino = ac.ino) Total runtime: 0.202 ms (43 rows)
Kaloyan Iliev Iliev <kaloyan@digsys.bg> writes: > I have I query which behave strangely (according to me). > According to the first plan PG makes absolutely unnecessary seq scan on > tables "invoices" and "domeini" and etc. I think you might get better results if you could get this rowcount estimate a bit more in line with reality: > -> Seq Scan on debts_desc dd (cost=0.00..2866.52 rows=23782 width=46) (actual time=0.481..45.085 rows=1037loops=1) > Filter: (active AND (NOT paid) AND has_proform AND (NOT storned)) It's choosing to hash instead of doing (what it thinks will be) 23K index probes into the other table. For 1000 probes the decision might be different. I don't know if raising the stats target for that table will be enough to fix it. Most likely those four conditions are not uncorrelated. You might need to think about revising the table's representation so that the query condition can be simpler and thus more accurately estimated. regards, tom lane
Kaloyan Iliev Iliev <kaloyan@digsys.bg> wrote: > I thing they should be access only if there are rows from the > where. Why the left join executes first? Out of curiosity, what happens if you consistently us JOIN clauses, rather than mixing that with commas?: explain analyze SELECT DD.debtid, ADD.amount as saldo, DOM.fqdn ||DT.descr as domain_fqdn, S.descr_bg as service_descr_bg, ADD.pno, ADD.amount, M.name_bg as measure_name_bg, AC.ino, I.idate FROM debts_desc DD JOIN proforms P ON (ADD.pno = P.pno) JOIN acc_debts ADD ON (DD.debtid = ADD.debtid) JOIN services S ON (DD.serviceid = S.serviceid) JOIN measures M ON (DD.measure_id = M.measure_id) LEFT JOIN domeini DOM ON (DD.domain_id = DOM.id) LEFT JOIN domain_type DT ON (DOM.domain_type_id = DT.id) LEFT JOIN acc_debts ADC ON (DD.debtid = ADC.debtid AND ADC.credit) LEFT JOIN acc_clients AC ON (AC.transact_no = ADC.transact_no AND NOT AC.credit) LEFT JOIN invoices I ON (AC.ino = I.ino AND I.istatus = 0) WHERE DD.active AND NOT DD.paid AND DD.has_proform AND NOT DD.storned AND NOT ADD.credit AND P.person1_id = 287294 ; -Kevin
<tt>Hi,<br /> The plan improves. So can you explain why?<br /> Thanks in advance.<br /><br /> Kaloyan<br /> QUERY PLAN <br /> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------<br /> Nested Loop Left Join (cost=0.00..82.88 rows=1 width=68) (actual time=92.455..92.455 rows=0 loops=1)<br /> -> Nested Loop Left Join (cost=0.00..77.73 rows=1 width=64) (actual time=92.453..92.453 rows=0 loops=1)<br /> -> Nested Loop Left Join (cost=0.00..69.44 rows=1 width=64) (actual time=92.451..92.451 rows=0 loops=1)<br /> -> Nested Loop (cost=0.00..64.26 rows=1 width=60) (actual time=92.449..92.449 rows=0 loops=1)<br /> Join Filter: (dd.measure_id = m.measure_id)<br /> -> Nested Loop (cost=0.00..62.95rows=1 width=60) (actual time=92.447..92.447 rows=0 loops=1)<br /> Join Filter:(dd.serviceid = s.serviceid)<br /> -> Nested Loop Left Join (cost=0.00..59.96 rows=1width=37) (actual time=92.444..92.444 rows=0 loops=1)<br /> Join Filter: (dom.domain_type_id= dt.id)<br /> -> Nested Loop Left Join (cost=0.00..58.13 rows=1width=36) (actual time=92.443..92.443 rows=0 loops=1)<br /> -> Nested Loop (cost=0.00..52.88 rows=1 width=28) (actual time=92.440..92.440 rows=0 loops=1)<br /> -> Nested Loop (cost=0.00..27.50 rows=5 width=16) (actual time=0.021..0.027rows=1 loops=1)<br /> -> Index Scan using proforms_person1_id_idxon proforms p (cost=0.00..10.67 rows=2 width=4) (actual time=0.008..0.009 rows=1 loops=1)<br /> Index Cond: (person1_id = 287294)<br /> -> Index Scan using acc_debts_pno_idx on acc_debts add (cost=0.00..8.38rows=3 width=16) (actual time=0.007..0.009 rows=1 loops=1)<br /> Index Cond: (add.pno = p.pno)<br /> Filter: (NOT add.credit)<br /> -> Index Scan using debts_desc_pkey on debts_desc dd (cost=0.00..5.06 rows=1width=16) (actual time=92.408..92.408 rows=0 loops=1)<br /> IndexCond: (dd.debtid = add.debtid)<br /> Filter: (dd.active AND (NOTdd.paid) AND dd.has_proform AND (NOT dd.storned))<br /> -> Index Scan usingdomeini_pkey on domeini dom (cost=0.00..5.24 rows=1 width=16) (never executed)<br /> Index Cond: (dd.domain_id = dom.id)<br /> -> Seq Scan on domain_type dt (cost=0.00..1.37 rows=37 width=9) (never executed)<br /> -> Seq Scan on services s (cost=0.00..2.44 rows=44 width=31) (never executed)<br /> -> Seq Scanon measures m (cost=0.00..1.14 rows=14 width=8) (never executed)<br /> -> Index Scan using acc_debts_debtid_idxon acc_debts adc (cost=0.00..5.16 rows=1 width=8) (never executed)<br /> IndexCond: (dd.debtid = adc.debtid)<br /> Filter: adc.credit<br /> -> Index Scan usingacc_clients_transact_no_uidx on acc_clients ac (cost=0.00..8.28 rows=1 width=8) (never executed)<br /> Index Cond: (ac.transact_no = adc.transact_no)<br /> Filter: (NOT ac.credit)<br /> -> Index Scan using invoices_ino_uidx on invoices i (cost=0.00..5.13 rows=1 width=8) (never executed)<br /> Index Cond: (ac.ino = i.ino)<br /> Total runtime: 92.612 ms<br /> (34 rows)<br /></tt><br /><br /> Kevin Grittnerwrote: <blockquote cite="mid:4C80E73402000025000351AF@gw.wicourts.gov" type="cite"><pre wrap="">Kaloyan Iliev Iliev<a class="moz-txt-link-rfc2396E" href="mailto:kaloyan@digsys.bg"><kaloyan@digsys.bg></a> wrote: </pre><blockquotetype="cite"><pre wrap="">I thing they should be access only if there are rows from the where. Why the left join executes first? </pre></blockquote><pre wrap=""> Out of curiosity, what happens if you consistently us JOIN clauses, rather than mixing that with commas?: explain analyze SELECT DD.debtid, ADD.amount as saldo, DOM.fqdn ||DT.descr as domain_fqdn, S.descr_bg as service_descr_bg, ADD.pno, ADD.amount, M.name_bg as measure_name_bg, AC.ino, I.idate FROM debts_desc DD JOIN proforms P ON (ADD.pno= P.pno) JOIN acc_debts ADD ON (DD.debtid = ADD.debtid) JOIN services S ON (DD.serviceid = S.serviceid) JOIN measuresM ON (DD.measure_id = M.measure_id) LEFT JOIN domeini DOM ON (DD.domain_id = DOM.id) LEFT JOIN domain_type DT ON(DOM.domain_type_id = DT.id) LEFT JOIN acc_debts ADC ON (DD.debtid = ADC.debtid AND ADC.credit) LEFT JOIN acc_clientsAC ON (AC.transact_no = ADC.transact_no AND NOT AC.credit) LEFT JOIN invoices I ON (AC.ino = I.ino AND I.istatus= 0) WHERE DD.active AND NOT DD.paid AND DD.has_proform AND NOT DD.storned AND NOT ADD.credit AND P.person1_id= 287294 ; -Kevin </pre></blockquote>
Hello again, I have another query which performance drops drastically after PG upgrade. I can not improve the plan no matter how hard I try. I try creating new indexes and rewrite the query with JOIN .. ON instead of commas but nothing happens. I will appreciate any suggestions. Best regards, Kaloyan Iliev ==========================VERSION 8.2.15=================================================== regbgrgr=# SELECT version(); version --------------------------------------------------------------------------------------------------- PostgreSQL 8.2.15 on amd64-portbld-freebsd7.2, compiled by GCC cc (GCC) 4.2.1 20070719 [FreeBSD] (1 row) regbgrgr=# explain analyze SELECT COUNT (D.id) as all_domains_count FROM domeini as D, domainperson as DP, person as P, request as R, domain_status as DS WHERE R.number = D.request_number AND D.domain_status_id = DS.id AND DS.is_removed = 0 AND D.id = DP.domain_id AND DP.dp_type_id = 1 AND DP.person1_id = P.id AND ( LOWER (P.bulstat) = LOWER ('999999999') OR LOWER (P.bulstat) = 'bg'||'999999999'); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=138.30..138.31 rows=1 width=4) (actual time=0.804..0.806 rows=1 loops=1) -> Nested Loop (cost=74.70..138.29 rows=5 width=4) (actual time=0.797..0.797 rows=0 loops=1) -> Nested Loop (cost=74.70..136.88 rows=5 width=8) (actual time=0.793..0.793 rows=0 loops=1) -> Nested Loop (cost=74.70..135.44 rows=5 width=12) (actual time=0.791..0.791 rows=0 loops=1) -> Hash Join (cost=74.70..122.42 rows=5 width=4) (actual time=0.787..0.787 rows=0 loops=1) Hash Cond: (dp.person1_id = p.id) -> Bitmap Heap Scan on domainperson dp (cost=19.91..65.81 rows=472 width=8) (actual time=0.088..0.088 rows=1 loops=1) Recheck Cond: (dp_type_id = 1) -> Bitmap Index Scan on domainperson_admin_person_uidx (cost=0.00..19.79 rows=472 width=0) (actual time=0.071..0.071 rows=474 loops=1) Index Cond: (dp_type_id = 1) -> Hash (cost=54.62..54.62 rows=14 width=4) (actual time=0.678..0.678 rows=0 loops=1) -> Seq Scan on person p (cost=0.00..54.62 rows=14 width=4) (actual time=0.675..0.675 rows=0 loops=1) Filter: ((lower(bulstat) = '999999999'::text) OR (lower(bulstat) = 'bg999999999'::text)) -> Index Scan using domeini_pkey on domeini d (cost=0.00..2.59 rows=1 width=12) (never executed) Index Cond: (d.id = dp.domain_id) -> Index Scan using domain_status_pkey on domain_status ds (cost=0.00..0.27 rows=1 width=4) (never executed) Index Cond: (d.domain_status_id = ds.id) Filter: (is_removed = 0) -> Index Scan using request_pkey on request r (cost=0.00..0.27 rows=1 width=4) (never executed) Index Cond: (r.number = d.request_number) Total runtime: 0.926 ms (21 rows) regbgrgr=# SHOW default_statistics_target ; default_statistics_target --------------------------- 10 (1 row) ==========================VERSION 8.4.4=================================================== regbgrgr=# select version (); version ---------------------------------------------------------------------------------------------------------- PostgreSQL 8.4.4 on amd64-portbld-freebsd8.1, compiled by GCC cc (GCC) 4.2.1 20070719 [FreeBSD], 64-bit (1 row) regbgrgr=# explain analyze SELECT COUNT (D.id) as all_domains_count FROM domeini as D, domainperson as DP, person as P, request as R, domain_status as DS WHERE R.number = D.request_number AND D.domain_status_id = DS.id AND DS.is_removed = 0 AND D.id = DP.domain_id AND DP.dp_type_id = 1 AND DP.person1_id = P.id AND ( LOWER (P.bulstat) = LOWER ('999999999') OR LOWER (P.bulstat) = 'bg'||'999999999'); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Aggregate (cost=61113.19..61113.20 rows=1 width=4) (actual time=6013.705..6013.706 rows=1 loops=1) -> Hash Join (cost=20859.23..61023.00 rows=36075 width=4) (actual time=4553.945..6013.098 rows=598 loops=1) Hash Cond: (d.request_number = r.number) -> Hash Join (cost=18796.01..57800.47 rows=36075 width=8) (actual time=4177.313..5646.153 rows=598 loops=1) Hash Cond: (d.domain_status_id = ds.id) -> Hash Join (cost=18778.40..57286.82 rows=36075 width=12) (actual time=4176.838..5643.637 rows=1357 loops=1) Hash Cond: (dp.domain_id = d.id) -> Hash Join (cost=4671.42..40710.39 rows=36080 width=4) (actual time=3210.201..4621.977 rows=1357 loops=1) Hash Cond: (dp.person1_id = p.id) -> Seq Scan on domainperson dp (cost=0.00..33976.29 rows=272302 width=8) (actual time=0.026..1128.230 rows=279008 loops=1) Filter: (dp_type_id = 1) -> Hash (cost=4634.39..4634.39 rows=2962 width=4) (actual time=3210.050..3210.050 rows=1263 loops=1) -> Bitmap Heap Scan on person p (cost=64.33..4634.39 rows=2962 width=4) (actual time=114.401..3206.440 rows=1263 loops=1) Recheck Cond: ((lower(bulstat) = '999999999'::text) OR (lower(bulstat) = 'bg999999999'::text)) -> BitmapOr (cost=64.33..64.33 rows=2969 width=0) (actual time=95.115..95.115 rows=0 loops=1) -> Bitmap Index Scan on person_bulstat_lower_idx (cost=0.00..31.43 rows=1485 width=0) (actual time=33.525..33.525 rows=1241 loops=1) Index Cond: (lower(bulstat) = '999999999'::text) -> Bitmap Index Scan on person_bulstat_lower_idx (cost=0.00..31.43 rows=1485 width=0) (actual time=61.584..61.584 rows=22 loops=1) Index Cond: (lower(bulstat) = 'bg999999999'::text) -> Hash (cost=8728.77..8728.77 rows=309377 width=12) (actual time=957.267..957.267 rows=309410 loops=1) -> Seq Scan on domeini d (cost=0.00..8728.77 rows=309377 width=12) (actual time=0.015..563.414 rows=309410 loops=1) -> Hash (cost=15.31..15.31 rows=184 width=4) (actual time=0.455..0.455 rows=184 loops=1) -> Seq Scan on domain_status ds (cost=0.00..15.31 rows=184 width=4) (actual time=0.009..0.252 rows=184 loops=1) Filter: (is_removed = 0) -> Hash (cost=1030.43..1030.43 rows=62943 width=4) (actual time=356.134..356.134 rows=62815 loops=1) -> Seq Scan on request r (cost=0.00..1030.43 rows=62943 width=4) (actual time=10.902..275.137 rows=62815 loops=1) Total runtime: 6014.029 ms (27 rows) regbgrgr=# show default_statistics_target ; default_statistics_target --------------------------- 100 (1 row)
Sorry for the spam. The 8.2.15 plan was on an empty database. On a full database the plan was almost the same. So the question is could I speed up the plan? Why the "Hash Cond: (dp.person1_id = p.id)" isn't used for index scan on that table? Best regards, Kaloya Iliev Here is the plan on a full database: ==========================VERSION 8.2.17=================================================== version --------------------------------------------------------------------------------------------------- PostgreSQL 8.2.17 on amd64-portbld-freebsd8.0, compiled by GCC cc (GCC) 4.2.1 20070719 [FreeBSD] (1 row) regbgrgr=# SHOW default_statistics_target ; default_statistics_target --------------------------- 10 (1 row) regbgrgr=# explain analyze SELECT COUNT (D.id) as all_domains_count FROM domeini as D, domainperson as DP, person as P, request as R, domain_status as DS WHERE R.number = D.request_number AND D.domain_status_id = DS.id AND DS.is_removed = 0 AND D.id = DP.domain_id AND DP.dp_type_id = 1 AND DP.person1_id = P.id AND ( LOWER (P.bulstat) = LOWER ('999999999') OR LOWER (P.bulstat) = 'bg'||'999999999'); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=48342.78..48342.79 rows=1 width=4) (actual time=2429.190..2429.192 rows=1 loops=1) -> Hash Join (cost=5142.26..48339.54 rows=1295 width=4) (actual time=314.817..2427.752 rows=570 loops=1) Hash Cond: (d.request_number = r.number) -> Hash Join (cost=3088.49..45960.01 rows=1308 width=8) (actual time=37.001..2125.040 rows=570 loops=1) Hash Cond: (d.domain_status_id = ds.id) -> Nested Loop (cost=3064.88..45918.37 rows=1316 width=12) (actual time=35.584..2117.332 rows=1250 loops=1) -> Hash Join (cost=3064.88..40159.12 rows=1316 width=4) (actual time=35.506..2043.384 rows=1250 loops=1) Hash Cond: (dp.person1_id = p.id) -> Seq Scan on domainperson dp (cost=0.00..36010.68 rows=285441 width=8) (actual time=0.069..1459.818 rows=274533 loops=1) Filter: (dp_type_id = 1) -> Hash (cost=3048.93..3048.93 rows=1276 width=4) (actual time=35.206..35.206 rows=1157 loops=1) -> Bitmap Heap Scan on person p (cost=30.78..3048.93 rows=1276 width=4) (actual time=1.187..31.170 rows=1157 loops=1) Recheck Cond: ((lower(bulstat) = '999999999'::text) OR (lower(bulstat) = 'bg999999999'::text)) -> BitmapOr (cost=30.78..30.78 rows=1276 width=0) (actual time=0.841..0.841 rows=0 loops=1) -> Bitmap Index Scan on person_bulstat_lower_idx (cost=0.00..25.28 rows=1199 width=0) (actual time=0.709..0.709 rows=1135 loops=1) Index Cond: (lower(bulstat) = '999999999'::text) -> Bitmap Index Scan on person_bulstat_lower_idx (cost=0.00..4.86 rows=77 width=0) (actual time=0.124..0.124 rows=22 loops=1) Index Cond: (lower(bulstat) = 'bg999999999'::text) -> Index Scan using domeini_pkey on domeini d (cost=0.00..4.36 rows=1 width=12) (actual time=0.043..0.046 rows=1 loops=1250) Index Cond: (d.id = dp.domain_id) -> Hash (cost=21.31..21.31 rows=184 width=4) (actual time=1.380..1.380 rows=184 loops=1) -> Seq Scan on domain_status ds (cost=0.00..21.31 rows=184 width=4) (actual time=0.316..0.942 rows=184 loops=1) Filter: (is_removed = 0) -> Hash (cost=1026.01..1026.01 rows=59101 width=4) (actual time=277.161..277.161 rows=59027 loops=1) -> Seq Scan on request r (cost=0.00..1026.01 rows=59101 width=4) (actual time=0.075..131.951 rows=59027 loops=1) Total runtime: 2429.603 ms (26 rows) Kaloyan Iliev Iliev wrote: > Hello again, > I have another query which performance drops drastically after PG > upgrade. > I can not improve the plan no matter how hard I try. I try creating > new indexes and rewrite the query with JOIN .. ON instead of commas > but nothing happens. > I will appreciate any suggestions. > Best regards, > Kaloyan Iliev > > ==========================VERSION > 8.2.15=================================================== > > > regbgrgr=# SELECT version(); > > version > --------------------------------------------------------------------------------------------------- > > PostgreSQL 8.2.15 on amd64-portbld-freebsd7.2, compiled by GCC cc > (GCC) 4.2.1 20070719 [FreeBSD] > (1 row) > > regbgrgr=# explain analyze SELECT > > COUNT (D.id) as all_domains_count > FROM > > domeini as D, > > domainperson as DP, > > person as P, > > request as R, > > domain_status as DS > WHERE > > R.number = D.request_number AND > > D.domain_status_id = DS.id AND > > DS.is_removed = 0 AND > > D.id = DP.domain_id AND > > DP.dp_type_id = 1 AND > > DP.person1_id = P.id AND ( LOWER (P.bulstat) = LOWER ('999999999') OR > LOWER (P.bulstat) = 'bg'||'999999999'); > > QUERY > PLAN > > -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Aggregate (cost=138.30..138.31 rows=1 width=4) (actual > time=0.804..0.806 rows=1 loops=1) > -> Nested Loop (cost=74.70..138.29 rows=5 width=4) (actual > time=0.797..0.797 rows=0 loops=1) > -> Nested Loop (cost=74.70..136.88 rows=5 width=8) (actual > time=0.793..0.793 rows=0 loops=1) > -> Nested Loop (cost=74.70..135.44 rows=5 width=12) > (actual time=0.791..0.791 rows=0 loops=1) > -> Hash Join (cost=74.70..122.42 rows=5 width=4) > (actual time=0.787..0.787 rows=0 loops=1) > Hash Cond: (dp.person1_id = p.id) > -> Bitmap Heap Scan on domainperson dp > (cost=19.91..65.81 rows=472 width=8) (actual time=0.088..0.088 rows=1 > loops=1) > Recheck Cond: (dp_type_id = 1) > -> Bitmap Index Scan on > domainperson_admin_person_uidx (cost=0.00..19.79 rows=472 width=0) > (actual time=0.071..0.071 rows=474 loops=1) > Index Cond: (dp_type_id = 1) > -> Hash (cost=54.62..54.62 rows=14 > width=4) (actual time=0.678..0.678 rows=0 loops=1) > -> Seq Scan on person p > (cost=0.00..54.62 rows=14 width=4) (actual time=0.675..0.675 rows=0 > loops=1) > Filter: ((lower(bulstat) = > '999999999'::text) OR (lower(bulstat) = 'bg999999999'::text)) > -> Index Scan using domeini_pkey on domeini d > (cost=0.00..2.59 rows=1 width=12) (never executed) > Index Cond: (d.id = dp.domain_id) > -> Index Scan using domain_status_pkey on domain_status > ds (cost=0.00..0.27 rows=1 width=4) (never executed) > Index Cond: (d.domain_status_id = ds.id) > Filter: (is_removed = 0) > -> Index Scan using request_pkey on request r > (cost=0.00..0.27 rows=1 width=4) (never executed) > Index Cond: (r.number = d.request_number) > Total runtime: 0.926 ms > (21 rows) > > regbgrgr=# SHOW default_statistics_target ; > default_statistics_target > --------------------------- > 10 > (1 row) > > > ==========================VERSION > 8.4.4=================================================== > regbgrgr=# select version (); > > version > ---------------------------------------------------------------------------------------------------------- > > PostgreSQL 8.4.4 on amd64-portbld-freebsd8.1, compiled by GCC cc (GCC) > 4.2.1 20070719 [FreeBSD], 64-bit > (1 row) > > regbgrgr=# explain analyze SELECT > > COUNT (D.id) as all_domains_count > FROM > > domeini as D, > > domainperson as DP, > > person as P, > > request as R, > > domain_status as DS > WHERE > > R.number = D.request_number AND > > D.domain_status_id = DS.id AND > > DS.is_removed = 0 AND > > D.id = DP.domain_id AND > > DP.dp_type_id = 1 AND > > DP.person1_id = P.id AND ( LOWER (P.bulstat) = LOWER ('999999999') OR > LOWER (P.bulstat) = 'bg'||'999999999'); > > QUERY > PLAN > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ > > Aggregate (cost=61113.19..61113.20 rows=1 width=4) (actual > time=6013.705..6013.706 rows=1 loops=1) > -> Hash Join (cost=20859.23..61023.00 rows=36075 width=4) (actual > time=4553.945..6013.098 rows=598 loops=1) > Hash Cond: (d.request_number = r.number) > -> Hash Join (cost=18796.01..57800.47 rows=36075 width=8) > (actual time=4177.313..5646.153 rows=598 loops=1) > Hash Cond: (d.domain_status_id = ds.id) > -> Hash Join (cost=18778.40..57286.82 rows=36075 > width=12) (actual time=4176.838..5643.637 rows=1357 loops=1) > Hash Cond: (dp.domain_id = d.id) > -> Hash Join (cost=4671.42..40710.39 rows=36080 > width=4) (actual time=3210.201..4621.977 rows=1357 loops=1) > Hash Cond: (dp.person1_id = p.id) > -> Seq Scan on domainperson dp > (cost=0.00..33976.29 rows=272302 width=8) (actual time=0.026..1128.230 > rows=279008 loops=1) > Filter: (dp_type_id = 1) > -> Hash (cost=4634.39..4634.39 rows=2962 > width=4) (actual time=3210.050..3210.050 rows=1263 loops=1) > -> Bitmap Heap Scan on person p > (cost=64.33..4634.39 rows=2962 width=4) (actual time=114.401..3206.440 > rows=1263 loops=1) > Recheck Cond: ((lower(bulstat) = > '999999999'::text) OR (lower(bulstat) = 'bg999999999'::text)) > -> BitmapOr (cost=64.33..64.33 > rows=2969 width=0) (actual time=95.115..95.115 rows=0 loops=1) > -> Bitmap Index Scan on > person_bulstat_lower_idx (cost=0.00..31.43 rows=1485 width=0) (actual > time=33.525..33.525 rows=1241 loops=1) > Index Cond: > (lower(bulstat) = '999999999'::text) > -> Bitmap Index Scan on > person_bulstat_lower_idx (cost=0.00..31.43 rows=1485 width=0) (actual > time=61.584..61.584 rows=22 loops=1) > Index Cond: > (lower(bulstat) = 'bg999999999'::text) > -> Hash (cost=8728.77..8728.77 rows=309377 > width=12) (actual time=957.267..957.267 rows=309410 loops=1) > -> Seq Scan on domeini d > (cost=0.00..8728.77 rows=309377 width=12) (actual time=0.015..563.414 > rows=309410 loops=1) > -> Hash (cost=15.31..15.31 rows=184 width=4) (actual > time=0.455..0.455 rows=184 loops=1) > -> Seq Scan on domain_status ds > (cost=0.00..15.31 rows=184 width=4) (actual time=0.009..0.252 rows=184 > loops=1) > Filter: (is_removed = 0) > -> Hash (cost=1030.43..1030.43 rows=62943 width=4) (actual > time=356.134..356.134 rows=62815 loops=1) > -> Seq Scan on request r (cost=0.00..1030.43 > rows=62943 width=4) (actual time=10.902..275.137 rows=62815 loops=1) > Total runtime: 6014.029 ms > (27 rows) > > regbgrgr=# show default_statistics_target ; > default_statistics_target > --------------------------- > 100 > (1 row) > >
Kaloyan Iliev Iliev <kaloyan@digsys.bg> wrote: > The 8.2.15 plan was on an empty database. > On a full database the plan was almost the same. So the question > is could I speed up the plan? Since this is an entirely new query which doesn't include a LEFT JOIN, it's not good to just tack it onto the other thread. Could you please re-post with an appropriate subject and a little more information?: http://wiki.postgresql.org/wiki/SlowQueryQuestions Schema (including indexes), configuration settings, and hardware (CPUs, storage system, and RAM) can be particularly significant. Also, if you could *attach* the EXPLAIN ANALYZE output instead of pasting it within the email, you'd save time for those trying to read it -- emails tend to get word-wrapped in a way which makes them hard to read without manual reformatting. Thanks, -Kevin
>Kaloyan Iliev Iliev <kaloyan@digsys.bg> wrote: > Kevin Grittner wrote: >> Out of curiosity, what happens if you consistently use JOIN >> clauses, rather than mixing that with commas?: > The plan improves. So can you explain why? Commas in a FROM clause bind more loosely than JOIN clauses, rearrangement from one side of an outer join to the other is a bit tricky, and the *_collapse_limit settings (which you have not shown) can affect how much JOIN rearrangement is done for a complex query. On a quick scan over your query it didn't appear that the rearrangement would break anything, so I wondered whether the planner might do better if you made its job a bit easier by putting the inner joins all on the left to start with and putting the tables closer to the order of efficient access. If you still see this difference with very high collapse limits, your example might be a good one to support further work on the optimizer; but it would be more useful for that if you could create a synthetic case to demonstrate the problem -- starting with creation of tables, data, and indexes on which the different forms of the query yielded different plans. -Kevin