Thread: Question about LEFT JOIN and query plan

Question about LEFT JOIN and query plan

From
Kaloyan Iliev Iliev
Date:
Hello,
I have I query which behave strangely (according to me).
According to the first plan PG makes absolutely unnecessary seq scan on
tables "invoices" and "domeini" and etc.
I thing they should be access only if there are rows from the where. Why
the left join executes first?
Then I rewrite the query and move left joins to sub queries and the
result was great speed up.
But I thing it is more correctly to write the query with left joins. At
least the sub queries have similar parts which are now accessed twice.

So I will appreciate any suggestions how it is correct to write this
query and why the left join plan is so wrong.

 SELECT version();

version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 8.4.4 on amd64-portbld-freebsd8.1, compiled by GCC cc (GCC)
4.2.1 20070719  [FreeBSD], 64-bit
(1 row)


Best regards,
  Kaloyan Iliev


===============================ORIGINAL QUERY==============================
explain analyze SELECT
                                                        DD.debtid,
                                                        ADD.amount as saldo,
                                    DOM.fqdn ||DT.descr as domain_fqdn,
                                    S.descr_bg as service_descr_bg,
                                    ADD.pno,
                                    ADD.amount,
                                    M.name_bg as measure_name_bg,
                                    AC.ino,
                                    I.idate
                            FROM debts_desc DD LEFT JOIN domeini DOM ON
(DD.domain_id = DOM.id)
                                                                LEFT
JOIN domain_type DT ON (DOM.domain_type_id = DT.id)
                                                   LEFT JOIN acc_debts
ADC ON (DD.debtid = ADC.debtid AND ADC.credit)
                                                         LEFT JOIN
acc_clients AC ON (AC.transact_no = ADC.transact_no AND NOT AC.credit)
                                                           LEFT JOIN
invoices I ON (AC.ino = I.ino AND I.istatus = 0),
                                 acc_debts ADD,
                                 services S,
                                 measures M,
                                 proforms P
                            WHERE DD.debtid = ADD.debtid
                                  AND DD.measure_id = M.measure_id
                                  AND DD.active
                                  AND NOT DD.paid
                                  AND DD.has_proform
                                  AND NOT DD.storned
                                                          AND ADD.pno =
P.pno
                                                          AND NOT ADD.credit

                                                          AND
P.person1_id = 287294
                                                          AND
DD.serviceid = S.serviceid;


QUERY
PLAN


--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=37503.47..47243.77 rows=1 width=110) (actual
time=1522.796..1522.796 rows=0 loops=1)
   Join Filter: (dd.measure_id = m.measure_id)
   ->  Nested Loop  (cost=37503.47..47242.45 rows=1 width=106) (actual
time=1522.794..1522.794 rows=0 loops=1)
         Join Filter: (dd.serviceid = s.serviceid)
         ->  Hash Join  (cost=37503.47..47239.46 rows=1 width=79)
(actual time=1522.791..1522.791 rows=0 loops=1)
               Hash Cond: (dd.debtid = add.debtid)
               ->  Hash Left Join  (cost=37475.95..47122.76 rows=23782
width=67) (actual time=1370.668..1521.629 rows=1037 loops=1)
                     Hash Cond: (dom.domain_type_id = dt.id)
                     ->  Hash Left Join  (cost=37474.12..46793.92
rows=23782 width=66) (actual time=1370.563..1519.302 rows=1037 loops=1)
                           Hash Cond: (dd.domain_id = dom.id)
                           ->  Hash Left Join  (cost=23487.71..30402.02
rows=23782 width=54) (actual time=556.587..636.320 rows=1037 loops=1)
                                 Hash Cond: (ac.ino = i.ino)
                                 ->  Hash Left Join
(cost=8410.66..14259.11 rows=23782 width=50) (actual
time=318.180..387.026 rows=1037 loops=1)
                                       Hash Cond: (adc.transact_no =
ac.transact_no)
                                       ->  Hash Left Join
(cost=4973.98..9903.69 rows=23782 width=50) (actual
time=175.979..234.068 rows=1037 loops=1)
                                             Hash Cond: (dd.debtid =
adc.debtid)
                                             ->  Seq Scan on debts_desc
dd  (cost=0.00..2866.52 rows=23782 width=46) (actual time=0.481..45.085
rows=1037 loops=1)
                                                   Filter: (active AND
(NOT paid) AND has_proform AND (NOT storned))
                                             ->  Hash
(cost=3942.08..3942.08 rows=62872 width=8) (actual time=175.410..175.410
rows=63157 loops=1)
                                                   ->  Seq Scan on
acc_debts adc  (cost=0.00..3942.08 rows=62872 width=8) (actual
time=0.097..102.172 rows=63157 loops=1)
                                                         Filter: credit
                                       ->  Hash  (cost=2536.53..2536.53
rows=54812 width=8) (actual time=142.169..142.169 rows=54559 loops=1)
                                             ->  Seq Scan on acc_clients
ac  (cost=0.00..2536.53 rows=54812 width=8) (actual time=0.019..78.736
rows=54559 loops=1)
                                                   Filter: (NOT credit)
                                 ->  Hash  (cost=14181.02..14181.02
rows=54562 width=8) (actual time=238.380..238.380 rows=54559 loops=1)
                                       ->  Seq Scan on invoices i
(cost=0.00..14181.02 rows=54562 width=8) (actual time=0.029..170.761
rows=54559 loops=1)
                                             Filter: (istatus = 0)
                           ->  Hash  (cost=8669.96..8669.96 rows=305796
width=16) (actual time=813.940..813.940 rows=305796 loops=1)
                                 ->  Seq Scan on domeini dom
(cost=0.00..8669.96 rows=305796 width=16) (actual time=0.015..419.684
rows=305796 loops=1)
                     ->  Hash  (cost=1.37..1.37 rows=37 width=9) (actual
time=0.087..0.087 rows=37 loops=1)
                           ->  Seq Scan on domain_type dt
(cost=0.00..1.37 rows=37 width=9) (actual time=0.003..0.040 rows=37 loops=1)
               ->  Hash  (cost=27.45..27.45 rows=5 width=16) (actual
time=0.078..0.078 rows=1 loops=1)
                     ->  Nested Loop  (cost=0.00..27.45 rows=5 width=16)
(actual time=0.067..0.073 rows=1 loops=1)
                           ->  Index Scan using proforms_person1_id_idx
on proforms p  (cost=0.00..10.62 rows=2 width=4) (actual
time=0.045..0.046 rows=1 loops=1)
                                 Index Cond: (person1_id = 287294)
                           ->  Index Scan using acc_debts_pno_idx on
acc_debts add  (cost=0.00..8.38 rows=3 width=16) (actual
time=0.017..0.019 rows=1 loops=1)
                                 Index Cond: (add.pno = p.pno)
                                 Filter: (NOT add.credit)
         ->  Seq Scan on services s  (cost=0.00..2.44 rows=44 width=31)
(never executed)
   ->  Seq Scan on measures m  (cost=0.00..1.14 rows=14 width=8) (never
executed)
 Total runtime: 1523.525 ms
(41 rows)



==================================================AFTER
REWRITE============================================

explain analyze SELECT
                                DD.debtid,
                                ADD.amount as saldo,
                            (SELECT DOM.fqdn ||DT.descr
                             FROM domeini DOM, domain_type DT
                             WHERE DOM.domain_type_id = DT.id
                                   AND DD.domain_id = DOM.id) as
domain_fqdn,
                            S.descr_bg as service_descr_bg,
                            ADD.pno,
                            ADD.amount,
                            M.name_bg as measure_name_bg,
                            (SELECT AC.ino FROM acc_debts ACD,
                                             acc_clients AC
                                         WHERE ACD.debtid = ADD.debtid
                                              AND ACD.credit
                                              AND AC.transact_no =
ACD.transact_no
                                              AND NOT AC.credit) as ino,
                            (SELECT I.idate FROM acc_debts ACD,
                                             acc_clients AC,
                                             invoices I
                                          WHERE  ACD.debtid = ADD.debtid
                                                AND ACD.credit
                                                AND AC.transact_no =
ACD.transact_no
                                                 AND NOT AC.credit
                                                 AND AC.ino = I.ino
                                                 AND I.istatus = 0) as idate
                         FROM debts_desc DD,
                              acc_debts ADD,
                              services S,
                              measures M,
                              proforms P
                         WHERE DD.debtid = ADD.debtid
                               AND DD.measure_id = M.measure_id
                               AND DD.active
                               AND NOT DD.paid
                               AND DD.has_proform
                               AND NOT DD.storned
                                  AND ADD.pno = P.pno
                                  AND NOT ADD.credit
                                  AND P.person1_id = 287294
                                  AND DD.serviceid = S.serviceid;


QUERY
PLAN


-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..77.90 rows=1 width=93) (actual
time=0.047..0.047 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..32.96 rows=1 width=66) (actual
time=0.045..0.045 rows=0 loops=1)
         ->  Nested Loop  (cost=0.00..32.68 rows=1 width=62) (actual
time=0.043..0.043 rows=0 loops=1)
               ->  Nested Loop  (cost=0.00..27.45 rows=5 width=16)
(actual time=0.026..0.031 rows=1 loops=1)
                     ->  Index Scan using proforms_person1_id_idx on
proforms p  (cost=0.00..10.62 rows=2 width=4) (actual time=0.013..0.014
rows=1 loops=1)
                           Index Cond: (person1_id = 287294)
                     ->  Index Scan using acc_debts_pno_idx on acc_debts
add  (cost=0.00..8.38 rows=3 width=16) (actual time=0.007..0.008 rows=1
loops=1)
                           Index Cond: (add.pno = p.pno)
                           Filter: (NOT add.credit)
               ->  Index Scan using debts_desc_pkey on debts_desc dd
(cost=0.00..1.03 rows=1 width=46) (actual time=0.007..0.007 rows=0 loops=1)
                     Index Cond: (dd.debtid = add.debtid)
                     Filter: (dd.active AND (NOT dd.paid) AND
dd.has_proform AND (NOT dd.storned))
         ->  Index Scan using measures_pkey on measures m
(cost=0.00..0.27 rows=1 width=8) (never executed)
               Index Cond: (m.measure_id = dd.measure_id)
   ->  Index Scan using services_pkey on services s  (cost=0.00..0.27
rows=1 width=31) (never executed)
         Index Cond: (s.serviceid = dd.serviceid)
   SubPlan 1
     ->  Hash Join  (cost=8.31..9.84 rows=1 width=13) (never executed)
           Hash Cond: (dt.id = dom.domain_type_id)
           ->  Seq Scan on domain_type dt  (cost=0.00..1.37 rows=37
width=9) (never executed)
           ->  Hash  (cost=8.30..8.30 rows=1 width=12) (never executed)
                 ->  Index Scan using domeini_pkey on domeini dom
(cost=0.00..8.30 rows=1 width=12) (never executed)
                       Index Cond: ($0 = id)
   SubPlan 2
     ->  Nested Loop  (cost=0.00..16.63 rows=1 width=4) (never executed)
           ->  Index Scan using acc_debts_debtid_idx on acc_debts acd
(cost=0.00..8.33 rows=1 width=4) (never executed)
                 Index Cond: (debtid = $1)
                 Filter: credit
           ->  Index Scan using acc_clients_transact_no_uidx on
acc_clients ac  (cost=0.00..8.28 rows=1 width=8) (never executed)
                 Index Cond: (ac.transact_no = acd.transact_no)
                 Filter: (NOT ac.credit)
   SubPlan 3
     ->  Nested Loop  (cost=0.00..18.19 rows=1 width=4) (never executed)
           ->  Nested Loop  (cost=0.00..16.63 rows=1 width=4) (never
executed)
                 ->  Index Scan using acc_debts_debtid_idx on acc_debts
acd  (cost=0.00..8.33 rows=1 width=4) (never executed)
                       Index Cond: (debtid = $1)
                       Filter: credit
                 ->  Index Scan using acc_clients_transact_no_uidx on
acc_clients ac  (cost=0.00..8.28 rows=1 width=8) (never executed)
                       Index Cond: (ac.transact_no = acd.transact_no)
                       Filter: (NOT ac.credit)
           ->  Index Scan using invoices_ino_uidx on invoices i
(cost=0.00..1.55 rows=1 width=8) (never executed)
                 Index Cond: (i.ino = ac.ino)
 Total runtime: 0.202 ms
(43 rows)




Re: Question about LEFT JOIN and query plan

From
Tom Lane
Date:
Kaloyan Iliev Iliev <kaloyan@digsys.bg> writes:
> I have I query which behave strangely (according to me).
> According to the first plan PG makes absolutely unnecessary seq scan on
> tables "invoices" and "domeini" and etc.

I think you might get better results if you could get this rowcount
estimate a bit more in line with reality:

>              ->  Seq Scan on debts_desc dd  (cost=0.00..2866.52 rows=23782 width=46) (actual time=0.481..45.085
rows=1037loops=1) 
>                    Filter: (active AND (NOT paid) AND has_proform AND (NOT storned))

It's choosing to hash instead of doing (what it thinks will be) 23K
index probes into the other table.  For 1000 probes the decision
might be different.

I don't know if raising the stats target for that table will be enough
to fix it.  Most likely those four conditions are not uncorrelated.
You might need to think about revising the table's representation
so that the query condition can be simpler and thus more accurately
estimated.

            regards, tom lane

Re: Question about LEFT JOIN and query plan

From
"Kevin Grittner"
Date:
Kaloyan Iliev Iliev <kaloyan@digsys.bg> wrote:

> I thing they should be access only if there are rows from the
> where. Why the left join executes first?

Out of curiosity, what happens if you consistently us JOIN clauses,
rather than mixing that with commas?:

explain analyze
SELECT
    DD.debtid,
    ADD.amount as saldo,
    DOM.fqdn ||DT.descr as domain_fqdn,
    S.descr_bg as service_descr_bg,
    ADD.pno,
    ADD.amount,
    M.name_bg as measure_name_bg,
    AC.ino,
    I.idate
  FROM debts_desc DD
  JOIN proforms P ON (ADD.pno = P.pno)
  JOIN acc_debts ADD ON (DD.debtid = ADD.debtid)
  JOIN services S ON (DD.serviceid = S.serviceid)
  JOIN measures M ON (DD.measure_id = M.measure_id)
  LEFT JOIN domeini DOM ON (DD.domain_id = DOM.id)
  LEFT JOIN domain_type DT ON (DOM.domain_type_id = DT.id)
  LEFT JOIN acc_debts ADC
    ON (DD.debtid = ADC.debtid AND ADC.credit)
  LEFT JOIN acc_clients AC
    ON (AC.transact_no = ADC.transact_no AND NOT AC.credit)
  LEFT JOIN invoices I ON (AC.ino = I.ino AND I.istatus = 0)
  WHERE DD.active
    AND NOT DD.paid
    AND DD.has_proform
    AND NOT DD.storned
    AND NOT ADD.credit
    AND P.person1_id = 287294
;

-Kevin

Re: Question about LEFT JOIN and query plan

From
Kaloyan Iliev Iliev
Date:
<tt>Hi,<br /> The plan improves.  So can you explain why?<br /> Thanks in advance.<br /><br /> Kaloyan<br />
                                                                                       QUERY
PLAN                                                                                        <br />
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------<br
/> Nested Loop Left Join  (cost=0.00..82.88 rows=1 width=68) (actual time=92.455..92.455 rows=0 loops=1)<br />   
-> Nested Loop Left Join  (cost=0.00..77.73 rows=1 width=64) (actual time=92.453..92.453 rows=0 loops=1)<br />
        ->  Nested Loop Left Join  (cost=0.00..69.44 rows=1 width=64) (actual time=92.451..92.451 rows=0 loops=1)<br
/>               ->  Nested Loop  (cost=0.00..64.26 rows=1 width=60) (actual time=92.449..92.449 rows=0 loops=1)<br
/>                     Join Filter: (dd.measure_id = m.measure_id)<br />                      ->  Nested Loop 
(cost=0.00..62.95rows=1 width=60) (actual time=92.447..92.447 rows=0 loops=1)<br />                            Join
Filter:(dd.serviceid = s.serviceid)<br />                            ->  Nested Loop Left Join  (cost=0.00..59.96
rows=1width=37) (actual time=92.444..92.444 rows=0 loops=1)<br />                                  Join Filter:
(dom.domain_type_id= dt.id)<br />                                  ->  Nested Loop Left Join  (cost=0.00..58.13
rows=1width=36) (actual time=92.443..92.443 rows=0 loops=1)<br />                                        ->  Nested
Loop (cost=0.00..52.88 rows=1 width=28) (actual time=92.440..92.440 rows=0 loops=1)<br />
                                            ->  Nested Loop  (cost=0.00..27.50 rows=5 width=16) (actual
time=0.021..0.027rows=1 loops=1)<br />                                                    ->  Index Scan using
proforms_person1_id_idxon proforms p  (cost=0.00..10.67 rows=2 width=4) (actual time=0.008..0.009 rows=1 loops=1)<br />
                                                        Index Cond: (person1_id = 287294)<br />
                                                  ->  Index Scan using acc_debts_pno_idx on acc_debts add 
(cost=0.00..8.38rows=3 width=16) (actual time=0.007..0.009 rows=1 loops=1)<br />
                                                        Index Cond: (add.pno = p.pno)<br />
                                                        Filter: (NOT add.credit)<br />
                                            ->  Index Scan using debts_desc_pkey on debts_desc dd  (cost=0.00..5.06
rows=1width=16) (actual time=92.408..92.408 rows=0 loops=1)<br />                                                   
IndexCond: (dd.debtid = add.debtid)<br />                                                    Filter: (dd.active AND
(NOTdd.paid) AND dd.has_proform AND (NOT dd.storned))<br />                                        ->  Index Scan
usingdomeini_pkey on domeini dom  (cost=0.00..5.24 rows=1 width=16) (never executed)<br />
                                            Index Cond: (dd.domain_id = dom.id)<br />                                 
-> Seq Scan on domain_type dt  (cost=0.00..1.37 rows=37 width=9) (never executed)<br />                           
-> Seq Scan on services s  (cost=0.00..2.44 rows=44 width=31) (never executed)<br />                      ->  Seq
Scanon measures m  (cost=0.00..1.14 rows=14 width=8) (never executed)<br />                ->  Index Scan using
acc_debts_debtid_idxon acc_debts adc  (cost=0.00..5.16 rows=1 width=8) (never executed)<br />                     
IndexCond: (dd.debtid = adc.debtid)<br />                      Filter: adc.credit<br />          ->  Index Scan
usingacc_clients_transact_no_uidx on acc_clients ac  (cost=0.00..8.28 rows=1 width=8) (never executed)<br />
              Index Cond: (ac.transact_no = adc.transact_no)<br />                Filter: (NOT ac.credit)<br />   
-> Index Scan using invoices_ino_uidx on invoices i  (cost=0.00..5.13 rows=1 width=8) (never executed)<br />
        Index Cond: (ac.ino = i.ino)<br />  Total runtime: 92.612 ms<br /> (34 rows)<br /></tt><br /><br /> Kevin
Grittnerwrote: <blockquote cite="mid:4C80E73402000025000351AF@gw.wicourts.gov" type="cite"><pre wrap="">Kaloyan Iliev
Iliev<a class="moz-txt-link-rfc2396E" href="mailto:kaloyan@digsys.bg"><kaloyan@digsys.bg></a> wrote:
</pre><blockquotetype="cite"><pre wrap="">I thing they should be access only if there are rows from the
 
where. Why the left join executes first?   </pre></blockquote><pre wrap=""> 
Out of curiosity, what happens if you consistently us JOIN clauses,
rather than mixing that with commas?:
explain analyze
SELECT   DD.debtid,   ADD.amount as saldo,   DOM.fqdn ||DT.descr as domain_fqdn,   S.descr_bg as service_descr_bg,
ADD.pno,  ADD.amount,   M.name_bg as measure_name_bg,   AC.ino,   I.idate FROM debts_desc DD JOIN proforms P ON
(ADD.pno= P.pno) JOIN acc_debts ADD ON (DD.debtid = ADD.debtid) JOIN services S ON (DD.serviceid = S.serviceid) JOIN
measuresM ON (DD.measure_id = M.measure_id) LEFT JOIN domeini DOM ON (DD.domain_id = DOM.id) LEFT JOIN domain_type DT
ON(DOM.domain_type_id = DT.id) LEFT JOIN acc_debts ADC   ON (DD.debtid = ADC.debtid AND ADC.credit) LEFT JOIN
acc_clientsAC   ON (AC.transact_no = ADC.transact_no AND NOT AC.credit) LEFT JOIN invoices I ON (AC.ino = I.ino AND
I.istatus= 0) WHERE DD.active   AND NOT DD.paid   AND DD.has_proform   AND NOT DD.storned   AND NOT ADD.credit   AND
P.person1_id= 287294
 
;
-Kevin
 </pre></blockquote>

Re: Question about LEFT JOIN and query plan

From
Kaloyan Iliev Iliev
Date:
Hello again,
I have another query which performance drops drastically after PG upgrade.
I can not improve the plan no matter how hard I try. I try creating new
indexes and rewrite the query with JOIN .. ON instead of commas but
nothing happens.
I will appreciate any suggestions.
Best regards,
 Kaloyan Iliev

==========================VERSION
8.2.15===================================================


regbgrgr=# SELECT version();

version
---------------------------------------------------------------------------------------------------
 PostgreSQL 8.2.15 on amd64-portbld-freebsd7.2, compiled by GCC cc (GCC)
4.2.1 20070719  [FreeBSD]
(1 row)

regbgrgr=# explain analyze SELECT

COUNT (D.id) as all_domains_count
                                                                 FROM

domeini as D,

domainperson as DP,

person as P,

request as R,

domain_status as DS
                                                                WHERE

R.number = D.request_number AND

D.domain_status_id = DS.id AND

DS.is_removed = 0 AND

D.id = DP.domain_id AND

DP.dp_type_id = 1 AND

DP.person1_id = P.id AND  ( LOWER (P.bulstat) = LOWER ('999999999') OR
LOWER (P.bulstat) = 'bg'||'999999999');

QUERY
PLAN


--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=138.30..138.31 rows=1 width=4) (actual
time=0.804..0.806 rows=1 loops=1)
   ->  Nested Loop  (cost=74.70..138.29 rows=5 width=4) (actual
time=0.797..0.797 rows=0 loops=1)
         ->  Nested Loop  (cost=74.70..136.88 rows=5 width=8) (actual
time=0.793..0.793 rows=0 loops=1)
               ->  Nested Loop  (cost=74.70..135.44 rows=5 width=12)
(actual time=0.791..0.791 rows=0 loops=1)
                     ->  Hash Join  (cost=74.70..122.42 rows=5 width=4)
(actual time=0.787..0.787 rows=0 loops=1)
                           Hash Cond: (dp.person1_id = p.id)
                           ->  Bitmap Heap Scan on domainperson dp
(cost=19.91..65.81 rows=472 width=8) (actual time=0.088..0.088 rows=1
loops=1)
                                 Recheck Cond: (dp_type_id = 1)
                                 ->  Bitmap Index Scan on
domainperson_admin_person_uidx  (cost=0.00..19.79 rows=472 width=0)
(actual time=0.071..0.071 rows=474 loops=1)
                                       Index Cond: (dp_type_id = 1)
                           ->  Hash  (cost=54.62..54.62 rows=14 width=4)
(actual time=0.678..0.678 rows=0 loops=1)
                                 ->  Seq Scan on person p
(cost=0.00..54.62 rows=14 width=4) (actual time=0.675..0.675 rows=0 loops=1)
                                       Filter: ((lower(bulstat) =
'999999999'::text) OR (lower(bulstat) = 'bg999999999'::text))
                     ->  Index Scan using domeini_pkey on domeini d
(cost=0.00..2.59 rows=1 width=12) (never executed)
                           Index Cond: (d.id = dp.domain_id)
               ->  Index Scan using domain_status_pkey on domain_status
ds  (cost=0.00..0.27 rows=1 width=4) (never executed)
                     Index Cond: (d.domain_status_id = ds.id)
                     Filter: (is_removed = 0)
         ->  Index Scan using request_pkey on request r
(cost=0.00..0.27 rows=1 width=4) (never executed)
               Index Cond: (r.number = d.request_number)
 Total runtime: 0.926 ms
(21 rows)

regbgrgr=# SHOW default_statistics_target ;
 default_statistics_target
---------------------------
 10
(1 row)


==========================VERSION
8.4.4===================================================
regbgrgr=# select version ();

version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 8.4.4 on amd64-portbld-freebsd8.1, compiled by GCC cc (GCC)
4.2.1 20070719  [FreeBSD], 64-bit
(1 row)

regbgrgr=# explain analyze SELECT

COUNT (D.id) as all_domains_count
                                                                 FROM

domeini as D,

domainperson as DP,

person as P,

request as R,

domain_status as DS
                                                                WHERE

R.number = D.request_number AND

D.domain_status_id = DS.id AND

DS.is_removed = 0 AND

D.id = DP.domain_id AND

DP.dp_type_id = 1 AND

DP.person1_id = P.id AND  ( LOWER (P.bulstat) = LOWER ('999999999') OR
LOWER (P.bulstat) = 'bg'||'999999999');

QUERY
PLAN


------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=61113.19..61113.20 rows=1 width=4) (actual
time=6013.705..6013.706 rows=1 loops=1)
   ->  Hash Join  (cost=20859.23..61023.00 rows=36075 width=4) (actual
time=4553.945..6013.098 rows=598 loops=1)
         Hash Cond: (d.request_number = r.number)
         ->  Hash Join  (cost=18796.01..57800.47 rows=36075 width=8)
(actual time=4177.313..5646.153 rows=598 loops=1)
               Hash Cond: (d.domain_status_id = ds.id)
               ->  Hash Join  (cost=18778.40..57286.82 rows=36075
width=12) (actual time=4176.838..5643.637 rows=1357 loops=1)
                     Hash Cond: (dp.domain_id = d.id)
                     ->  Hash Join  (cost=4671.42..40710.39 rows=36080
width=4) (actual time=3210.201..4621.977 rows=1357 loops=1)
                           Hash Cond: (dp.person1_id = p.id)
                           ->  Seq Scan on domainperson dp
(cost=0.00..33976.29 rows=272302 width=8) (actual time=0.026..1128.230
rows=279008 loops=1)
                                 Filter: (dp_type_id = 1)
                           ->  Hash  (cost=4634.39..4634.39 rows=2962
width=4) (actual time=3210.050..3210.050 rows=1263 loops=1)
                                 ->  Bitmap Heap Scan on person p
(cost=64.33..4634.39 rows=2962 width=4) (actual time=114.401..3206.440
rows=1263 loops=1)
                                       Recheck Cond: ((lower(bulstat) =
'999999999'::text) OR (lower(bulstat) = 'bg999999999'::text))
                                       ->  BitmapOr  (cost=64.33..64.33
rows=2969 width=0) (actual time=95.115..95.115 rows=0 loops=1)
                                             ->  Bitmap Index Scan on
person_bulstat_lower_idx  (cost=0.00..31.43 rows=1485 width=0) (actual
time=33.525..33.525 rows=1241 loops=1)
                                                   Index Cond:
(lower(bulstat) = '999999999'::text)
                                             ->  Bitmap Index Scan on
person_bulstat_lower_idx  (cost=0.00..31.43 rows=1485 width=0) (actual
time=61.584..61.584 rows=22 loops=1)
                                                   Index Cond:
(lower(bulstat) = 'bg999999999'::text)
                     ->  Hash  (cost=8728.77..8728.77 rows=309377
width=12) (actual time=957.267..957.267 rows=309410 loops=1)
                           ->  Seq Scan on domeini d
(cost=0.00..8728.77 rows=309377 width=12) (actual time=0.015..563.414
rows=309410 loops=1)
               ->  Hash  (cost=15.31..15.31 rows=184 width=4) (actual
time=0.455..0.455 rows=184 loops=1)
                     ->  Seq Scan on domain_status ds  (cost=0.00..15.31
rows=184 width=4) (actual time=0.009..0.252 rows=184 loops=1)
                           Filter: (is_removed = 0)
         ->  Hash  (cost=1030.43..1030.43 rows=62943 width=4) (actual
time=356.134..356.134 rows=62815 loops=1)
               ->  Seq Scan on request r  (cost=0.00..1030.43 rows=62943
width=4) (actual time=10.902..275.137 rows=62815 loops=1)
 Total runtime: 6014.029 ms
(27 rows)

regbgrgr=# show default_statistics_target ;
 default_statistics_target
---------------------------
 100
(1 row)


Re: Question about LEFT JOIN and query plan

From
Kaloyan Iliev Iliev
Date:
Sorry for the spam.
The 8.2.15 plan was on an empty database.
On a full database the plan was almost the same. So the question is
could I speed up the plan?
Why the "Hash Cond: (dp.person1_id = p.id)" isn't used for index scan on
that table?

Best regards,
Kaloya Iliev

Here is the plan on a full database:

==========================VERSION
8.2.17===================================================


version
---------------------------------------------------------------------------------------------------
 PostgreSQL 8.2.17 on amd64-portbld-freebsd8.0, compiled by GCC cc (GCC)
4.2.1 20070719  [FreeBSD]
(1 row)

regbgrgr=# SHOW default_statistics_target ;
 default_statistics_target
---------------------------
 10
(1 row)

regbgrgr=# explain analyze SELECT

COUNT (D.id) as all_domains_count
                                                                 FROM

domeini as D,

domainperson as DP,

person as P,

request as R,

domain_status as DS
                                                                WHERE

R.number = D.request_number AND

D.domain_status_id = DS.id AND

DS.is_removed = 0 AND

D.id = DP.domain_id AND

DP.dp_type_id = 1 AND

DP.person1_id = P.id AND  ( LOWER (P.bulstat) = LOWER ('999999999') OR
LOWER (P.bulstat) = 'bg'||'999999999');

QUERY
PLAN


----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=48342.78..48342.79 rows=1 width=4) (actual
time=2429.190..2429.192 rows=1 loops=1)
   ->  Hash Join  (cost=5142.26..48339.54 rows=1295 width=4) (actual
time=314.817..2427.752 rows=570 loops=1)
         Hash Cond: (d.request_number = r.number)
         ->  Hash Join  (cost=3088.49..45960.01 rows=1308 width=8)
(actual time=37.001..2125.040 rows=570 loops=1)
               Hash Cond: (d.domain_status_id = ds.id)
               ->  Nested Loop  (cost=3064.88..45918.37 rows=1316
width=12) (actual time=35.584..2117.332 rows=1250 loops=1)
                     ->  Hash Join  (cost=3064.88..40159.12 rows=1316
width=4) (actual time=35.506..2043.384 rows=1250 loops=1)
                           Hash Cond: (dp.person1_id = p.id)
                           ->  Seq Scan on domainperson dp
(cost=0.00..36010.68 rows=285441 width=8) (actual time=0.069..1459.818
rows=274533 loops=1)
                                 Filter: (dp_type_id = 1)
                           ->  Hash  (cost=3048.93..3048.93 rows=1276
width=4) (actual time=35.206..35.206 rows=1157 loops=1)
                                 ->  Bitmap Heap Scan on person p
(cost=30.78..3048.93 rows=1276 width=4) (actual time=1.187..31.170
rows=1157 loops=1)
                                       Recheck Cond: ((lower(bulstat) =
'999999999'::text) OR (lower(bulstat) = 'bg999999999'::text))
                                       ->  BitmapOr  (cost=30.78..30.78
rows=1276 width=0) (actual time=0.841..0.841 rows=0 loops=1)
                                             ->  Bitmap Index Scan on
person_bulstat_lower_idx  (cost=0.00..25.28 rows=1199 width=0) (actual
time=0.709..0.709 rows=1135 loops=1)
                                                   Index Cond:
(lower(bulstat) = '999999999'::text)
                                             ->  Bitmap Index Scan on
person_bulstat_lower_idx  (cost=0.00..4.86 rows=77 width=0) (actual
time=0.124..0.124 rows=22 loops=1)
                                                   Index Cond:
(lower(bulstat) = 'bg999999999'::text)
                     ->  Index Scan using domeini_pkey on domeini d
(cost=0.00..4.36 rows=1 width=12) (actual time=0.043..0.046 rows=1
loops=1250)
                           Index Cond: (d.id = dp.domain_id)
               ->  Hash  (cost=21.31..21.31 rows=184 width=4) (actual
time=1.380..1.380 rows=184 loops=1)
                     ->  Seq Scan on domain_status ds  (cost=0.00..21.31
rows=184 width=4) (actual time=0.316..0.942 rows=184 loops=1)
                           Filter: (is_removed = 0)
         ->  Hash  (cost=1026.01..1026.01 rows=59101 width=4) (actual
time=277.161..277.161 rows=59027 loops=1)
               ->  Seq Scan on request r  (cost=0.00..1026.01 rows=59101
width=4) (actual time=0.075..131.951 rows=59027 loops=1)
 Total runtime: 2429.603 ms
(26 rows)


Kaloyan Iliev Iliev wrote:
> Hello again,
> I have another query which performance drops drastically after PG
> upgrade.
> I can not improve the plan no matter how hard I try. I try creating
> new indexes and rewrite the query with JOIN .. ON instead of commas
> but nothing happens.
> I will appreciate any suggestions.
> Best regards,
> Kaloyan Iliev
>
> ==========================VERSION
> 8.2.15===================================================
>
>
> regbgrgr=# SELECT version();
>
> version
> ---------------------------------------------------------------------------------------------------
>
> PostgreSQL 8.2.15 on amd64-portbld-freebsd7.2, compiled by GCC cc
> (GCC) 4.2.1 20070719  [FreeBSD]
> (1 row)
>
> regbgrgr=# explain analyze SELECT
>
> COUNT (D.id) as all_domains_count
>                                                                 FROM
>
> domeini as D,
>
> domainperson as DP,
>
> person as P,
>
> request as R,
>
> domain_status as DS
>                                                                WHERE
>
> R.number = D.request_number AND
>
> D.domain_status_id = DS.id AND
>
> DS.is_removed = 0 AND
>
> D.id = DP.domain_id AND
>
> DP.dp_type_id = 1 AND
>
> DP.person1_id = P.id AND  ( LOWER (P.bulstat) = LOWER ('999999999') OR
> LOWER (P.bulstat) = 'bg'||'999999999');
>
> QUERY
> PLAN
>
>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

>
> Aggregate  (cost=138.30..138.31 rows=1 width=4) (actual
> time=0.804..0.806 rows=1 loops=1)
>   ->  Nested Loop  (cost=74.70..138.29 rows=5 width=4) (actual
> time=0.797..0.797 rows=0 loops=1)
>         ->  Nested Loop  (cost=74.70..136.88 rows=5 width=8) (actual
> time=0.793..0.793 rows=0 loops=1)
>               ->  Nested Loop  (cost=74.70..135.44 rows=5 width=12)
> (actual time=0.791..0.791 rows=0 loops=1)
>                     ->  Hash Join  (cost=74.70..122.42 rows=5 width=4)
> (actual time=0.787..0.787 rows=0 loops=1)
>                           Hash Cond: (dp.person1_id = p.id)
>                           ->  Bitmap Heap Scan on domainperson dp
> (cost=19.91..65.81 rows=472 width=8) (actual time=0.088..0.088 rows=1
> loops=1)
>                                 Recheck Cond: (dp_type_id = 1)
>                                 ->  Bitmap Index Scan on
> domainperson_admin_person_uidx  (cost=0.00..19.79 rows=472 width=0)
> (actual time=0.071..0.071 rows=474 loops=1)
>                                       Index Cond: (dp_type_id = 1)
>                           ->  Hash  (cost=54.62..54.62 rows=14
> width=4) (actual time=0.678..0.678 rows=0 loops=1)
>                                 ->  Seq Scan on person p
> (cost=0.00..54.62 rows=14 width=4) (actual time=0.675..0.675 rows=0
> loops=1)
>                                       Filter: ((lower(bulstat) =
> '999999999'::text) OR (lower(bulstat) = 'bg999999999'::text))
>                     ->  Index Scan using domeini_pkey on domeini d
> (cost=0.00..2.59 rows=1 width=12) (never executed)
>                           Index Cond: (d.id = dp.domain_id)
>               ->  Index Scan using domain_status_pkey on domain_status
> ds  (cost=0.00..0.27 rows=1 width=4) (never executed)
>                     Index Cond: (d.domain_status_id = ds.id)
>                     Filter: (is_removed = 0)
>         ->  Index Scan using request_pkey on request r
> (cost=0.00..0.27 rows=1 width=4) (never executed)
>               Index Cond: (r.number = d.request_number)
> Total runtime: 0.926 ms
> (21 rows)
>
> regbgrgr=# SHOW default_statistics_target ;
> default_statistics_target
> ---------------------------
> 10
> (1 row)
>
>
> ==========================VERSION
> 8.4.4===================================================
> regbgrgr=# select version ();
>
> version
> ----------------------------------------------------------------------------------------------------------
>
> PostgreSQL 8.4.4 on amd64-portbld-freebsd8.1, compiled by GCC cc (GCC)
> 4.2.1 20070719  [FreeBSD], 64-bit
> (1 row)
>
> regbgrgr=# explain analyze SELECT
>
> COUNT (D.id) as all_domains_count
>                                                                 FROM
>
> domeini as D,
>
> domainperson as DP,
>
> person as P,
>
> request as R,
>
> domain_status as DS
>                                                                WHERE
>
> R.number = D.request_number AND
>
> D.domain_status_id = DS.id AND
>
> DS.is_removed = 0 AND
>
> D.id = DP.domain_id AND
>
> DP.dp_type_id = 1 AND
>
> DP.person1_id = P.id AND  ( LOWER (P.bulstat) = LOWER ('999999999') OR
> LOWER (P.bulstat) = 'bg'||'999999999');
>
> QUERY
> PLAN
>
>
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

>
> Aggregate  (cost=61113.19..61113.20 rows=1 width=4) (actual
> time=6013.705..6013.706 rows=1 loops=1)
>   ->  Hash Join  (cost=20859.23..61023.00 rows=36075 width=4) (actual
> time=4553.945..6013.098 rows=598 loops=1)
>         Hash Cond: (d.request_number = r.number)
>         ->  Hash Join  (cost=18796.01..57800.47 rows=36075 width=8)
> (actual time=4177.313..5646.153 rows=598 loops=1)
>               Hash Cond: (d.domain_status_id = ds.id)
>               ->  Hash Join  (cost=18778.40..57286.82 rows=36075
> width=12) (actual time=4176.838..5643.637 rows=1357 loops=1)
>                     Hash Cond: (dp.domain_id = d.id)
>                     ->  Hash Join  (cost=4671.42..40710.39 rows=36080
> width=4) (actual time=3210.201..4621.977 rows=1357 loops=1)
>                           Hash Cond: (dp.person1_id = p.id)
>                           ->  Seq Scan on domainperson dp
> (cost=0.00..33976.29 rows=272302 width=8) (actual time=0.026..1128.230
> rows=279008 loops=1)
>                                 Filter: (dp_type_id = 1)
>                           ->  Hash  (cost=4634.39..4634.39 rows=2962
> width=4) (actual time=3210.050..3210.050 rows=1263 loops=1)
>                                 ->  Bitmap Heap Scan on person p
> (cost=64.33..4634.39 rows=2962 width=4) (actual time=114.401..3206.440
> rows=1263 loops=1)
>                                       Recheck Cond: ((lower(bulstat) =
> '999999999'::text) OR (lower(bulstat) = 'bg999999999'::text))
>                                       ->  BitmapOr  (cost=64.33..64.33
> rows=2969 width=0) (actual time=95.115..95.115 rows=0 loops=1)
>                                             ->  Bitmap Index Scan on
> person_bulstat_lower_idx  (cost=0.00..31.43 rows=1485 width=0) (actual
> time=33.525..33.525 rows=1241 loops=1)
>                                                   Index Cond:
> (lower(bulstat) = '999999999'::text)
>                                             ->  Bitmap Index Scan on
> person_bulstat_lower_idx  (cost=0.00..31.43 rows=1485 width=0) (actual
> time=61.584..61.584 rows=22 loops=1)
>                                                   Index Cond:
> (lower(bulstat) = 'bg999999999'::text)
>                     ->  Hash  (cost=8728.77..8728.77 rows=309377
> width=12) (actual time=957.267..957.267 rows=309410 loops=1)
>                           ->  Seq Scan on domeini d
> (cost=0.00..8728.77 rows=309377 width=12) (actual time=0.015..563.414
> rows=309410 loops=1)
>               ->  Hash  (cost=15.31..15.31 rows=184 width=4) (actual
> time=0.455..0.455 rows=184 loops=1)
>                     ->  Seq Scan on domain_status ds
> (cost=0.00..15.31 rows=184 width=4) (actual time=0.009..0.252 rows=184
> loops=1)
>                           Filter: (is_removed = 0)
>         ->  Hash  (cost=1030.43..1030.43 rows=62943 width=4) (actual
> time=356.134..356.134 rows=62815 loops=1)
>               ->  Seq Scan on request r  (cost=0.00..1030.43
> rows=62943 width=4) (actual time=10.902..275.137 rows=62815 loops=1)
> Total runtime: 6014.029 ms
> (27 rows)
>
> regbgrgr=# show default_statistics_target ;
> default_statistics_target
> ---------------------------
> 100
> (1 row)
>
>

Re: Question about LEFT JOIN and query plan

From
"Kevin Grittner"
Date:
Kaloyan Iliev Iliev <kaloyan@digsys.bg> wrote:

> The 8.2.15 plan was on an empty database.
> On a full database the plan was almost the same. So the question
> is could I speed up the plan?

Since this is an entirely new query which doesn't include a LEFT
JOIN, it's not good to just tack it onto the other thread.  Could
you please re-post with an appropriate subject and a little more
information?:

http://wiki.postgresql.org/wiki/SlowQueryQuestions

Schema (including indexes), configuration settings, and hardware
(CPUs, storage system, and RAM) can be particularly significant.

Also, if you could *attach* the EXPLAIN ANALYZE output instead of
pasting it within the email, you'd save time for those trying to
read it -- emails tend to get word-wrapped in a way which makes them
hard to read without manual reformatting.

Thanks,

-Kevin

Re: Question about LEFT JOIN and query plan

From
"Kevin Grittner"
Date:
>Kaloyan Iliev Iliev <kaloyan@digsys.bg> wrote:
> Kevin Grittner wrote:

>> Out of curiosity, what happens if you consistently use JOIN
>> clauses, rather than mixing that with commas?:

> The plan improves. So can you explain why?

Commas in a FROM clause bind more loosely than JOIN clauses,
rearrangement from one side of an outer join to the other is a bit
tricky, and the *_collapse_limit settings (which you have not shown)
can affect how much JOIN rearrangement is done for a complex query.
On a quick scan over your query it didn't appear that the
rearrangement would break anything, so I wondered whether the
planner might do better if you made its job a bit easier by putting
the inner joins all on the left to start with and putting the tables
closer to the order of efficient access.

If you still see this difference with very high collapse limits,
your example might be a good one to support further work on the
optimizer; but it would be more useful for that if you could create
a synthetic case to demonstrate the problem -- starting with
creation of tables, data, and indexes on which the different forms
of the query yielded different plans.

-Kevin