Fwd: Help with view performance problem - Mailing list pgsql-performance

From Chris Hoover
Subject Fwd: Help with view performance problem
Date
Msg-id 1d219a6f0507280738791f0ea3@mail.gmail.com
Whole thread Raw
Responses Re: Fwd: Help with view performance problem
List pgsql-performance
Does anyone have any suggestions on this?  I did not get any response
from the admin list.

Thanks,

Chris

---------- Forwarded message ----------
From: Chris Hoover <revoohc@gmail.com>
Date: Jul 27, 2005 12:29 PM
Subject: Re: Help with view performance problem
To: pgsql-admin@postgresql.org


I did some more testing, and ran the explain analyze on the problem.
In my session I did a set enable_hashjoin = false and then ran the
analyze.  This caused it to use the indexes as I have been expecting
it to do.

Now, how can I get it to use the indexes w/o manipulating the
environment?  What make postgresql want to sequentially scan and use a
hash join?

thanks,

Chris

explain analyze with set_hashjoin=false;
prob_db=#explain analyze select * from clm_com;



                            QUERY PLAN




----------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan clm_com  (cost=1057975.45..1169021.26 rows=126910
width=366) (actual time=142307.99..225997.22 rows=1268649 loops=1)
   ->  Unique  (cost=1057975.45..1169021.26 rows=126910 width=366)
(actual time=142307.96..206082.30 rows=1268649 loops=1)
         ->  Sort  (cost=1057975.45..1061148.19 rows=1269095
width=366) (actual time=142307.95..156019.01 rows=1268649 loops=1)
               Sort Key: clmcom1.inv_nbr, clmcom1.inv_qfr,
clmcom1.pat_addr_1, clmcom1.pat_addr_2, clmcom1.pat_city,
clmcom1.pat_cntry, clmcom1.pat_dob, clmcom1.pat_gender_cd,
clmcom1.pat_info_pregnancy_ind, clmcom1.pat_state, clmcom1.pat_suffix,
clmcom1.pat_zip, clmcom1.payto_addr_1,
clmcom1.payto_addr_2, clmcom1.payto_city, clmcom1.payto_cntry,
clmcom1.payto_f_name, clmcom1.payto_m_name, clmcom1.payto_state,
clmcom1.payto_zip, clmcom1.clm_tot_clm_chgs, clmcom1.bill_l_name_org,
clmcom1.clm_delay_rsn_cd, clmcom1.clm_submit_rsn_cd,
clmcom1.payto_l_name_org, clmcom1.payto_prim_id, clmcom1.bill_prim_id,
clmcom1.clm_tot_ncov_chgs, clmcom2.contract_amt,
clmcom2.svc_fac_or_lab_name, clmcom2.svc_fac_addr_1,
clmcom2.svc_fac_addr_2, clmcom2.svc_fac_city, clmcom2.svc_fac_zip
               ->  Merge Join  (cost=0.00..565541.46 rows=1269095
width=366) (actual time=464.89..130638.06 rows=1268649 loops=1)
                     Merge Cond: ("outer".inv_nbr = "inner".inv_nbr)
                     Join Filter: ("outer".inv_qfr = "inner".inv_qfr)
                     ->  Index Scan using clmcom1_inv_nbr_iview_idx on
clmcom1  (cost=0.00..380534.32 rows=1269095 width=270) (actual
time=0.27..82159.37 rows=1268649 loops=1)
                     ->  Index Scan using clmcom2_inv_nbr_iview_idx on
clmcom2  (cost=0.00..159636.25 rows=1271198 width=96) (actual
time=464.56..21774.02 rows=1494019 loops=1)
 Total runtime: 227369.39 msec
(10 rows)



On 7/27/05, Chris Hoover <revoohc@gmail.com> wrote:
> I am having a problem with a view on one of my db's.  This view is
> trying to sequentially can the 2 tables it is accessing.  However,
> when I explain the view on most of my other db's (all have the same
> schema's), it is using the indexes.  Can anyone please help me
> understand why postgres is choosing to sequenially scan both tables?
>
> Both tables in the view have a primary key defined on inv_nbr,
> inv_qfr.  Vacuum and analyze have been run on the tables in question
> to try and make sure stats are up to date.
>
> Thanks,
>
> Chris
> PG - 7.3.4
> RH 2.1
>
>
> Here is the view definition:
> SELECT DISTINCT clmcom1.inv_nbr AS inventory_number,
>                               clmcom1.inv_qfr AS inventory_qualifier,
>                               clmcom1.pat_addr_1 AS patient_address_1,
>                               clmcom1.pat_addr_2 AS patient_address_2,
>                               clmcom1.pat_city AS patient_city,
>                               clmcom1.pat_cntry AS patient_country,
>                               clmcom1.pat_dob AS patient_date_of_birth,
>                               clmcom1.pat_gender_cd AS patient_gender_code,
>                               clmcom1.pat_info_pregnancy_ind AS pregnancy_ind,
>                               clmcom1.pat_state AS patient_state,
>                               clmcom1.pat_suffix AS patient_suffix,
>                               clmcom1.pat_zip AS patient_zip_code,
>                               clmcom1.payto_addr_1 AS payto_address_1,
>                               clmcom1.payto_addr_2 AS payto_address_2,
>                               clmcom1.payto_city,
>                               clmcom1.payto_cntry AS payto_country,
>                               clmcom1.payto_f_name AS payto_first_name,
>                               clmcom1.payto_m_name AS payto_middle_name,
>                               clmcom1.payto_state,
>                               clmcom1.payto_zip AS payto_zip_code,
>                               clmcom1.clm_tot_clm_chgs AS total_claim_charge,
>                               clmcom1.bill_l_name_org AS
> billing_last_name_or_org,
>                               clmcom1.clm_delay_rsn_cd AS
> claim_delay_reason_code,
>                               clmcom1.clm_submit_rsn_cd AS
> claim_submit_reason_code,
>                               clmcom1.payto_l_name_org AS
> payto_last_name_or_org,
>                               clmcom1.payto_prim_id AS payto_primary_id,
>                               clmcom1.bill_prim_id AS billing_prov_primary_id,
>                               clmcom1.clm_tot_ncov_chgs AS total_ncov_charge,
>                               clmcom2.contract_amt AS contract_amount,
>                               clmcom2.svc_fac_or_lab_name,
>                               clmcom2.svc_fac_addr_1 AS svc_fac_address_1,
>                               clmcom2.svc_fac_addr_2 AS svc_fac_address_2,
>                               clmcom2.svc_fac_city,
>                               clmcom2.svc_fac_zip AS svc_fac_zip_code
> FROM (clmcom1 LEFT JOIN clmcom2 ON (((clmcom1.inv_nbr =
> clmcom2.inv_nbr) AND
>
> (clmcom1.inv_qfr = clmcom2.inv_qfr))))
> ORDER BY clmcom1.inv_nbr,
>                   clmcom1.inv_qfr,
>                   clmcom1.pat_addr_1,
>                   clmcom1.pat_addr_2,
>                   clmcom1.pat_city,
>                   clmcom1.pat_cntry,
>                   clmcom1.pat_dob,
>                   clmcom1.pat_gender_cd,
>                   clmcom1.pat_info_pregnancy_ind,
>                   clmcom1.pat_state,
>                   clmcom1.pat_suffix,
>                   clmcom1.pat_zip,
>                   clmcom1.payto_addr_1,
>                   clmcom1.payto_addr_2,
>                   clmcom1.payto_city,
>                   clmcom1.payto_cntry,
>                   clmcom1.payto_f_name,
>                   clmcom1.payto_m_name,
>                   clmcom1.payto_state,
>                   clmcom1.payto_zip,
>                   clmcom1.clm_tot_clm_chgs,
>                   clmcom1.bill_l_name_org,
>                   clmcom1.clm_delay_rsn_cd,
>                   clmcom1.clm_submit_rsn_cd,
>                   clmcom1.payto_l_name_org,
>                   clmcom1.payto_prim_id,
>                   clmcom1.bill_prim_id,
>                   clmcom1.clm_tot_ncov_chgs,
>                   clmcom2.contract_amt,
>                   clmcom2.svc_fac_or_lab_name,
>                   clmcom2.svc_fac_addr_1,
>                   clmcom2.svc_fac_addr_2,
>                   clmcom2.svc_fac_city,
>                   clmcom2.svc_fac_zip;
>
>
> Here is the explain analyze from the problem db:
> prob_db=# explain analyze select * from clm_com;
>
>
>
>                           QUERY PLAN
>
>
>
>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Subquery Scan clm_com  (cost=1039824.35..1150697.61 rows=126712
> width=367) (actual time=311792.78..405819.03 rows=1266114 loops=1)
>    ->  Unique  (cost=1039824.35..1150697.61 rows=126712 width=367)
> (actual time=311792.74..386313.14 rows=1266114 loops=1)
>          ->  Sort  (cost=1039824.35..1042992.16 rows=1267123
> width=367) (actual time=311792.74..338189.48 rows=1266114 loops=1)
>                Sort Key: clmcom1.inv_nbr, clmcom1.inv_qfr,
> clmcom1.pat_addr_1, clmcom1.pat_addr_2, clmcom1.pat_city,
> clmcom1.pat_cntry, clmcom1.pat_dob, clmcom1.pat_gender_cd,
> clmcom1.pat_info_pregnancy_ind, clmcom1.pat_state, clmcom1.pat_suffix,
> clmcom1.pat_zip, clmcom1.payto_addr_1,
> clmcom1.payto_addr_2, clmcom1.payto_city, clmcom1.payto_cntry,
> clmcom1.payto_f_name, clmcom1.payto_m_name, clmcom1.payto_state,
> clmcom1.payto_zip, clmcom1.clm_tot_clm_chgs, clmcom1.bill_l_name_org,
> clmcom1.clm_delay_rsn_cd, clmcom1.clm_submit_rsn_cd,
> clmcom1.payto_l_name_org, clmcom1.payto_prim_id, clmcom1.bill_prim_id,
> clmcom1.clm_tot_ncov_chgs, clmcom2.contract_amt,
> clmcom2.svc_fac_or_lab_name, clmcom2.svc_fac_addr_1,
> clmcom2.svc_fac_addr_2, clmcom2.svc_fac_city, clmcom2.svc_fac_zip
>                ->  Hash Join  (cost=132972.78..548171.70 rows=1267123
> width=367) (actual time=16999.32..179359.43 rows=1266114 loops=1)
>                      Hash Cond: ("outer".inv_nbr = "inner".inv_nbr)
>                      Join Filter: ("outer".inv_qfr = "inner".inv_qfr)
>                      ->  Seq Scan on clmcom1  (cost=0.00..267017.23
> rows=1267123 width=271) (actual time=0.11..84711.83 rows=1266114
> loops=1)
>                      ->  Hash  (cost=111200.82..111200.82 rows=1269582
> width=96) (actual time=16987.45..16987.45 rows=0 loops=1)
>                            ->  Seq Scan on clmcom2
> (cost=0.00..111200.82 rows=1269582 width=96) (actual
> time=0.07..12164.81 rows=1266108 loops=1)
>  Total runtime: 407317.47 msec
> (11 rows)
> ~
>
>
>
> Here is the explain analyze from a good db (on the same postgres cluster);
> good_db=# explain analyze select * from clm_com;
>
>
>
>
>
>                             QUERY PLAN
>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Subquery Scan clm_com  (cost=78780.59..89498.29 rows=12249 width=359)
> (actual time=73045.36..79974.37 rows=122494 loops=1)
>    ->  Unique  (cost=78780.59..89498.29 rows=12249 width=359) (actual
> time=73045.28..78031.99 rows=122494 loops=1)
>          ->  Sort  (cost=78780.59..79086.81 rows=122488 width=359)
> (actual time=73045.28..73362.94 rows=122494 loops=1)
>                Sort Key: clmcom1.inv_nbr, clmcom1.inv_qfr,
> clmcom1.pat_addr_1, clmcom1.pat_addr_2, clmcom1.pat_city,
> clmcom1.pat_cntry, clmcom1.pat_dob, clmcom1.pat_gender_cd,
> clmcom1.pat_info_pregnancy_ind, clmcom1.pat_state, clmcom1.pat_suffix,
> clmcom1.pat_zip, clmcom1.payto_addr_1, clmcom1.payto_addr_2,
> clmcom1.payto_city, clmcom1.payto_cntry, clmcom1.payto_f_name,
> clmcom1.payto_m_name, clmcom1.payto_state, clmcom1.payto_zip,
> clmcom1.clm_tot_clm_chgs, clmcom1.bill_l_name_org,
> clmcom1.clm_delay_rsn_cd, clmcom1.clm_submit_rsn_cd,
> clmcom1.payto_l_name_org, clmcom1.payto_prim_id, clmcom1.bill_prim_id,
> clmcom1.clm_tot_ncov_chgs, clmcom2.contract_amt,
> clmcom2.svc_fac_or_lab_name, clmcom2.svc_fac_addr_1,
> clmcom2.svc_fac_addr_2, clmcom2.svc_fac_city, clmcom2.svc_fac_zip
>                ->  Merge Join  (cost=0.00..56945.12 rows=122488
> width=359) (actual time=54.76..71635.65 rows=122494 loops=1)
>                      Merge Cond: (("outer".inv_nbr = "inner".inv_nbr)
> AND ("outer".inv_qfr = "inner".inv_qfr))
>                      ->  Index Scan using clmcom1_pkey on clmcom1
> (cost=0.00..38645.61 rows=122488 width=267) (actual
> time=25.60..49142.16 rows=122494 loops=1)
>                      ->  Index Scan using clmcom2_pkey on clmcom2
> (cost=0.00..16004.08 rows=122488 width=92) (actual
> time=29.09..19418.94 rows=122494 loops=1)
>  Total runtime: 80162.26 msec
> (9 rows)
>

pgsql-performance by date:

Previous
From: "Merlin Moncure"
Date:
Subject: Re: Finding bottleneck
Next
From: Dan Harris
Date:
Subject: Re: Fwd: Help with view performance problem