Re: Performance Optimization for Dummies 2 - the SQL - Mailing list pgsql-performance

From Carlo Stonebanks
Subject Re: Performance Optimization for Dummies 2 - the SQL
Date
Msg-id egua6n$1fv4$1@news.hub.org
Whole thread Raw
In response to Performance Optimization for Dummies 2 - the SQL  ("Carlo Stonebanks" <stonec.register@sympatico.ca>)
Responses Re: Performance Optimization for Dummies 2 - the SQL
Re: Performance Optimization for Dummies 2 - the SQL
Re: Performance Optimization for Dummies 2 - the SQL
List pgsql-performance
Hi Merlin,

Well, I'm back. first of all, thanks for your dogged determination to help
me out - it is much appreciated. I owe you a beer or twelve.

The import has been running for a week. The import program got faster as I
tuned things. I capture the dynamic SQL statements generated by the app, as
well as an accompanying EXPLAIN - and put it out to an XML file. I turned
off seq scan in the config, and ran a trial import. I knew that with seq
scan off that if I saw a seq scan in my log, it's because there were no
indexes available to satisfy the query - I adjusted accordingly and this
worked really well.

When the import runs against an empty or small db, it's blisteringly fast
(considering that it's a heauristically based process). This proved that it
wasn't the app or the SQL connection that was slow. Once again, though, as
the data db grows, it slows down. Now it's crawling again. All of the
queries appear to be fine, taking advantage of the indexes. There is ONE
query, though, that seems to be the troublemaker - the same one I had
brought up before. I believe that it is one sub-query that is causing the
problem, taking what appears to be 500 to 1000+ms to run every time. (See
below).

Curiously, it's using index scans, and it really looks like a simple query
to me. I am completely baffled. The two tables in question have about 800K
rows each - not exactly an incredible number. The EXPLAIN is simple, but the
performance is dreadful. All the other queries run much faster than this -
does ANYTHING about this query strike you as odd?

Carlo

/*
Find all facilities that do not have full address information
but do have default location information that indicates
its the facilitiy's US zip code.
NULL values cast as columns are placeholders to allow
this sub-query to be unioned with another subquery
that contains full address data
*/
select
    f.facility_id,
    null as facility_address_id,
    null as address_id,
    f.facility_type_code,
    f.name,
    null as address,
    f.default_city as city,
    f.default_state_code as state_code,
    f.default_postal_code as postal_code,
    f.default_country_code as country_code,
    null as parsed_unit
from
    mdx_core.facility as f
left outer join mdx_core.facility_address as fa
    on fa.facility_id = f.facility_id
where
     facility_address_id is null
     and f.default_country_code = 'US'
     and (f.default_postal_code = '14224-1945' or f.default_postal_code =
'14224')

"Nested Loop Left Join  (cost=22966.70..23594.84 rows=93 width=71) (actual
time=662.075..662.075 rows=0 loops=1)"
"  Filter: ("inner".facility_address_id IS NULL)"
"  ->  Bitmap Heap Scan on facility f  (cost=22966.70..23231.79 rows=93
width=71) (actual time=661.907..661.929 rows=7 loops=1)"
"        Recheck Cond: (((default_country_code = 'US'::bpchar) AND
((default_postal_code)::text = '14224-1945'::text)) OR
((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text =
'14224'::text)))"
"        ->  BitmapOr  (cost=22966.70..22966.70 rows=93 width=0) (actual
time=661.891..661.891 rows=0 loops=1)"
"              ->  Bitmap Index Scan on
facility_country_state_postal_code_idx  (cost=0.00..11483.35 rows=47
width=0) (actual time=374.284..374.284 rows=7 loops=1)"
"                    Index Cond: ((default_country_code = 'US'::bpchar) AND
((default_postal_code)::text = '14224-1945'::text))"
"              ->  Bitmap Index Scan on
facility_country_state_postal_code_idx  (cost=0.00..11483.35 rows=47
width=0) (actual time=287.599..287.599 rows=0 loops=1)"
"                    Index Cond: ((default_country_code = 'US'::bpchar) AND
((default_postal_code)::text = '14224'::text))"
"  ->  Index Scan using facility_address_facility_address_address_type_idx
on facility_address fa  (cost=0.00..3.89 rows=1 width=8) (actual
time=0.014..0.016 rows=1 loops=7)"
"        Index Cond: (fa.facility_id = "outer".facility_id)"
"Total runtime: 662.203 ms"
>



pgsql-performance by date:

Previous
From: Tobias Brox
Date:
Subject: Re: [ADMIN] autovacuum on a -mostly- r/o table
Next
From: Tom Lane
Date:
Subject: Re: Performance Optimization for Dummies 2 - the SQL