Re: Performance Optimization for Dummies 2 - the SQL - Mailing list pgsql-performance
From | Carlo Stonebanks |
---|---|
Subject | Re: Performance Optimization for Dummies 2 - the SQL |
Date | |
Msg-id | eh0fmq$mpp$1@news.hub.org Whole thread Raw |
In response to | Performance Optimization for Dummies 2 - the SQL ("Carlo Stonebanks" <stonec.register@sympatico.ca>) |
Responses |
Re: Performance Optimization for Dummies 2 - the SQL
|
List | pgsql-performance |
> Can you try temporarily disabling bitmap scans and see what comes up? Well, that's slowing everything down. I've got a couple of results, below 1) Bitmap scan off, but seq scan enabled. 2) Bitmap scan and seq scan off 3) Bitmap scan back on, seq scan back on, and a new index created 4) VACUUM VERBOSE on the tables involved 5) Original SQL with original EXPLAIN to show the code that started this. Carlo 1) Bitmap scan off, but seq scan enabled. It created a suprisingly expensive seq scan. "Nested Loop Left Join (cost=0.00..34572.43 rows=109 width=71) (actual time=1536.827..1536.827 rows=0 loops=1)" " Filter: ("inner".facility_address_id IS NULL)" " -> Seq Scan on facility f (cost=0.00..34146.91 rows=109 width=71) (actual time=621.100..1536.606 rows=7 loops=1)" " Filter: ((default_country_code = 'US'::bpchar) AND (((default_postal_code)::text = '14224-1945'::text) OR ((default_postal_code)::text = '14224'::text)))" " -> Index Scan using facility_address_facility_address_address_type_idx on facility_address fa (cost=0.00..3.89 rows=1 width=8) (actual time=0.020..0.023 rows=1 loops=7)" " Index Cond: (fa.facility_id = "outer".facility_id)" "Total runtime: 1536.957 ms" 2) So I turned both bitmap scan and seq scan off - now we get index scans, the performance is suprisingly horrible: "Nested Loop Left Join (cost=0.00..39286.55 rows=109 width=71) (actual time=3598.462..3598.462 rows=0 loops=1)" " Filter: ("inner".facility_address_id IS NULL)" " -> Index Scan using facility_pkey on facility f (cost=0.00..38861.03 rows=109 width=71) (actual time=1500.690..3598.201 rows=7 loops=1)" " Filter: ((default_country_code = 'US'::bpchar) AND (((default_postal_code)::text = '14224-1945'::text) OR ((default_postal_code)::text = '14224'::text)))" " -> Index Scan using facility_address_facility_address_address_type_idx on facility_address fa (cost=0.00..3.89 rows=1 width=8) (actual time=0.024..0.027 rows=1 loops=7)" " Index Cond: (fa.facility_id = "outer".facility_id)" "Total runtime: 3598.600 ms" 3) So I turned bitmap scan back on, seq scan back on, and created an index to EXPLICITLY to satisfy this condition. Iintuitivly, I thought that combinations of other indexes should have satisfied the optimizer, but figured better overkill than nothing. I thought this would solve it - but no. We is using a BRAND NEW INDEX which is unlikely to be corrupt so expensive? "Nested Loop Left Join (cost=25300.96..26043.67 rows=110 width=71) (actual time=1339.216..1339.216 rows=0 loops=1)" " Filter: ("inner".facility_address_id IS NULL)" " -> Bitmap Heap Scan on facility f (cost=25300.96..25614.42 rows=110 width=71) (actual time=1339.043..1339.066 rows=7 loops=1)" " Recheck Cond: (((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224-1945'::text)) OR ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224'::text)))" " -> BitmapOr (cost=25300.96..25300.96 rows=110 width=0) (actual time=1339.027..1339.027 rows=0 loops=1)" " -> Bitmap Index Scan on facility_facility_country_state_postal_code_idx (cost=0.00..12650.48 rows=55 width=0) (actual time=796.146..796.146 rows=7 loops=1)" " Index Cond: ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224-1945'::text))" " -> Bitmap Index Scan on facility_facility_country_state_postal_code_idx (cost=0.00..12650.48 rows=55 width=0) (actual time=542.873..542.873 rows=0 loops=1)" " Index Cond: ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224'::text))" " -> Index Scan using facility_address_facility_address_address_type_idx on facility_address fa (cost=0.00..3.89 rows=1 width=8) (actual time=0.014..0.016 rows=1 loops=7)" " Index Cond: (fa.facility_id = "outer".facility_id)" "Total runtime: 1339.354 ms" 4) VACUUM VERBOSE on the tables involved. Note how much more painful in elapsed time it is to vacuum facility vs facility_address, even though the number of rows is comparable: INFO: vacuuming "mdx_core.facility" INFO: index "facility_pkey" now contains 964123 row versions in 3682 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.03s/0.03u sec elapsed 0.18 sec. INFO: index "facility_country_state_city_idx" now contains 964188 row versions in 7664 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.25s/0.17u sec elapsed 84.14 sec. INFO: index "facility_country_state_postal_code_idx" now contains 964412 row versions in 7689 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.42s/0.10u sec elapsed 137.12 sec. INFO: index "facility_facility_country_state_city_idx" now contains 964493 row versions in 6420 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.17s/0.09u sec elapsed 2.23 sec. INFO: index "facility_facility_country_state_postal_code_idx" now contains 964494 row versions in 6895 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.01s/0.01u sec elapsed 0.95 sec. INFO: "facility": found 0 removable, 964123 nonremovable row versions in 17398 pages DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 0 pages are entirely empty. CPU 0.90s/0.57u sec elapsed 224.80 sec. INFO: vacuuming "pg_toast.pg_toast_58570311" INFO: index "pg_toast_58570311_index" now contains 0 row versions in 1 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.01 sec. INFO: "pg_toast_58570311": found 0 removable, 0 nonremovable row versions in 0 pages DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 0 pages are entirely empty. CPU 0.00s/0.00u sec elapsed 0.01 sec. Query returned successfully with no result in 224903 ms. INFO: vacuuming "mdx_core.facility_address" INFO: index "facility_address_pkey" now contains 893157 row versions in 3411 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.17s/0.04u sec elapsed 11.10 sec. INFO: index "facility_address_address_idx" now contains 893157 row versions in 3164 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.07s/0.04u sec elapsed 0.61 sec. INFO: index "facility_address_facility_address_address_type_idx" now contains 893157 row versions in 3797 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.01s/0.00u sec elapsed 0.07 sec. INFO: "facility_address": found 0 removable, 893139 nonremovable row versions in 9210 pages DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 0 pages are entirely empty. CPU 0.26s/0.15u sec elapsed 12.14 sec. Query returned successfully with no result in 12297 ms. 5) Here is the original query, plus original explain analyze. Number of rows have increased since this was run, so the costs are lower, but still significant: /* Find all facilities that do not have full address information but do have default location information that indicates its the facilitiy's US zip code. NULL values cast as columns are placeholders to allow this sub-query to be unioned with another subquery that contains full address data */ select f.facility_id, null as facility_address_id, null as address_id, f.facility_type_code, f.name, null as address, f.default_city as city, f.default_state_code as state_code, f.default_postal_code as postal_code, f.default_country_code as country_code, null as parsed_unit from mdx_core.facility as f left outer join mdx_core.facility_address as fa on fa.facility_id = f.facility_id where facility_address_id is null and f.default_country_code = 'US' and (f.default_postal_code = '14224-1945' or f.default_postal_code = '14224') "Nested Loop Left Join (cost=22966.70..23594.84 rows=93 width=71) (actual time=662.075..662.075 rows=0 loops=1)" " Filter: ("inner".facility_address_id IS NULL)" " -> Bitmap Heap Scan on facility f (cost=22966.70..23231.79 rows=93 width=71) (actual time=661.907..661.929 rows=7 loops=1)" " Recheck Cond: (((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224-1945'::text)) OR ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224'::text)))" " -> BitmapOr (cost=22966.70..22966.70 rows=93 width=0) (actual time=661.891..661.891 rows=0 loops=1)" " -> Bitmap Index Scan on facility_country_state_postal_code_idx (cost=0.00..11483.35 rows=47 width=0) (actual time=374.284..374.284 rows=7 loops=1)" " Index Cond: ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224-1945'::text))" " -> Bitmap Index Scan on facility_country_state_postal_code_idx (cost=0.00..11483.35 rows=47 width=0) (actual time=287.599..287.599 rows=0 loops=1)" " Index Cond: ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224'::text))" " -> Index Scan using facility_address_facility_address_address_type_idx on facility_address fa (cost=0.00..3.89 rows=1 width=8) (actual time=0.014..0.016 rows=1 loops=7)" " Index Cond: (fa.facility_id = "outer".facility_id)" "Total runtime: 662.203 ms" >
pgsql-performance by date: