Re: Help with tuning this query (with explain analyze finally) - Mailing list pgsql-performance
From | Ken Egervari |
---|---|
Subject | Re: Help with tuning this query (with explain analyze finally) |
Date | |
Msg-id | 004601c51fbe$828bbb50$6a01a8c0@KEN Whole thread Raw |
In response to | Help with tuning this query ("Ken Egervari" <ken@upfactor.com>) |
Responses |
Re: Help with tuning this query (with explain analyze finally)
|
List | pgsql-performance |
Josh, >1) To determine your query order ala Dan Tow and drive off of person, >please >SET JOIN_COLLAPSE_LIMIT = 1 and then run Mark Kirkwood's version of the >query. (Not that I believe in Dan Tow ... see previous message ... but it >would be interesting to see the results. Unfortunately, the query still takes 250 milliseconds. I tried it with other queries and the results are the same as before. Here is the explain analayze anyway: Sort (cost=2036.83..2036.87 rows=16 width=103) (actual time=328.000..328.000 rows=39 loops=1) Sort Key: cs.date -> Nested Loop Left Join (cost=620.61..2036.51 rows=16 width=103) (actual time=250.000..328.000 rows=39 loops=1) -> Hash Join (cost=620.61..1984.90 rows=16 width=78) (actual time=250.000..328.000 rows=39 loops=1) Hash Cond: ("outer".carrier_code_id = "inner".id) -> Merge Join (cost=606.11..1965.99 rows=825 width=74) (actual time=250.000..328.000 rows=310 loops=1) Merge Cond: ("outer".current_status_id = "inner".id) -> Index Scan using shipment_current_status_id_idx on shipment s (cost=0.00..2701.26 rows=60307 width=66) (actual time=0.000..77.000 rows=27711 loops=1) Filter: (is_purged = false) -> Sort (cost=606.11..610.50 rows=1756 width=12) (actual time=141.000..141.000 rows=6902 loops=1) Sort Key: cs.id -> Hash Join (cost=1.11..511.48 rows=1756 width=12) (actual time=0.000..109.000 rows=6902 loops=1) Hash Cond: ("outer".release_code_id = "inner".id) -> Index Scan Backward using current_status_date_idx on shipment_status cs (cost=0.01..422.58 rows=14047 width=16) (actual time=0.000..78.000 rows=14925 loops=1) Index Cond: ((date >= (('now'::text)::date - 31)) AND (date <= ('now'::text)::date)) Filter: (cs.* IS NOT NULL) -> Hash (cost=1.10..1.10 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1) -> Seq Scan on release_code rc (cost=0.00..1.10 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1) Filter: ((number)::text = '9'::text) -> Hash (cost=14.49..14.49 rows=2 width=8) (actual time=0.000..0.000 rows=0 loops=1) -> Nested Loop (cost=6.87..14.49 rows=2 width=8) (actual time=0.000..0.000 rows=2 loops=1) -> Index Scan using person_pkey on person p (cost=0.00..5.73 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1) Index Cond: (id = 355) -> Hash Join (cost=6.87..8.74 rows=2 width=8) (actual time=0.000..0.000 rows=2 loops=1) Hash Cond: ("outer".carrier_id = "inner".carrier_id) -> Seq Scan on carrier_code cc (cost=0.00..1.57 rows=57 width=8) (actual time=0.000..0.000 rows=57 loops=1) -> Hash (cost=6.86..6.86 rows=1 width=12) (actual time=0.000..0.000 rows=0 loops=1) -> Hash Join (cost=3.04..6.86 rows=1 width=12) (actual time=0.000..0.000 rows=1 loops=1) Hash Cond: ("outer".id = "inner".carrier_id) -> Seq Scan on carrier c (cost=0.00..3.54 rows=54 width=4) (actual time=0.000..0.000 rows=54 loops=1) -> Hash (cost=3.04..3.04 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=1) -> Index Scan using carrier_to_person_person_id_idx on carrier_to_person ctp (cost=0.00..3.04 rows=1 width=8) (actual time=0.000..0.000 rows=1 loops=1) Index Cond: (355 = person_id) -> Index Scan using shipment_status_shipment_id_idx on shipment_status ss (cost=0.00..3.20 rows=2 width=25) (actual time=0.000..0.000 rows=1 loops=39) Index Cond: (ss.shipment_id = "outer".id) Total runtime: 328.000 ms >2) Force PG to drop the merge join via SET ENABLE_MERGEJOIN = FALSE; Setting this option had no effect either In fact, the query is a bit slower (266 milliseconds but 250 came up once in 20 executions). >Also, please let us know some about the server you're using and your >configuration parameters, particularly: >shared_buffers >work_mem >effective_cache_size >random_page_cost Well, I'm on a test machine so the settings haven't changed one bit from the defaults. This may sound embarrassing, but I bet the production server is not custom configured either. The computer I'm running these queries on is just a simple Athon XP 2100+ on WinXP with 1GB of RAM. The production server is a faster P4, but the rest is the same. Here are the 4 values in my configuration, but 3 of them were commented: shared_buffers = 1000 #work_mem = 1024 #effective_cache_size = 1000 #random_page_cost = 4 I'm not sure what these do, but I'm guessing the last 2 affect the planner to do different things with the statistics. Should I increase the first two? Regards, Ken
pgsql-performance by date: