Planner problem - Mailing list pgsql-performance
From | Mikael Kjellström |
---|---|
Subject | Planner problem |
Date | |
Msg-id | 40C066C1.2090808@mksoft.nu Whole thread Raw |
Responses |
Re: Planner problem
|
List | pgsql-performance |
Hi! I am having a bit of problem with the plan that the planner produces. First a little info: OS: Linux Debian Version: PostgreSQL 7.4.2 on i686-pc-linux-gnu, compiled by GCC gcc GCC) 3.3.3 (Debian 20040401) CPU: AMD Athlon XP 2000+ Memory: 1GB Disk: 4 SCSI-3 UW 10000rpm in RAID 1 (mirror) mode shared_mem: 20000 Sort mem: 8192 effective_cache_size: 80000 Row count for the tables involved: priceavailable: 8564 product: 3 productCodeAlias: 7 locationIATA: 3402 locationConnection: 64 price: 4402 I have runned both vacuum full and vacuum analyze before I run my query and shared_mem is set to 20000 Here is the result I am getting: EXPLAIN ANALYZE SELECT CASE WHEN 'D' = 'A' THEN price.priceArrival ELSE price.priceDeparture END AS price, price.vatPercent AS vatPercent FROM priceavailable pa, product product, productCodeAlias productAlias, locationiata la, locationconnection lc, price price WHERE pa.direction IN('D', 'B') AND pa.productId = product.productId AND product.productId = productAlias.productId AND productAlias.productCode = 'TAXI' AND pa.locationConnectionId = lc.locationConnectionId AND lc.locationConnectionCode = 'RNB' AND pa.locationId = la.locationId AND la.iataCode = 'KB8' AND price.pricegroupId = pa.priceGroupId AND price.productId = product.productId AND '2004-06-01 05:30:00.000000+02' BETWEEN price.validstartdate AND price.validstopdate AND product.organizationId = 1 AND price.organizationId = product.organizationId AND pa.deletionDate IS NULL AND product.deletionDate IS NULL AND la.deletionDate IS NULL AND lc.deletionDate IS NULL AND price.deletionDate IS NULL ; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=2.14..17.16 rows=1 width=8) (actual time=3274.196..4620.299 rows=1 loops=1) -> Nested Loop (cost=2.14..13.49 rows=1 width=12) (actual time=2643.441..4617.246 rows=47 loops=1) Join Filter: (("outer".productid = "inner".productid) AND ("inner".pricegroupid = "outer".pricegroupid)) -> Nested Loop (cost=0.00..7.72 rows=1 width=12) (actual time=0.238..5.455 rows=111 loops=1) -> Seq Scan on locationconnection lc (cost=0.00..1.80 rows=1 width=4) (actual time=0.153..0.245 rows=1 loops=1) Filter: ((locationconnectioncode = 'RNB'::text) AND (deletiondate IS NULL)) -> Index Scan using priceavailableix1 on priceavailable pa (cost=0.00..5.91 rows=1 width=16) (actual time=0.067..4.182 rows=111 loops=1) Index Cond: (pa.locationconnectionid = "outer".locationconnectionid) Filter: (((direction = 'D'::text) OR (direction = 'B'::text)) AND (deletiondate IS NULL)) -> Hash Join (cost=2.14..5.74 rows=2 width=24) (actual time=0.058..39.116 rows=1243 loops=111) Hash Cond: ("outer".productid = "inner".productid) -> Index Scan using priceix2 on price (cost=0.00..3.41 rows=23 width=20) (actual time=0.042..25.811 rows=4402 loops=111) Index Cond: ('2004-06-01'::date <= validstopdate) Filter: (('2004-06-01'::date >= validstartdate) AND (deletiondate IS NULL) AND (organizationid = 1)) -> Hash (cost=2.14..2.14 rows=1 width=12) (actual time=0.132..0.132 rows=0 loops=1) -> Nested Loop (cost=0.00..2.14 rows=1 width=12) (actual time=0.088..0.123 rows=1 loops=1) Join Filter: ("outer".productid = "inner".productid) -> Seq Scan on product (cost=0.00..1.04 rows=1 width=8) (actual time=0.013..0.022 rows=3 loops=1) Filter: ((organizationid = 1) AND (deletiondate IS NULL)) -> Seq Scan on productcodealias productalias (cost=0.00..1.09 rows=1 width=4) (actual time=0.022..0.024 rows=1 loops=3) Filter: (productcode = 'TAXI'::text) -> Index Scan using locationiataix4 on locationiata la (cost=0.00..3.66 rows=1 width=4) (actual time=0.052..0.052 rows=0 loops=47) Index Cond: ("outer".locationid = la.locationid) Filter: ((iatacode = 'KB8'::text) AND (deletiondate IS NULL)) Total runtime: 4620.852 ms If I do an "set enable_nestloop = 0" and run the exact same question I get: QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------- Hash Join (cost=150.55..154.20 rows=1 width=8) (actual time=68.095..103.006 rows=1 loops=1) Hash Cond: ("outer".locationid = "inner".locationid) -> Hash Join (cost=145.65..149.29 rows=1 width=12) (actual time=66.304..102.709 rows=47 loops=1) Hash Cond: (("outer".productid = "inner".productid) AND ("outer".pricegroupid = "inner".pricegroupid)) -> Hash Join (cost=2.15..5.75 rows=2 width=24) (actual time=0.343..38.844 rows=1243 loops=1) Hash Cond: ("outer".productid = "inner".productid) -> Index Scan using priceix2 on price (cost=0.00..3.41 rows=23 width=20) (actual time=0.053..26.225 rows=4402 loops=1) Index Cond: ('2004-06-01'::date <= validstopdate) Filter: (('2004-06-01'::date >= validstartdate) AND (deletiondate IS NULL) AND (organizationid = 1)) -> Hash (cost=2.14..2.14 rows=1 width=12) (actual time=0.191..0.191 rows=0 loops=1) -> Hash Join (cost=1.09..2.14 rows=1 width=12) (actual time=0.174..0.184 rows=1 loops=1) Hash Cond: ("outer".productid = "inner".productid) -> Seq Scan on product (cost=0.00..1.04 rows=1 width=8) (actual time=0.017..0.025 rows=3 loops=1) Filter: ((organizationid = 1) AND (deletiondate IS NULL)) -> Hash (cost=1.09..1.09 rows=1 width=4) (actual time=0.073..0.073 rows=0 loops=1) -> Seq Scan on productcodealias productalias (cost=0.00..1.09 rows=1 width=4) (actual time=0.042..0.045 rows=1 loops=1) Filter: (productcode = 'TAXI'::text) -> Hash (cost=143.50..143.50 rows=1 width=12) (actual time=61.650..61.650 rows=0 loops=1) -> Merge Join (cost=1.81..143.50 rows=1 width=12) (actual time=59.914..61.419 rows=111 loops=1) Merge Cond: ("outer".locationconnectionid = "inner".locationconnectionid) -> Index Scan using priceavailableix1 on priceavailable pa (cost=0.00..141.57 rows=43 width=16) (actual time=0.048..48.739 rows=6525 loops=1) Filter: (((direction = 'D'::text) OR (direction = 'B'::text)) AND (deletiondate IS NULL)) -> Sort (cost=1.81..1.81 rows=1 width=4) (actual time=0.215..0.290 rows=1 loops=1) Sort Key: lc.locationconnectionid -> Seq Scan on locationconnection lc (cost=0.00..1.80 rows=1 width=4) (actual time=0.125..0.194 rows=1 loops=1) Filter: ((locationconnectioncode = 'RNB'::text) AND (deletiondate IS NULL)) -> Hash (cost=4.89..4.89 rows=1 width=4) (actual time=0.122..0.122 rows=0 loops=1) -> Index Scan using locationiataix1 on locationiata la (cost=0.00..4.89 rows=1 width=4) (actual time=0.108..0.112 rows=1 loops=1) Index Cond: (iatacode = 'KB8'::text) Filter: (deletiondate IS NULL) Total runtime: 103.563 ms Which is a lot faster than the original question. If I then enable nestloop again and lower the geqo_threshold from 11 (default) to 5 and run the same query again I get: QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=7.72..17.19 rows=1 width=8) (actual time=9.752..42.278 rows=1 loops=1) -> Nested Loop (cost=7.72..13.52 rows=1 width=12) (actual time=5.698..41.079 rows=47 loops=1) Join Filter: ("outer".productid = "inner".productid) -> Nested Loop (cost=7.72..12.47 rows=1 width=28) (actual time=5.663..39.992 rows=47 loops=1) Join Filter: ("inner".productid = "outer".productid) -> Hash Join (cost=7.72..11.37 rows=1 width=24) (actual time=5.470..36.848 rows=111 loops=1) Hash Cond: (("outer".pricegroupid = "inner".pricegroupid) AND ("outer".productid = "inner".productid)) -> Index Scan using priceix2 on price (cost=0.00..3.41 rows=23 width=20) (actual time=0.050..26.475 rows=4402 loops=1) Index Cond: ('2004-06-01'::date <= validstopdate) Filter: (('2004-06-01'::date >= validstartdate) AND (deletiondate IS NULL) AND (organizationid = 1)) -> Hash (cost=7.72..7.72 rows=1 width=12) (actual time=1.809..1.809 rows=0 loops=1) -> Nested Loop (cost=0.00..7.72 rows=1 width=12) (actual time=0.253..1.587 rows=111 loops=1) -> Seq Scan on locationconnection lc (cost=0.00..1.80 rows=1 width=4) (actual time=0.163..0.234 rows=1 loops=1) Filter: ((locationconnectioncode = 'RNB'::text) AND (deletiondate IS NULL)) -> Index Scan using priceavailableix1 on priceavailable pa (cost=0.00..5.91 rows=1 width=16) (actual time=0.070..0.965 rows=111 loops=1) Index Cond: (pa.locationconnectionid = "outer".locationconnectionid) Filter: (((direction = 'D'::text) OR (direction = 'B'::text)) AND (deletiondate IS NULL)) -> Seq Scan on productcodealias productalias (cost=0.00..1.09 rows=1 width=4) (actual time=0.020..0.022 rows=1 loops=111) Filter: (productcode = 'TAXI'::text) -> Seq Scan on product (cost=0.00..1.04 rows=1 width=8) (actual time=0.005..0.013 rows=3 loops=47) Filter: ((organizationid = 1) AND (deletiondate IS NULL)) -> Index Scan using locationiataix4 on locationiata la (cost=0.00..3.66 rows=1 width=4) (actual time=0.022..0.022 rows=0 loops=47) Index Cond: ("outer".locationid = la.locationid) Filter: ((iatacode = 'KB8'::text) AND (deletiondate IS NULL)) Total runtime: 42.852 ms Which is even faster than disable:ing nestloop. So my question is why is the planner making such a bad choice and how can I make it choose a better plan? /Mikael
pgsql-performance by date: