Planner problem - Mailing list pgsql-performance

From Mikael Kjellström
Subject Planner problem
Date
Msg-id 40C066C1.2090808@mksoft.nu
Whole thread Raw
Responses Re: Planner problem
List pgsql-performance
Hi!

I am having a bit of problem with the plan that the planner produces.

First a little info:

OS: Linux Debian
Version: PostgreSQL 7.4.2 on i686-pc-linux-gnu, compiled by GCC gcc GCC)
3.3.3 (Debian 20040401)
CPU: AMD Athlon XP 2000+
Memory: 1GB
Disk: 4 SCSI-3 UW 10000rpm in RAID 1 (mirror) mode

shared_mem: 20000
Sort mem: 8192
effective_cache_size: 80000


Row count for the tables involved:
priceavailable: 8564
product: 3
productCodeAlias: 7
locationIATA: 3402
locationConnection: 64
price: 4402


I have runned both vacuum full and vacuum analyze before I run my query
and shared_mem is set to 20000


Here is the result I am getting:

EXPLAIN ANALYZE
SELECT
   CASE WHEN 'D' = 'A' THEN price.priceArrival ELSE price.priceDeparture
END AS price,
   price.vatPercent AS vatPercent
FROM
   priceavailable pa,
   product product,
   productCodeAlias productAlias,
   locationiata la,
   locationconnection lc,
   price price
WHERE
   pa.direction IN('D', 'B')
   AND pa.productId = product.productId
   AND product.productId = productAlias.productId
   AND productAlias.productCode = 'TAXI'
   AND pa.locationConnectionId = lc.locationConnectionId
   AND lc.locationConnectionCode = 'RNB'
   AND pa.locationId = la.locationId
   AND la.iataCode = 'KB8'
   AND price.pricegroupId = pa.priceGroupId
   AND price.productId = product.productId
   AND '2004-06-01 05:30:00.000000+02' BETWEEN price.validstartdate AND
price.validstopdate
   AND product.organizationId = 1
   AND price.organizationId = product.organizationId
   AND pa.deletionDate IS NULL
   AND product.deletionDate IS NULL
   AND la.deletionDate IS NULL
   AND lc.deletionDate IS NULL
   AND price.deletionDate IS NULL
;

  QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=2.14..17.16 rows=1 width=8) (actual
time=3274.196..4620.299 rows=1 loops=1)
    ->  Nested Loop  (cost=2.14..13.49 rows=1 width=12) (actual
time=2643.441..4617.246 rows=47 loops=1)
          Join Filter: (("outer".productid = "inner".productid) AND
("inner".pricegroupid = "outer".pricegroupid))
          ->  Nested Loop  (cost=0.00..7.72 rows=1 width=12) (actual
time=0.238..5.455 rows=111 loops=1)
                ->  Seq Scan on locationconnection lc  (cost=0.00..1.80
rows=1 width=4) (actual time=0.153..0.245 rows=1 loops=1)
                      Filter: ((locationconnectioncode = 'RNB'::text)
AND (deletiondate IS NULL))
                ->  Index Scan using priceavailableix1 on priceavailable
pa  (cost=0.00..5.91 rows=1 width=16) (actual time=0.067..4.182 rows=111
loops=1)
                      Index Cond: (pa.locationconnectionid =
"outer".locationconnectionid)
                      Filter: (((direction = 'D'::text) OR (direction =
'B'::text)) AND (deletiondate IS NULL))
          ->  Hash Join  (cost=2.14..5.74 rows=2 width=24) (actual
time=0.058..39.116 rows=1243 loops=111)
                Hash Cond: ("outer".productid = "inner".productid)
                ->  Index Scan using priceix2 on price  (cost=0.00..3.41
rows=23 width=20) (actual time=0.042..25.811 rows=4402 loops=111)
                      Index Cond: ('2004-06-01'::date <= validstopdate)
                      Filter: (('2004-06-01'::date >= validstartdate)
AND (deletiondate IS NULL) AND (organizationid = 1))
                ->  Hash  (cost=2.14..2.14 rows=1 width=12) (actual
time=0.132..0.132 rows=0 loops=1)
                      ->  Nested Loop  (cost=0.00..2.14 rows=1 width=12)
(actual time=0.088..0.123 rows=1 loops=1)
                            Join Filter: ("outer".productid =
"inner".productid)
                            ->  Seq Scan on product  (cost=0.00..1.04
rows=1 width=8) (actual time=0.013..0.022 rows=3 loops=1)
                                  Filter: ((organizationid = 1) AND
(deletiondate IS NULL))
                            ->  Seq Scan on productcodealias
productalias  (cost=0.00..1.09 rows=1 width=4) (actual time=0.022..0.024
rows=1 loops=3)
                                  Filter: (productcode = 'TAXI'::text)
    ->  Index Scan using locationiataix4 on locationiata la
(cost=0.00..3.66 rows=1 width=4) (actual time=0.052..0.052 rows=0 loops=47)
          Index Cond: ("outer".locationid = la.locationid)
          Filter: ((iatacode = 'KB8'::text) AND (deletiondate IS NULL))
  Total runtime: 4620.852 ms


If I do an "set enable_nestloop = 0" and run the exact same question I get:

      QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Hash Join  (cost=150.55..154.20 rows=1 width=8) (actual
time=68.095..103.006 rows=1 loops=1)
    Hash Cond: ("outer".locationid = "inner".locationid)
    ->  Hash Join  (cost=145.65..149.29 rows=1 width=12) (actual
time=66.304..102.709 rows=47 loops=1)
          Hash Cond: (("outer".productid = "inner".productid) AND
("outer".pricegroupid = "inner".pricegroupid))
          ->  Hash Join  (cost=2.15..5.75 rows=2 width=24) (actual
time=0.343..38.844 rows=1243 loops=1)
                Hash Cond: ("outer".productid = "inner".productid)
                ->  Index Scan using priceix2 on price  (cost=0.00..3.41
rows=23 width=20) (actual time=0.053..26.225 rows=4402 loops=1)
                      Index Cond: ('2004-06-01'::date <= validstopdate)
                      Filter: (('2004-06-01'::date >= validstartdate)
AND (deletiondate IS NULL) AND (organizationid = 1))
                ->  Hash  (cost=2.14..2.14 rows=1 width=12) (actual
time=0.191..0.191 rows=0 loops=1)
                      ->  Hash Join  (cost=1.09..2.14 rows=1 width=12)
(actual time=0.174..0.184 rows=1 loops=1)
                            Hash Cond: ("outer".productid =
"inner".productid)
                            ->  Seq Scan on product  (cost=0.00..1.04
rows=1 width=8) (actual time=0.017..0.025 rows=3 loops=1)
                                  Filter: ((organizationid = 1) AND
(deletiondate IS NULL))
                            ->  Hash  (cost=1.09..1.09 rows=1 width=4)
(actual time=0.073..0.073 rows=0 loops=1)
                                  ->  Seq Scan on productcodealias
productalias  (cost=0.00..1.09 rows=1 width=4) (actual time=0.042..0.045
rows=1 loops=1)
                                        Filter: (productcode = 'TAXI'::text)
          ->  Hash  (cost=143.50..143.50 rows=1 width=12) (actual
time=61.650..61.650 rows=0 loops=1)
                ->  Merge Join  (cost=1.81..143.50 rows=1 width=12)
(actual time=59.914..61.419 rows=111 loops=1)
                      Merge Cond: ("outer".locationconnectionid =
"inner".locationconnectionid)
                      ->  Index Scan using priceavailableix1 on
priceavailable pa  (cost=0.00..141.57 rows=43 width=16) (actual
time=0.048..48.739 rows=6525 loops=1)
                            Filter: (((direction = 'D'::text) OR
(direction = 'B'::text)) AND (deletiondate IS NULL))
                      ->  Sort  (cost=1.81..1.81 rows=1 width=4) (actual
time=0.215..0.290 rows=1 loops=1)
                            Sort Key: lc.locationconnectionid
                            ->  Seq Scan on locationconnection lc
(cost=0.00..1.80 rows=1 width=4) (actual time=0.125..0.194 rows=1 loops=1)
                                  Filter: ((locationconnectioncode =
'RNB'::text) AND (deletiondate IS NULL))
    ->  Hash  (cost=4.89..4.89 rows=1 width=4) (actual time=0.122..0.122
rows=0 loops=1)
          ->  Index Scan using locationiataix1 on locationiata la
(cost=0.00..4.89 rows=1 width=4) (actual time=0.108..0.112 rows=1 loops=1)
                Index Cond: (iatacode = 'KB8'::text)
                Filter: (deletiondate IS NULL)
Total runtime: 103.563 ms

Which is a lot faster than the original question.

If I then enable nestloop again and lower the geqo_threshold from 11
(default) to 5 and run the same query again I get:


           QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=7.72..17.19 rows=1 width=8) (actual
time=9.752..42.278 rows=1 loops=1)
    ->  Nested Loop  (cost=7.72..13.52 rows=1 width=12) (actual
time=5.698..41.079 rows=47 loops=1)
          Join Filter: ("outer".productid = "inner".productid)
          ->  Nested Loop  (cost=7.72..12.47 rows=1 width=28) (actual
time=5.663..39.992 rows=47 loops=1)
                Join Filter: ("inner".productid = "outer".productid)
                ->  Hash Join  (cost=7.72..11.37 rows=1 width=24)
(actual time=5.470..36.848 rows=111 loops=1)
                      Hash Cond: (("outer".pricegroupid =
"inner".pricegroupid) AND ("outer".productid = "inner".productid))
                      ->  Index Scan using priceix2 on price
(cost=0.00..3.41 rows=23 width=20) (actual time=0.050..26.475 rows=4402
loops=1)
                            Index Cond: ('2004-06-01'::date <=
validstopdate)
                            Filter: (('2004-06-01'::date >=
validstartdate) AND (deletiondate IS NULL) AND (organizationid = 1))
                      ->  Hash  (cost=7.72..7.72 rows=1 width=12)
(actual time=1.809..1.809 rows=0 loops=1)
                            ->  Nested Loop  (cost=0.00..7.72 rows=1
width=12) (actual time=0.253..1.587 rows=111 loops=1)
                                  ->  Seq Scan on locationconnection lc
  (cost=0.00..1.80 rows=1 width=4) (actual time=0.163..0.234 rows=1 loops=1)
                                        Filter: ((locationconnectioncode
= 'RNB'::text) AND (deletiondate IS NULL))
                                  ->  Index Scan using priceavailableix1
on priceavailable pa  (cost=0.00..5.91 rows=1 width=16) (actual
time=0.070..0.965 rows=111 loops=1)
                                        Index Cond:
(pa.locationconnectionid = "outer".locationconnectionid)
                                        Filter: (((direction =
'D'::text) OR (direction = 'B'::text)) AND (deletiondate IS NULL))
                ->  Seq Scan on productcodealias productalias
(cost=0.00..1.09 rows=1 width=4) (actual time=0.020..0.022 rows=1 loops=111)
                      Filter: (productcode = 'TAXI'::text)
          ->  Seq Scan on product  (cost=0.00..1.04 rows=1 width=8)
(actual time=0.005..0.013 rows=3 loops=47)
                Filter: ((organizationid = 1) AND (deletiondate IS NULL))
    ->  Index Scan using locationiataix4 on locationiata la
(cost=0.00..3.66 rows=1 width=4) (actual time=0.022..0.022 rows=0 loops=47)
          Index Cond: ("outer".locationid = la.locationid)
          Filter: ((iatacode = 'KB8'::text) AND (deletiondate IS NULL))
Total runtime: 42.852 ms

Which is even faster than disable:ing nestloop.

So my question is why is the planner making such a bad choice and how
can I make it choose a better plan?


/Mikael


pgsql-performance by date:

Previous
From: Greg Stark
Date:
Subject: Re: Most transactions per second on largest box?
Next
From: Tom Lane
Date:
Subject: Re: Planner problem