Re: Cluster vs. non-cluster query planning - Mailing list pgsql-performance

From Nolan Cafferky
Subject Re: Cluster vs. non-cluster query planning
Date
Msg-id 44566612.8020709@rbsinteractive.com
Whole thread Raw
In response to Cluster vs. non-cluster query planning  (Nolan Cafferky <Nolan.Cafferky@rbsinteractive.com>)
Responses Re: Cluster vs. non-cluster query planning  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
> Questions:
> * What can I do to reduce the estimated row count on the query?
> * Why does clustering drive down the estimated cost for the index scan
> so much? Does a change in correlation from .72 to 1 make that much of
> a difference?
> * Can I convince my query planner to index scan without clustering on
> the order_statuses_id index, or setting enable_seqscan = off?


After some more digging on the mailing list, I found some comments on
effective_cache_size.  Bringing it up from the default of 1000 does pust
the estimated cost for the index scan below that of the sequential scan,
but not by much.

With SET effective_cache_size = 1000:
 Seq Scan on orders o  (cost=1.20..11395.53 rows=7029 width=8) (actual
time=280.148..281.512 rows=105 loops=1)
   Filter: (order_statuses_id = $0)
   InitPlan
     ->  Seq Scan on order_statuses  (cost=0.00..1.20 rows=1 width=4)
(actual time=0.012..0.020 rows=1 loops=1)
           Filter: ((id_name)::text = 'new'::text)
 Total runtime: 281.700 ms

With SET effective_cache_size = 10000:
 Index Scan using orders_status_btree_idx on orders o
(cost=1.20..9710.91 rows=7029 width=8) (actual time=0.050..0.372
rows=105 loops=1)
   Index Cond: (order_statuses_id = $0)
   InitPlan
     ->  Seq Scan on order_statuses  (cost=0.00..1.20 rows=1 width=4)
(actual time=0.016..0.024 rows=1 loops=1)
           Filter: ((id_name)::text = 'new'::text)

The ratios between estimated costs are still nowhere near the ratio of
actual costs.

pgsql-performance by date:

Previous
From: Erik Myllymaki
Date:
Subject: Re: hardare config question
Next
From: Tom Lane
Date:
Subject: Re: Cluster vs. non-cluster query planning