Home > mailing lists

Re: Cluster vs. non-cluster query planning - Mailing list pgsql-performance

From	Nolan Cafferky
Subject	Re: Cluster vs. non-cluster query planning
Date	May 2, 2006 00:02:15
Msg-id	4456A177.3020505@rbsinteractive.com Whole thread Raw
In response to	Re: Cluster vs. non-cluster query planning (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-performance

Tree view

Tom Lane wrote:

Nolan Cafferky <Nolan.Cafferky@rbsinteractive.com> writes:

But, I'm guessing that random_page_cost = 1 is not a realistic value.


Well, that depends.  If all your data can be expected to fit in memory
then it is a realistic value.  (If not, you should be real careful not
to make performance decisions on the basis of test cases that *do* fit
in RAM...)

In any case, if I recall your numbers correctly you shouldn't need to
drop it nearly that far to get the thing to make the right choice.
A lot of people run with random_page_cost set to 2 or so.

Thanks for the advice. I will check what changing random_page_cost does for the rest of the queries on our system. I did learn why the estimated row count was so high. This is new knowledge to me, so I'm going to share it. SELECT reltuples FROM pg_class WHERE relname = 'orders'; -> produces 98426. SELECT n_distinct FROM pg_stats WHERE tablename = 'orders' and attname = 'order_statuses_id'; -> currently 13. Seq Scan on orders o (cost=1.20..11395.53 rows=7570 width=8) (actual time=283.599..285.031 rows=105 loops=1) Filter: (order_statuses_id = $0) InitPlan -> Seq Scan on order_statuses (cost=0.00..1.20 rows=1 width=4) (actual time=0.031..0.038 rows=1 loops=1) Filter: ((id_name)::text = 'new'::text) Total runtime: 285.225 ms (98426 / 13)::integer = 7571 ~= 7570, the estimated row count. So the query planner isn't able to combine the knowledge of the id value from order_statuses with most_common_vals, most_common_freqs, or histogram_bounds from pg_stats. That seems a little odd to me, but maybe it makes sense. I suppose the planner can't start executing parts of the query to aid in the planning process. In the future, I will probably pre-select from order_statuses before executing this query. Thanks!

pgsql-performance by date:

From: Tom Lane
Date: 01 May 2006, 23:35:14
Subject: Re: Cluster vs. non-cluster query planning

From: "Mark Liberman"
Date: 02 May 2006, 02:14:40
Subject: Why is plan (and performance) different on partitioned table?

Re: Cluster vs. non-cluster query planning - Mailing list pgsql-performance

Previous

Next