Re: More thoughts about planner's cost estimates - Mailing list pgsql-hackers

From: Tom Lane
Subject: Re: More thoughts about planner's cost estimates
Msg-id: 11319.1149287800@sss.pgh.pa.us
In response to: More thoughts about planner's cost estimates (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: More thoughts about planner's cost estimates
           Re: More thoughts about planner's cost estimates
List: pgsql-hackers
I wrote:
> In general it seems to me that for CPU-bound databases, the default values
> of the cpu_xxx_cost variables are too low.  ... rather than telling people
> to manipulate all three of these variables individually, I think it might
> also be a good idea to provide a new GUC variable named something like
> "cpu_speed_scale" that would just be a multiplier for the other variables.
> It would default to 1.0 and our standard advice for CPU-bound databases
> would be "decrease random_page_cost to 1.0 and raise cpu_speed_scale to
> 10.0 or so".  Seems cleaner than telling people to muck with three or so
> individual values.
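
For concreteness, the advice under that scheme would have amounted to a
postgresql.conf fragment along these lines (cpu_speed_scale being the
hypothetical new knob, not an existing setting):

    # sketch of the earlier proposal; cpu_speed_scale would multiply
    # cpu_tuple_cost, cpu_index_tuple_cost and cpu_operator_cost
    random_page_cost = 1.0
    cpu_speed_scale  = 10.0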

Nicolai Petri's comment about per-tablespace access costs caused me to
rethink the above proposal.  Instead of inventing "cpu_speed_scale",
which seems rather baroque after thinking about it more, what I now
think we should do is invent a "seq_page_cost" GUC to replace the
traditionally hardwired value of 1.0 cost unit per sequential page
fetch.  Then, if you've got different tablespaces with different disk
speeds, you could imagine having per-tablespace values of seq_page_cost
and random_page_cost, whereas you probably want the CPU cost numbers
to remain the same across all tables.
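
Concretely, the page-fetch side of the cost model would then be driven by
two ordinary GUCs instead of one GUC plus a hardwired constant; a sketch,
using the current default for random_page_cost:

    # seq_page_cost is the proposed GUC; 1.0 matches the old hardwired value
    seq_page_cost    = 1.0
    random_page_cost = 4.0
    # cpu_tuple_cost, cpu_index_tuple_cost, cpu_operator_cost stay global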

I don't really want to get into inventing per-tablespace settings right
now, because the need hasn't been demonstrated; but if we ever do want
to do it, this approach will be a whole lot less confusing than
something involving a cpu_speed_scale knob.
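
(If we ever did go there, one could imagine per-tablespace overrides along
the lines of the purely illustrative sketch below; this is not a syntax
proposal, just to show which knobs would move and which would stay global:)

    -- hypothetical per-tablespace override; syntax is illustrative only
    ALTER TABLESPACE fastdisk
        SET (seq_page_cost = 0.5, random_page_cost = 1.0);
    -- the cpu_*_cost settings would remain cluster-wide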

This still leaves you twiddling two knobs (now random_page_cost and
seq_page_cost) if you want to set up the planner for an all-in-memory
database, but that's the same number of knobs as under the cpu_speed_scale
proposal, so it's not any more complicated for that purpose.
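
For example, the all-in-memory case would presumably end up looking
something like this (illustrative values, giving roughly the same relative
weighting as the "raise CPU costs 10x" advice quoted above, but expressed
by cheapening page fetches instead):

    # sketch: fully cached database
    seq_page_cost    = 0.1
    random_page_cost = 0.1    # random access no worse than sequential
    # cpu_*_cost settings left at their defaults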

One objection to this is that after moving "off the gold standard" of
1.0 = one page fetch, there is no longer any clear meaning to the
cost estimate units; you're faced with the fact that they're just an
arbitrary scale.  I'm not sure that's such a bad thing, though.  For
instance, some people might want to try to tune their settings so that
the estimates are actually comparable to milliseconds of real time.
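
As a sketch of that sort of calibration (the timings are invented for
illustration): if a sequential fetch of one 8K page takes about 1 ms on a
given machine and processing one tuple costs about 10 microseconds of CPU,
then

    # calibrating cost units to roughly one millisecond each
    seq_page_cost    = 1.0      # ~1 ms per sequential page read
    random_page_cost = 8.0      # ~8 ms per random page read
    cpu_tuple_cost   = 0.01     # ~10 microseconds per tuple

would make the planner's estimates read approximately as milliseconds.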

Comments?
        regards, tom lane

