Re: reducing random_page_cost from 4 to 2 to force index scan - Mailing list pgsql-performance

From Robert Haas
Subject Re: reducing random_page_cost from 4 to 2 to force index scan
Msg-id BANLkTik6OaVkJGEjs5--vcynUE36mWbzbg@mail.gmail.com
In response to Re: reducing random_page_cost from 4 to 2 to force index scan  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-performance
On Tue, Apr 26, 2011 at 9:04 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Tue, Apr 26, 2011 at 4:37 PM, Kevin Grittner
> <Kevin.Grittner@wicourts.gov> wrote:
>> Sok Ann Yap <sokann@gmail.com> wrote:
>>
>>> So, index scan wins by a very small margin over sequential scan
>>> after the tuning. I am a bit puzzled because index scan is more
>>> than 3000 times faster in this case, but the estimated costs are
>>> about the same. Did I do something wrong?
>>
>> Tuning is generally needed to get best performance from PostgreSQL.
>> Needing to reduce random_page_cost is not unusual in situations
>> where a good portion of the active data is in cache (between
>> shared_buffers and the OS cache).  Please show us your overall
>> configuration and give a description of the hardware (how many of
>> what kind of cores, how much RAM, what sort of storage system).  The
>> configuration part can be obtained by running the query on this page
>> and pasting the result into your next post:
>>
>> http://wiki.postgresql.org/wiki/Server_Configuration
>>
>> There are probably some other configuration adjustments you could do
>> to ensure that good plans are chosen.
>
> The very first thing to check is effective_cache_size and to set it to
> a reasonable value.

Actually, effective_cache_size has no impact on costing except when
planning a nested loop with inner index scan.  So, a query against a
single table can never benefit from changing that setting.  Kevin's
suggestion of adjusting seq_page_cost and random_page_cost is the way
to go.
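
To make that concrete, here is a minimal sketch of the kind of
session-level experiment Kevin is suggesting (the table and column
names are made up, and the values are starting points to test with
EXPLAIN ANALYZE, not recommendations):

-- Try telling the planner that a random page fetch costs only twice
-- as much as a sequential one, on the theory that much of the active
-- data is already in cache.
SET seq_page_cost = 1.0;      -- the default, kept as the baseline
SET random_page_cost = 2.0;   -- the default is 4.0
EXPLAIN ANALYZE SELECT * FROM some_table WHERE some_col = 42;
-- If the index plan now wins, the settings can be made permanent in
-- postgresql.conf, or set per-tablespace on 9.0 and later:
-- ALTER TABLESPACE fast_disks SET (random_page_cost = 2.0);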

We've talked in the past (and I still think it's a good idea, but
haven't gotten around to doing anything about it) about adjusting the
planner to attribute to each relation the percentage of its pages
which we believe we'll find in cache.  Although many complicated ideas
for determining that percentage have been proposed, my favorite one is
fairly simple: assume that small relations will be mostly or entirely
cached, and that big ones won't.  Allow the administrator to
override the result on a per-relation basis.  It's difficult to
imagine a situation where the planner should assume that a relation
with only a handful of pages isn't going to be cached.  Even if it
isn't, as soon as someone begins accessing it, it will be.
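
For what it's worth, a rough version of that percentage can already
be measured by hand with the contrib module pg_buffercache, which
reports how much of a relation is sitting in shared_buffers (it can't
see the OS cache, and 'some_table' is again just a placeholder):

SELECT c.relname,
       count(*) AS buffered_pages,
       round(100.0 * count(*) / nullif(c.relpages, 0), 1) AS pct_cached
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                            WHERE datname = current_database()))
  AND c.relname = 'some_table'
GROUP BY c.relname, c.relpages;

relpages comes from the last ANALYZE, so the percentage is only
approximate, but it's enough to sanity-check the "small relations
stay cached" assumption against a real workload.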

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
