Re: random_page_cost vs seq_page_cost - Mailing list pgsql-hackers

From Benedikt Grundmann
Subject Re: random_page_cost vs seq_page_cost
Date
Msg-id 20120111082651.GV6419@ldn-qws-004.delacy.com
Whole thread Raw
In response to Re: random_page_cost vs seq_page_cost  (Greg Smith <greg@2ndQuadrant.com>)
Responses Re: random_page_cost vs seq_page_cost
Re: random_page_cost vs seq_page_cost
List pgsql-hackers
(replying just to you)
On 10/01/12 15:22, Greg Smith wrote:
> On 1/5/12 5:04 AM, Benedikt Grundmann wrote:
> That sort of thing is one reason why all attempts so far to set
> random_page_cost based on physical characteristics haven't gone
> anywhere useful.  The setting is sort of overloaded right now, it's a
> fuzzy mix of true random seek cost blended with some notion of cache
> percentage. Trying to bring some measurements to bear on it is a less
> effective approach than what people actually do here.  Monitor the
> profile of query execution, change the value, see what happens.  Use
> that as feedback for what direction to keep going; repeat until
> you're just spinning with no improvements.
> 
Thank you very much for the reply it is very interesting.  I'm
excited to hear that documentation in that area will improve in
9.2.  It's interesting postgres has remarkable good documentation
but it is a sufficiently complex system that to actually sensible
tune the knobs provided you have to understand quite a lot about
what is going on.  A colleague of mine likes to say 
"all abstractions leak", which seems very appropriate in this case.

> >We are not sure if the database used to choose differently
> >before the move to the new hardware and the hardware is
> >performing worse for random seeks.  Or if the planner is
> >now making different choices.
> 
> I don't recommend ever deploying new hardware without first doing
> some low-level benchmarks to validate its performance.  Once stuff
> goes into production, you can't do that anymore.  See
> http://www.2ndquadrant.com/en/talks/ for my hardware benchmarking
> talks if you'd like some ideas on what to collect.
> 
We had actually done lots of tests on the sequential read performance.
But you are right we could have done better (and I'll definitely read
through your talks).

Did you see my follow up?  Based on the feedback we did further tests
and It is now clear that neither the hardware nor the database version 
are at fault.  A different plan is chosen by both new and old database 
version if spun up on the database as it is right now.

Our best guess is that the clusters we run after we had moved to the
hardware (it having more diskspace and faster sequential I/O making
it possible) changed the planners perception of how the joins will
perform in relation to each other.

Cheers,

Bene


pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: JSON for PG 9.2
Next
From: Benedikt Grundmann
Date:
Subject: Re: random_page_cost vs seq_page_cost