Re: Mini improvement: statement_cost_limit - Mailing list pgsql-hackers

From Greg Smith
Subject Re: Mini improvement: statement_cost_limit
Date
Msg-id Pine.GSO.4.64.0808041358470.29907@westnet.com
Whole thread Raw
In response to Re: Mini improvement: statement_cost_limit  (daveg <daveg@sonic.net>)
Responses Re: Mini improvement: statement_cost_limit  (daveg <daveg@sonic.net>)
List pgsql-hackers
On Mon, 4 Aug 2008, daveg wrote:

> On Sun, Aug 03, 2008 at 10:57:55PM -0400, Robert Treat wrote:
>> Not such a great argument. Cost models on development servers can and often
>> are quite different from those on production, so you might be putting an
>> artifical limit on top of your developers.
>
> We load the production dumps into our dev environment, which are the same
> hardware spec, so the costs should be identical.

Not identical, just close.  ANALYZE samples data from your table randomly. 
The statistics used to compute the costs will therefore be slightly 
different on the two servers even if the data is the same.  The problem of 
discovering one plan on production and another on development is not quite 
that easy to remove.  Ultimately, if your developers aren't thorough 
enough to do thinks like look at EXPLAIN plans enough to discover things 
that are just bad, I just chuckle at your thinking that putting a single 
limiter on their bad behavior will somehow magically make that better.

Anyway, if your production server is small enough that you can afford to 
have another one just like it for the developers to work on, that's great. 
Robert's point is that many installs don't work like that.  The 
development teams in lots of places only get a subset of the production 
data because it's too large to deploy on anything but a big server, which 
often is hard to cost justify buying just for development purposes.

I like the concept of a cost limit, but I'm a bit horrified by the thought 
of it being exposed simply through the internal cost numbers because they 
are so arbitrary.  One of the endless projects I think about but never 
start coding is to write something that measures the things the planner 
cost constants estimate on a particular machine, so that all those numbers 
actually can be tied to some real-world time measure.  If you did that, 
you'd actually have a shot at accomplishing the real goal here, making 
statement_cost_limit cut off statements expected to take longer than 
statement_timeout before they even get started.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD


pgsql-hackers by date:

Previous
From: "David Blewett"
Date:
Subject: Re: PL/Python
Next
From: Hannu Krosing
Date:
Subject: Re: PL/Python