Thread: Mini improvement: statement_cost_limit
hello ... i picked up csaba nagy's idea and implemented a very simple yet very useful extension. i introduced a GUC called statement_cost_limit which can be used to error out if a statement is expected to be too expensive. the advantage over statement_timeout is that we are actually able to error out before spending many seconds on a query that statement_timeout would kill anyway. best regards, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: www.postgresql-support.de
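The posted patch itself is not reproduced in the archive. Purely as an illustration of the idea, a check of this shape could run once planning is complete; the GUC name, error code and placement below are assumptions, not details taken from the actual patch.

    #include "postgres.h"
    #include "nodes/plannodes.h"
    #include "utils/errcodes.h"

    /* Illustrative only -- not the posted patch. 0 disables the check. */
    double statement_cost_limit = 0.0;

    /*
     * Reject a finished plan whose estimated total cost exceeds the limit,
     * before any executor work has been done.
     */
    static void
    check_statement_cost(PlannedStmt *stmt)
    {
        if (statement_cost_limit > 0.0 &&
            stmt->planTree->total_cost > statement_cost_limit)
            ereport(ERROR,
                    (errcode(ERRCODE_STATEMENT_TOO_COMPLEX),
                     errmsg("estimated statement cost %.0f exceeds statement_cost_limit %.0f",
                            stmt->planTree->total_cost, statement_cost_limit)));
    }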
Hans-Jürgen Schönig wrote: > hello ... > > i picked up csaba nagy's idea and implemented a very simple yet very > useful extension. > i introduced a GUC called statement_cost_limit which can be used to > error out if a statement is expected to be too expensive. > the advantage over statement_timeout is that we are actually able to > error out before spending many seconds which is killed by > statement_timeout anyway. > > > You clearly have far more faith in the cost estimates than I do. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Hans-Jürgen Schönig wrote: >> i introduced a GUC called statement_cost_limit which can be used to >> error out if a statement is expected to be too expensive. > You clearly have far more faith in the cost estimates than I do. Wasn't this exact proposal discussed and rejected awhile back? regards, tom lane
On Aug 2, 2008, at 8:38 PM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> Hans-Jürgen Schönig wrote: >>> i introduced a GUC called statement_cost_limit which can be used to >>> error out if a statement is expected to be too expensive. > >> You clearly have far more faith in the cost estimates than I do. > > Wasn't this exact proposal discussed and rejected awhile back? > > regards, tom lane > i don't remember precisely. i have seen it on simon's wiki page and it is something which would have been useful in some cases in the past. many thanks, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: www.postgresql-support.de
On Sat, Aug 02, 2008 at 09:30:08PM +0200, Hans-Jürgen Schönig wrote: > On Aug 2, 2008, at 8:38 PM, Tom Lane wrote: > > >Andrew Dunstan <andrew@dunslane.net> writes: > >>Hans-Jürgen Schönig wrote: > >>>i introduced a GUC called statement_cost_limit which can be used to > >>>error out if a statement is expected to be too expensive. > > > >>You clearly have far more faith in the cost estimates than I do. > > > >Wasn't this exact proposal discussed and rejected awhile back? > > > > regards, tom lane > > > > > i don't remember precisely. > i have seen it on simon's wiki page and it is something which would > have been useful in some cases in the past. I think a variation on this could be very useful in development and test environments. Suppose it raised a warning or notice if the cost was over the limit. Then one could set a limit of a few million on the development and test servers and developers would at least have a clue that they needed to look at explain for that query. As it is now, one can exhort them to run explain, but it has no effect. Instead we later see queries killed by a 24 hour timeout with estimated costs ranging from "until they unplug the machine and dump it" to "until the sun turns into a red giant". -dg -- David Gould daveg@sonic.net 510 536 1443 510 282 0869 If simplicity worked, the world would be overrun with insects.
On Sun, 2008-08-03 at 00:44 -0700, daveg wrote: > On Sat, Aug 02, 2008 at 09:30:08PM +0200, Hans-Jürgen Schönig wrote: > > On Aug 2, 2008, at 8:38 PM, Tom Lane wrote: > > > > >Andrew Dunstan <andrew@dunslane.net> writes: > > >>Hans-Jürgen Schönig wrote: > > >>>i introduced a GUC called statement_cost_limit which can be used to > > >>>error out if a statement is expected to be too expensive. > > > > > >>You clearly have far more faith in the cost estimates than I do. > > > > > >Wasn't this exact proposal discussed and rejected awhile back? > > > > > i don't remember precisely. > > i have seen it on simon's wiki page and it is something which would > > have been useful in some cases in the past. I still support it. Regrettably, many SQL developers introduce product joins and other unintentional errors. Why let problem queries through? Security-wise they're great Denial of Service attacks, bringing the server to its knees better than most ways I know, in conjunction with a nice hefty work_mem setting. 27 table product joins: memory, CPU, I/O and diskspace resources used all in a simple killer query. If anybody thinks costs are inaccurate, don't use it. Or better still improve the cost models. It isn't any harder or easier to find a useful value than it is to use statement_timeout. What's the difference between picking an arbitrary time and an arbitrary cost? You need to alter the value according to people's complaints in both cases. > I think a variation on this could be very useful in development and test > environments. Suppose it raised a warning or notice if the cost was over > the limit. Then one could set a limit of a few million on the development > and test servers and developers would at least have a clue that they needed > to look at explain for that query. As it is now, one can exhort them to > run explain, but it has no effect. Instead we later see queries killed > by a 24 hour timeout with estimated costs ranging from "until they unplug > the machine and dump it" to "until the sun turns into a red giant". Great argument. So that's 4 in favour at least. A compromise would be to have log_min_statement_cost (or warn_min_statement_cost) which will at least help find these problems in testing before we put things live, but that still won't help with production issues. Another alternative would be to have a plugin that can examine the plan immediately after planner executes, so you can implement this yourself, plus some other possibilities. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
hello ... > > I still support it. Regrettably, many SQL developers introduce product > joins and other unintentional errors. Why let problem queries through? i think the killer is that we don't have to wait until the query dies with a statement_timeout. it is way more elegant to kill things before they have already eaten too many cycles. one thing which is important as well: statement_cost_limit does not kill queries which have just been waiting for a lock. this makes things slightly more predictable. > Security-wise they're great Denial of Service attacks, bringing the > server to its knees better than most ways I know, in conjunction > with a > nice hefty work_mem setting. 27 table product joins: memory, CPU, I/O > and diskspace resources used all in a simple killer query. > i am not too concerned about DoS, i have to admit. i would rather see it as a way to make developers do better things. > If anybody thinks costs are inaccurate, don't use it. Or better still > improve the cost models. It isn't any harder or easier to find a > useful > value than it is to use statement_timeout. What's the difference > between > picking an arbitrary time and an arbitrary cost? You need to alter the > value according to people's complaints in both cases. the cost model is good enough to see if something is good or bad. this is basically all we want to do here --- killing all evil. > *snip* > > > A compromise would be to have log_min_statement_cost (or > warn_min_statement_cost) which will at least help find these > problems in > testing before we put things live, but that still won't help with > production issues. > definitely. a good idea as well - but people will hardly read it, i guess :(. > Another alternative would be to have a plugin that can examine the > plan > immediately after planner executes, so you can implement this > yourself, > plus some other possibilities. > this would be really fancy. what could a plugin like that look like? hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: www.postgresql-support.de
On Sun, 2008-08-03 at 22:09 +0200, Hans-Jürgen Schönig wrote: > > Another alternative would be to have a plugin that can examine the > > plan > > immediately after planner executes, so you can implement this > > yourself, > > plus some other possibilities. > > > this would be really fancy. > how could a plugin like that look like? Hmm...thinks: exactly like the existing planner_hook(). So, rewrite this as a planner hook and submit as a contrib module. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
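For readers unfamiliar with the hook: planner_hook lets a loadable module intercept planning and inspect (or reject) the finished plan without touching core. A bare-bones sketch of such a contrib module, written against the 8.3-era hook signature, might look like the following; the module and variable names are invented for illustration, and the custom-GUC registration is omitted because that API differs between releases.

    #include "postgres.h"
    #include "fmgr.h"
    #include "optimizer/planner.h"

    PG_MODULE_MAGIC;

    /* Would normally be registered as a custom GUC; 0 disables the check. */
    static double plan_cost_limit = 0.0;

    static planner_hook_type prev_planner_hook = NULL;

    static PlannedStmt *
    limit_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
    {
        PlannedStmt *result;

        /* Let the regular planner (or any previously installed hook) plan the query. */
        if (prev_planner_hook)
            result = prev_planner_hook(parse, cursorOptions, boundParams);
        else
            result = standard_planner(parse, cursorOptions, boundParams);

        /* Refuse to run plans whose estimated total cost exceeds the limit. */
        if (plan_cost_limit > 0.0 && result->planTree->total_cost > plan_cost_limit)
            ereport(ERROR,
                    (errmsg("estimated plan cost %.0f exceeds plan_cost_limit %.0f",
                            result->planTree->total_cost, plan_cost_limit)));

        return result;
    }

    void
    _PG_init(void)
    {
        prev_planner_hook = planner_hook;
        planner_hook = limit_planner;
    }

Loaded via shared_preload_libraries (or LOAD), something like this rejects over-limit plans entirely outside core; a log-only variant would simply swap ERROR for WARNING or LOG.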
Tom, > Wasn't this exact proposal discussed and rejected awhile back? We rejected Greenplum's much more invasive resource manager, because it created a large performance penalty on small queries whether or not it was turned on. However, I don't remember any rejection of an idea as simple as a cost limit rejection. This would, IMHO, be very useful for production instances of PostgreSQL. The penalty for mis-rejection of a poorly costed query is much lower than the penalty for having a bad query eat all your CPU. -- --Josh Josh Berkus PostgreSQL San Francisco
On Sunday 03 August 2008 15:12:22 Simon Riggs wrote: > On Sun, 2008-08-03 at 00:44 -0700, daveg wrote: > > On Sat, Aug 02, 2008 at 09:30:08PM +0200, Hans-Jürgen Schönig wrote: > > > On Aug 2, 2008, at 8:38 PM, Tom Lane wrote: > > > >Andrew Dunstan <andrew@dunslane.net> writes: > > > >>Hans-Jürgen Schönig wrote: > > > >>>i introduced a GUC called statement_cost_limit which can be used to > > > >>>error out if a statement is expected to be too expensive. > > > >> > > > >>You clearly have far more faith in the cost estimates than I do. > > > > > > > >Wasn't this exact proposal discussed and rejected awhile back? > > > > > > i don't remember precisely. > > > i have seen it on simon's wiki page and it is something which would > > > have been useful in some cases in the past. > > I still support it. Regrettably, many SQL developers introduce product > joins and other unintentional errors. Why let problem queries through? > Security-wise they're great Denial of Service attacks, bringing the > server to its knees better than most ways I know, in conjunction with a > nice hefty work_mem setting. 27 table product joins: memory, CPU, I/O > and diskspace resources used all in a simple killer query. > ISTR that what ended up killing the enthusiasm for this was that most people realized that this GUC was just a poor tool to take a stab at solving other problems (ie. rate limiting cpu for queries). > If anybody thinks costs are inaccurate, don't use it. Or better still > improve the cost models. It isn't any harder or easier to find a useful > value than it is to use statement_timeout. What's the difference between > picking an arbitrary time and an arbitrary cost? You need to alter the > value according to people's complaints in both cases. > I think the original argument for statement_timeout was that long running queries were known to cause havoc wrt vacuum strategies (remember, that one has been in the back end a long time). ISTR some recent threads on -hackers questioning whether statement_timeout should be eliminated itself. > > I think a variation on this could be very useful in development and test > > environments. Suppose it raised a warning or notice if the cost was over > > the limit. Then one could set a limit of a few million on the development > > and test servers and developers would at least have a clue that they > > needed to look at explain for that query. As it is now, one can exhort > > them to run explain, but it has no effect. Instead we later see queries > > killed by a 24 hour timeout with estimated costs ranging from "until they > > unplug the machine and dump it" to "until the sun turns into a red > > giant". > > Great argument. So that's 4 in favour at least. > Not such a great argument. Cost models on development servers can and often are quite different from those on production, so you might be putting an artificial limit on top of your developers. > A compromise would be to have log_min_statement_cost (or > warn_min_statement_cost) which will at least help find these problems in > testing before we put things live, but that still won't help with > production issues. > > Another alternative would be to have a plugin that can examine the plan > immediately after planner executes, so you can implement this yourself, > plus some other possibilities. > I still think it is worth revisiting what problems people are trying to solve, and see if there are better tools they can be given to solve them.
Barring that, I suppose a crude solution is better than nothing, though I fear people might point at the crude solution as a good enough solution to justify not working on better solutions. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Josh Berkus wrote: > Tom, > > >> Wasn't this exact proposal discussed and rejected awhile back? >> > > We rejected Greenplum's much more invasive resource manager, because it > created a large performance penalty on small queries whether or not it was > turned on. However, I don't remember any rejection of an idea as simple > as a cost limit rejection. > > This would, IMHO, be very useful for production instances of PostgreSQL. > The penalty for mis-rejection of a poorly costed query is much lower than > the penalty for having a bad query eat all your CPU. > > Greenplum introduced a way of creating a cost "threshold", a bit like the way Simon was going to do "shared" work_mem. It did 2 things: 1/ Counted the cost of an about-to-be-run query against the threshold, and made the query wait if it would exhaust it 2/ Aborted the query if its cost was greater than the threshold Initially there was quite a noticeable performance penalty with it enabled - but as the guy working on it (me) redid bits and pieces, the penalty decreased massively. Note that in all cases, disabling the feature meant there was no penalty. The latest variant of the code is around in the Bizgres repository (src/backend/utils/resscheduler I think) - some bits might be worth looking at! Best wishes Mark P.S.: I'm not working for Greenplum now.
On Sun, 2008-08-03 at 22:57 -0400, Robert Treat wrote: > I still think it is worth revisiting what problems people are trying > to solve, and see if there are better tools they can be given to solve > them. Barring that, I suppose a crude solution is better than > nothing, though I fear people might point at the crude solution as a > good enough solution to justify not working on better solutions. I advocate solutions to the problems of users I've worked with. My preference is to help people in the next release, then improve from there. We need to work with what we have. In this case, an existing solution has been found. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
On Sun, Aug 03, 2008 at 10:57:55PM -0400, Robert Treat wrote: > > ISTR that what ended up killing the enthusiasm for this was that most people > realized that this GUC was just a poor tool to take a stab at solving other > problems (ie. rate limiting cpu for queries). I'm not concerned with that, I want developers to have feedback on costs in a way that is obvious. > > > I think a variation on this could be very useful in development and test > > > environments. Suppose it raised a warning or notice if the cost was over > > > the limit. Then one could set a limit of a few million on the development > > > and test servers and developers would at least have a clue that they > > > needed to look at explain for that query. As it is now, one can exhort > > > them to run explain, but it has no effect. Instead we later see queries > > > killed by a 24 hour timeout with estimated costs ranging from "until they > > > unplug the machine and dump it" to "until the sun turns into a red > > > giant". > > > > Great argument. So that's 4 in favour at least. > > > > Not such a great argument. Cost models on development servers can and often > are quite different from those on production, so you might be putting an > artifical limit on top of your developers. We load the production dumps into our dev environment, which are the same hardware spec, so the costs should be identical. > I still think it is worth revisiting what problems people are trying to solve, > and see if there are better tools they can be given to solve them. Barring > that, I suppose a crude solution is better than nothing, though I fear people > might point at the crude solution as a good enough solution to justify not > working on better solutions. Alerting developers and QA to potentially costly queries would help solve some of the problems we are trying to solve. Better tools are welcome, an argument that the good is the enemy of the best so we should be content with nothing is not. -dg -- David Gould daveg@sonic.net 510 536 1443 510 282 0869 If simplicity worked, the world would be overrun with insects.
"Josh Berkus" <josh@agliodbs.com> writes: > Tom, > >> Wasn't this exact proposal discussed and rejected awhile back? > > We rejected Greenplum's much more invasive resource manager, because it > created a large performance penalty on small queries whether or not it was > turned on. However, I don't remember any rejection of an idea as simple > as a cost limit rejection. The idea's certainly come up before. It probably received the usual non-committal cold shoulder rather than an outright "rejection". > This would, IMHO, be very useful for production instances of PostgreSQL. > The penalty for mis-rejection of a poorly costed query is much lower than > the penalty for having a bad query eat all your CPU. Well that's going to depend on the application.... But I suppose there's nothing wrong with having options which aren't always a good idea to use. The real question I guess is whether there's ever a situation where it would be a good idea to use this. I'm not 100% sure. What I would probably use myself is an option to print a warning before starting the query. That would be handy for interactive sessions so you would be able to hit C-c instead of waiting for several minutes and then wondering whether you got the query wrong. I wonder if it would be useful to have a flag on some GUC options to make them not globally settable. That is, for example, you could set enable_seqscan in an individual session but not in postgres.conf. Or perhaps again just print a warning that it's not recommended as a global configuration. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
On Monday 04 August 2008 03:50:40 daveg wrote: > On Sun, Aug 03, 2008 at 10:57:55PM -0400, Robert Treat wrote: > > ISTR that what ended up killing the enthusiasm for this was that most > > people realized that this GUC was just a poor tool to take a stab at > > solving other problems (ie. rate limiting cpu for queries). > > I'm not concerned with that, I want developers to have feed back on costs > in a way that is obvious. > That was one of the other use cases that was pushed forward in the past. > > > > I think a variation on this could be very useful in development and > > > > test environments. Suppose it raised a warning or notice if the cost > > > > was over the limit. Then one could set a limit of a few million on > > > > the development and test servers and developers would at least have a > > > > clue that they needed to look at explain for that query. As it is > > > > now, one can exhort them to run explain, but it has no effect. > > > > Instead we later see queries killed by a 24 hour timeout with > > > > estimated costs ranging from "until they unplug the machine and dump > > > > it" to "until the sun turns into a red giant". > > > > > > Great argument. So that's 4 in favour at least. > > > > Not such a great argument. Cost models on development servers can and > > often are quite different from those on production, so you might be > > putting an artifical limit on top of your developers. > > We load the production dumps into our dev environment, which are the same > hardware spec, so the costs should be identical. > That's great for you, I am talking in the scope of a general solution. (Note I'd also bet that even given the same hardware, different production loads can produce different relative mappings of cost vs. performance, but whatever) > > I still think it is worth revisiting what problems people are trying to > > solve, and see if there are better tools they can be given to solve them. > > Barring that, I suppose a crude solution is better than nothing, though > > I fear people might point at the crude solution as a good enough solution > > to justify not working on better solutions. > > Alerting developers and QA to potentially costly queries would help solve > some of the probems we are trying to solve. Better tools are welcome, an > argument that the good is the enemy of the best so we should be content > with nothing is not. > And you'll note, I specifically said that a crude tool is better than nothing. But you're completely ignoring that a crude tool can often end up as a foot-gun once released into the wild. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Greg, > Well that's going to depend on the application.... But I suppose there's > nothing wrong with having options which aren't always a good idea to use. The > real question I guess is whether there's ever a situation where it would be a > good idea to use this. I'm not 100% sure. I can think of *lots*. Primarily, simple web applications, where queries are never supposed to take more than 50ms. If a query turns up with an estimated cost of 10000000000, then you know something's wrong; in the statistics if not in the query. In either case, that query has a good chance of dragging down the whole system. In such a production application, it is better to have false positives and reject otherwise-OK queries because their costing is wrong, than to let a single cartesian join bog down an application serving 5000 simultaneous users. Further, with a SQL error, this would allow the query rejection to be handled in a user-friendly way from the UI ("Search too complex. Try changing search terms.") rather than timing out, which is very difficult to handle well. The usefulness of this feature for interactive sessions is limited-to-nonexistent. It's for production applications. --Josh Berkus
On Mon, 4 Aug 2008, daveg wrote: > On Sun, Aug 03, 2008 at 10:57:55PM -0400, Robert Treat wrote: >> Not such a great argument. Cost models on development servers can and often >> are quite different from those on production, so you might be putting an >> artifical limit on top of your developers. > > We load the production dumps into our dev environment, which are the same > hardware spec, so the costs should be identical. Not identical, just close. ANALYZE samples data from your table randomly. The statistics used to compute the costs will therefore be slightly different on the two servers even if the data is the same. The problem of discovering one plan on production and another on development is not quite that easy to remove. Ultimately, if your developers aren't thorough enough to do things like look at EXPLAIN plans enough to discover things that are just bad, I just chuckle at your thinking that putting a single limiter on their bad behavior will somehow magically make that better. Anyway, if your production server is small enough that you can afford to have another one just like it for the developers to work on, that's great. Robert's point is that many installs don't work like that. The development teams in lots of places only get a subset of the production data because it's too large to deploy on anything but a big server, which often is hard to cost justify buying just for development purposes. I like the concept of a cost limit, but I'm a bit horrified by the thought of it being exposed simply through the internal cost numbers because they are so arbitrary. One of the endless projects I think about but never start coding is to write something that measures the things the planner cost constants estimate on a particular machine, so that all those numbers actually can be tied to some real-world time measure. If you did that, you'd actually have a shot at accomplishing the real goal here, making statement_cost_limit cut off statements expected to take longer than statement_timeout before they even get started. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
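To make that mapping concrete with invented numbers: if a query whose EXPLAIN total cost is 100000 reliably runs in about 2 seconds on a given box, one cost unit corresponds to roughly 20 microseconds there, so a 5 minute statement_timeout would translate to a statement_cost_limit in the neighbourhood of 15 million. The point of such a calibration step is that the limit could then be expressed in terms people already reason about (time) rather than raw cost units.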
On Mon, Aug 04, 2008 at 03:09:34PM -0400, Greg Smith wrote: > On Mon, 4 Aug 2008, daveg wrote: > >We load the production dumps into our dev environment, which are the same > >hardware spec, so the costs should be identical. > > Not identical, just close. ANALYZE samples data from your table randomly. > The statistics used to compute the costs will therefore be slightly > different on the two servers even if the data is the same. The problem of > discovering one plan on production and another on development is not quite > that easy to remove. Ultimately, if your developers aren't thorough > enough to do thinks like look at EXPLAIN plans enough to discover things > that are just bad, I just chuckle at your thinking that putting a single > limiter on their bad behavior will somehow magically make that better. Not all developers can be persuaded to run explain on every change. However, many will investigate a new message. I'm only hoping to try to focus their attention toward possible problem queries. > Anyway, if your production server is small enough that you can afford to > have another one just like it for the developers to work on, that's great. > Robert's point is that many installs don't work like that. The > development teams in lots of places only get a subset of the production > data because it's too large to deploy on anything but a big server, which > often is hard to cost justify buying just for development purposes. Not to get into a size war ;-), but the production environment I'd like this feature for is over 40 32GB 16 scsi drive quadcore boxes. These are dedicated to postgresql and run one or just a few databases. There are also a bunch of client boxes that we will not speak of. The staging and test environments are similar hardware but have only a subset of the databases copied to them. There are probably more than a dozen DB hosts for that. > I like the concept of a cost limit, but I'm a bit horrified by the thought > of it being exposed simply through the internal cost numbers because they > are so arbitrary. One of the endless projects I think about but never > start coding is to write something that measures the things the planner > cost constants estimate on a particular machine, so that all those numbers > actually can be tied to some real-world time measure. If you did that, > you'd actually have a shot at accomplishing the real goal here, making > statement_cost_limit cut off statements expected to take longer than > statement_timeout before they even get started. That is a nice idea. Possibly it could be a utility like the fsync tester. But planner estimates are never going to be all that accurate even with solid cost numbers because for some classes of queries, particularly those with many joins, the stats can be good at each level but the error accumulates exponentially. Which is why I think a warning is appropriate instead of an error. Even a notice in the logs would be useful. -dg -- David Gould daveg@sonic.net 510 536 1443 510 282 0869 If simplicity worked, the world would be overrun with insects.
On Mon, Aug 04, 2008 at 02:35:07PM -0400, Robert Treat wrote: > On Monday 04 August 2008 03:50:40 daveg wrote: > > That's great for you, I am talking in the scope of a general solution. (Note > I'd also bet that even given the same hardware, different production loads > can produce different relative mappings of cost vs. performance, but > whatever) Even on different hardware it would still likely warn of mistakes like products due to missing join conditions etc. > > > I still think it is worth revisiting what problems people are trying to > > > solve, and see if there are better tools they can be given to solve them. > > > Barring that, I suppose a crude solution is better than nothing, though > > > I fear people might point at the crude solution as a good enough solution > > > to justify not working on better solutions. > > > > Alerting developers and QA to potentially costly queries would help solve > > some of the probems we are trying to solve. Better tools are welcome, an > > argument that the good is the enemy of the best so we should be content > > with nothing is not. > > And you'll note, I specifically said that a crude tool is better than nothing. I realized somewhat after I sent the above that it might have sounded a bit snippy. I hope I have not offended. > But your completely ignoring that a crude tool can often end-up as a foot-gun > once relased into the wild. I'm suggesting a warning, or even just a notice into the logs, I don't see the footgun. What am I missing? Regards -dg -- David Gould daveg@sonic.net 510 536 1443 510 282 0869 If simplicity worked, the world would be overrun with insects.
On Mon, 2008-08-04 at 14:35 -0400, Robert Treat wrote: > On Monday 04 August 2008 03:50:40 daveg wrote: ... > > > I still think it is worth revisiting what problems people are trying to > > > solve, and see if there are better tools they can be given to solve them. > > > Barring that, I suppose a crude solution is better than nothing, though > > > I fear people might point at the crude solution as a good enough solution > > > to justify not working on better solutions. > > > > Alerting developers and QA to potentially costly queries would help solve > > some of the probems we are trying to solve. Better tools are welcome, an > > argument that the good is the enemy of the best so we should be content > > with nothing is not. > > > > And you'll note, I specifically said that a crude tool is better than nothing. > But your completely ignoring that a crude tool can often end-up as a foot-gun > once relased into the wild. On the other other hand, _anything_ can end up as a foot-gun in the hands of ingenious users. I was once told about a company that claimed to have produced a positively fool-proof lawn-mower, only to find out that a university professor had tried to use it to trim a hedge and cut off his toes. ------------- Hannu
"Josh Berkus" <josh@agliodbs.com> writes: > In such a production application, it is better to have false positives and > reject otherwise-OK queries becuase their costing is wrong, than to let a > single cartesian join bog down an application serving 5000 simultaneous users. > Further, with a SQL error, this would allow the query rejection to be handled > in a user-friendly way from the UI ("Search too complex. Try changing search > terms.") rather than timing out, which is very difficult to handle well. > > The usefulness of this feature for interactive sessions is > limited-to-nonexistant. It's for production applications. Wow. I couldn't disagree more. For such an application this would be a major foot-gun which would give a false sense of security simultaneously causing random outages and not providing even the protection you're counting on. It would be quite likely to miss some cartesian joins and allow problematic queries through randomly and block other perfectly legitimate queries. I's no substitute for writing your search engine query generator to actually check that it has enough constraints to avoid any disallowed cartesion joins. That people might think it's reliable enough to use for such applications is my major concern and if my guess is right, Tom's as well. I suspect you may have just sunk any chance of getting him on-side. Where I see it useful is a) during development when it might help catch erroneous queries as a kind of sql-lint. and b) when running ad-hoc DBA queries where it might let the DBA catch the error before letting it run for a while. I'm sure I'm not the only DBA who let a query run for 5 minutes before wondering if it should really be taking that long. I would be much more comfortable if it produced a warning, not an error. And much more if we implemented my previous thought of having some settings which generate warnings if they're set at startup saying that's not recommended. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
On Mon, Aug 04, 2008 at 11:59:03AM -0700, Josh Berkus wrote: > Greg, > > >Well that's going to depend on the application.... But I suppose there's > >nothing wrong with having options which aren't always a good idea to use. > >The > >real question I guess is whether there's ever a situation where it would > >be a > >good idea to use this. I'm not 100% sure. > > I can think of *lots*. Primarily, simple web applications, where > queries are never supposed to take more than 50ms. If a query turns up > with an estimated cost of 10000000000, then you know something's wrong; > in the statistics if not in the query. In either case, that query has a > good chance of dragging down the whole system. > > In such a production application, it is better to have false positives > and reject otherwise-OK queries becuase their costing is wrong, than to > let a single cartesian join bog down an application serving 5000 > simultaneous users. Further, with a SQL error, this would allow the > query rejection to be handled in a user-friendly way from the UI > ("Search too complex. Try changing search terms.") rather than timing > out, which is very difficult to handle well. > > The usefulness of this feature for interactive sessions is > limited-to-nonexistant. It's for production applications. Ok, that is a different use case where an error seems very useful. What about slightly extending the proposal to have the severity of exceeding the limit configurable too. Something like: costestimate_limit = 1000000000 # default 0 to ignore limit costestimate_limit_severity = error # debug, notice,warning, error -dg -- David Gould daveg@sonic.net 510 536 1443 510 282 0869 If simplicity worked, the world would be overrun with insects.
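If both knobs existed, the reporting side reduces to picking an elog level from the severity setting. A small sketch of that shape, with illustrative names and the GUC plumbing left out:

    #include "postgres.h"
    #include "nodes/plannodes.h"

    /* Illustrative variables; a real patch would expose these as GUCs. */
    static double costestimate_limit = 0.0;          /* 0 = ignore */
    static int    costestimate_limit_elevel = ERROR; /* DEBUG1, NOTICE, WARNING or ERROR */

    static void
    report_costly_plan(PlannedStmt *stmt)
    {
        if (costestimate_limit > 0.0 &&
            stmt->planTree->total_cost > costestimate_limit)
            ereport(costestimate_limit_elevel,
                    (errmsg("estimated plan cost %.0f exceeds costestimate_limit %.0f",
                            stmt->planTree->total_cost, costestimate_limit),
                     errhint("Check the statement with EXPLAIN or raise costestimate_limit.")));
    }

With any level below ERROR the statement still runs after the message is emitted, which matches the development/test use case upthread; ERROR gives the hard rejection Josh describes for production.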
>>> Hannu Krosing <hannu@krosing.net> wrote: > I was once told about a company, who claimed to have produced a > positively fool-proof lawn-mower, only to find out, that a university > professor had tried to use it to trim a hedge and cut off his toes. Odd. Seriously, about 45 years ago I lived next door to a university botany professor who cut off his fingertips that way. I wonder if professors are more prone to this or whether the story got mangled over time. Perhaps software should have special protection for professors.... -Kevin
On Mon, 2008-08-04 at 14:35 -0400, Robert Treat wrote: > On Monday 04 August 2008 03:50:40 daveg wrote: > And you'll note, I specifically said that a crude tool is better than > nothing. But your completely ignoring that a crude tool can often > end-up as a foot-gun once relased into the wild. The proposal is for an option with no consequences when turned off. We respect your right not to use it. What is the danger exactly? If we cancel stupid queries before people run them, everybody is a winner. Even the person who submitted the stupid query, since they find out faster. Sure, it's an estimate, but it's got to be based upon an estimate if it acts *before* it runs. And surely there is no better estimate of the cost than the plan cost? It doesn't stop anyone from putting in resource limits later. We'll have to do something with enable_seqscan, BTW, chaps. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Greg, > For such an application this would be a major foot-gun which would give > a false sense of security simultaneously causing random outages and not > providing even the protection you're counting on. Hmmm. That sounds like a call for some testing. While our cost estimation has some issues, I don't think it's as unreliable as all that. And it's easy enough to document the limitations. If it's 80% accurate, then it's fixing more problems than it causes. If it's 30% accurate, then obviously it's a bad idea. Personally, I don't see much use for this interactively, because an experienced DBA can easily enough run an EXPLAIN before running the query. I usually do, on production systems. -- --Josh Josh Berkus PostgreSQL San Francisco
On Monday 04 August 2008 16:49:43 Simon Riggs wrote: > On Mon, 2008-08-04 at 14:35 -0400, Robert Treat wrote: > > On Monday 04 August 2008 03:50:40 daveg wrote: > > > > And you'll note, I specifically said that a crude tool is better than > > nothing. But your completely ignoring that a crude tool can often > > end-up as a foot-gun once relased into the wild. > > The proposal is for an option with no consequences when turned off. We > respect your right not to use it. What is the danger exactly? > All of the proposals I have seen for adding query hints would also have no consequence if not used, but no one seems to care about that argument. :-) -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
On Monday 04 August 2008 15:56:25 daveg wrote: > On Mon, Aug 04, 2008 at 02:35:07PM -0400, Robert Treat wrote: > > On Monday 04 August 2008 03:50:40 daveg wrote: > > > > That's great for you, I am talking in the scope of a general solution. > > (Note I'd also bet that even given the same hardware, different > > production loads can produce different relative mappings of cost vs. > > performance, but whatever) > > Even on different hardware it would still likely warn of mistakes like > products due to missing join conditions etc. > See, this is what we ended up talking about before. Someone will say "I'd like to prevent my devs from accidentally doing queries with cartesian products" and they will use this to do it... but that will only work in some cases, so it becomes a poor tool to solve a different problem. BTW, what I really love about statement costs, is that they aren't even reliable on the same machine with the same data. I have seen query plans which run on the same data on the same machine where the resultant query runtime can vary from 2 hours to 5 hours, depending on how much other concurrent traffic is on the machine. Awesome eh? > > > > I still think it is worth revisiting what problems people are trying > > > > to solve, and see if there are better tools they can be given to > > > > solve them. Barring that, I suppose a crude solution is better than > > > > nothing, though I fear people might point at the crude solution as a > > > > good enough solution to justify not working on better solutions. > > > > > > Alerting developers and QA to potentially costly queries would help > > > solve some of the probems we are trying to solve. Better tools are > > > welcome, an argument that the good is the enemy of the best so we > > > should be content with nothing is not. > > > > And you'll note, I specifically said that a crude tool is better than > > nothing. > > I released somewhat after I sent the above that it might have sounded a bit > snippy. I hope I have not offended. > > > But your completely ignoring that a crude tool can often end-up as a > > foot-gun once relased into the wild. > > I'm suggesting a warning, or even just a notice into the logs, I don't see > the footgun. What am I missing? > The footgun in my mind is that people will think this solves a number of problems even though it doesnt solve them well. However, the footgun for you might be that the current proposal will actually abort the query, not emit a warning (not sure if that changes your opinion of it). -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
On Mon, Aug 04, 2008 at 05:19:50PM -0400, Robert Treat wrote: > See, this is what we ended up talking about before. Someone will say "I'd like > to prevent my devs from accidentally doing queries with cartesian products" > and they will use this to do it... but that will only work in some cases, so > it becomes a poor tool to solve a different problem. > > BTW, what I really love about statement costs, is that they aren't even > reliable on the same machine with the same data. I have seen query plans > which run on the same data on the same machine where the resultant query > runtime can vary from 2 hours to 5 hours, depending on how much other > concurrent traffic is on the machine. Awesome eh? Sure, I don't think anyone believes that costs are precise. But the case that is interesting is 2 hours versus years and years. > The footgun in my mind is that people will think this solves a number of > problems even though it doesnt solve them well. However, the footgun for yo I suspect that a good solution to this problem is impossible as it is more or less the halting problem. So I'm willing to accept a poor solution based on costs and then hope we improve the cost model. -dg -- David Gould daveg@sonic.net 510 536 1443 510 282 0869 If simplicity worked, the world would be overrun with insects.
Simon Riggs wrote: > On Sun, 2008-08-03 at 22:09 +0200, Hans-Jürgen Schönig wrote: > >>> Another alternative would be to have a plugin that can examine the >>> plan >>> immediately after planner executes, so you can implement this >>> yourself, >>> plus some other possibilities. >>> > >> this would be really fancy. >> how could a plugin like that look like? > > Hmm...thinks: exactly like the existing planner_hook(). > > So, rewrite this as a planner hook and submit as a contrib module. Now that's a good idea! I personally don't think this feature is a good idea, for all the reasons others have mentioned, but as a pgfoundry project it can be downloaded by those who want it, and perhaps prove its usefulness for others as well. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Aug 3, 2008, at 9:57 PM, Robert Treat wrote: >>> I think a variation on this could be very useful in development >>> and test >>> environments. Suppose it raised a warning or notice if the cost >>> was over >>> the limit. Then one could set a limit of a few million on the >>> development >>> and test servers and developers would at least have a clue that they >>> needed to look at explain for that query. As it is now, one can >>> exhort >>> them to run explain, but it has no effect. Instead we later see >>> queries >>> killed by a 24 hour timeout with estimated costs ranging from >>> "until they >>> unplug the machine and dump it" to "until the sun turns into a red >>> giant". >> >> Great argument. So that's 4 in favour at least. >> > > Not such a great argument. Cost models on development servers can > and often > are quite different from those on production, so you might be > putting an > artifical limit on top of your developers. We should have an approved API for dumping stats from one database and loading them into another. pg_dump needs this as well, IMO. -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
On Aug 4, 2008, at 3:49 PM, Simon Riggs wrote: > On Mon, 2008-08-04 at 14:35 -0400, Robert Treat wrote: >> On Monday 04 August 2008 03:50:40 daveg wrote: > >> And you'll note, I specifically said that a crude tool is better than >> nothing. But your completely ignoring that a crude tool can often >> end-up as a foot-gun once relased into the wild. > > The proposal is for an option with no consequences when turned off. We > respect your right not to use it. What is the danger exactly? > > If we cancel stupid queries before people run them, everybody is a > winner. Even the person who submitted the stupid query, since they > find > out faster. I could *really* use this. Unfortunately, we have a lot of folks writing some horrible queries and killing our slave databases. I'd *love* to be able to throw out any queries that had insane limits... > We'll have to do something with enable_seqscan, BTW, chaps. My thought would be to back the cost penalty out if we end up with a seqscan anyway. Speaking of which, there is a semi-related issue... if you have a large enough table the fixed-size cost we add to a seqscan won't be enough to make an alternative plan come out cheaper. Instead of adding a fixed cost, I think we should multiply by the estimated number of rows. -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
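To make the last point concrete: with enable_seqscan off, the planner today adds a single flat disable_cost penalty to the scan, so a sufficiently large table can still leave the penalised seqscan as the cheapest path. The suggestion is to let the penalty grow with the relation size, roughly along these lines (a hedged sketch of the idea, not actual source; details vary by version):

    #include "postgres.h"
    #include "optimizer/cost.h"      /* enable_seqscan, disable_cost */

    /*
     * Sketch of the suggested behaviour: a disabled-seqscan penalty that
     * scales with the number of tuples scanned, so that very large tables
     * cannot swamp a flat penalty and win anyway.
     */
    static Cost
    seqscan_disable_penalty(double tuples)
    {
        if (enable_seqscan)
            return 0;
        return disable_cost * Max(tuples, 1.0);
    }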
On Aug 4, 2008, at 1:04 PM, daveg wrote: > Ok, that is a different use case where an error seems very useful. > What > about slightly extending the proposal to have the severity of > exceeding > the limit configurable too. Something like: > > costestimate_limit = 1000000000 # default 0 to ignore limit > costestimate_limit_severity = error # debug, notice, warning, > error I very much like this idea, and I would definitely use something like this on our production oltp app. We had a case recently where a query joining two large tables was very fast 99.9% of the time (i.e., a few ms), but for particular, rare key combinations the planner would make a poor choice turning into a multi-minute monster. It ran longer than the web server timeout, and the client was programmed to retry on error, essentially causing a database DoS. The monster version of the plan had an outrageous cost estimate, many orders of magnitude higher than any regular app query, and would be easy to peg using even a crudely chosen limit value. The problem was first mitigated by setting a query timeout a little longer than the web server timeout (since the query results are discarded for anything running longer), but even this was not a solution, since the client would retry on timeout, still keeping the db too busy. The real solution was to not do the query, but it would have been better to identify this via ERRORs in the logs than by the database becoming saturated in the middle of the day. For our application it is far better for an expensive query to be rejected outright than to attempt to run it in vain. Just thought I'd throw that out as anecdotal support. -Casey
Josh Berkus wrote: > Greg, > > > Well that's going to depend on the application.... But I suppose there's > > nothing wrong with having options which aren't always a good idea to use. The > > real question I guess is whether there's ever a situation where it would be a > > good idea to use this. I'm not 100% sure. > > I can think of *lots*. Primarily, simple web applications, where > queries are never supposed to take more than 50ms. If a query turns up > with an estimated cost of 10000000000, then you know something's wrong; > in the statistics if not in the query. In either case, that query has a > good chance of dragging down the whole system. > > In such a production application, it is better to have false positives > and reject otherwise-OK queries becuase their costing is wrong, than to > let a single cartesian join bog down an application serving 5000 > simultaneous users. Further, with a SQL error, this would allow the How about a simpler approach that throws an error or warning for cartesian products? That seems fool-proof. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > Josh Berkus wrote: >> ...simple web applications, where >> queries are never supposed to take more than 50ms. If a query turns up >> with an estimated cost of 10000000000, then you know something's wrong; >> ... > > How about a simpler approach that throws an error or warning for > cartesian products? That seems fool-proof. Seems less fool-proof to me. Sometimes cartesian products produce plans that run 200 times faster than plans that don't use the cartesian product. The first link below shows a cartesian join that took 1.1 seconds (within the range of OK for some web apps), while plans for the same query that don't use one took 200 seconds. http://archives.postgresql.org/pgsql-performance/2008-03/msg00391.php http://archives.postgresql.org/pgsql-performance/2007-12/msg00090.php http://archives.postgresql.org/pgsql-performance/2008-03/msg00361.php
Ron Mayer wrote: > Bruce Momjian wrote: > > Josh Berkus wrote: > >> ...simple web applications, where > >> queries are never supposed to take more than 50ms. If a query turns up > >> with an estimated cost of 10000000000, then you know something's wrong; > >> ... > > > > How about a simpler approach that throws an error or warning for > > cartesian products? That seems fool-proof. > > Seems less fool-proof to me. > > Sometimes cartesian products produce plans that run 200 times > faster than plans that don't use the cartesian product. > > The first link below shows a cartesian join that took 1.1 > seconds (within the range of OK for some web apps), while > plans for the same query that don't use one took 200 seconds. > > http://archives.postgresql.org/pgsql-performance/2008-03/msg00391.php > http://archives.postgresql.org/pgsql-performance/2007-12/msg00090.php > http://archives.postgresql.org/pgsql-performance/2008-03/msg00361.php My point is that people should _know_ they are using a cartesian product, and a warning would do that for users who have no need for a cartesian product and want to be warned about a possible error. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
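To show the shape of such a warning, and why it is trickier than it sounds, here is a hedged sketch that walks a finished plan tree and warns on join nodes carrying no join clause at all. The function name is illustrative, and as the comments note a real version would have to handle cases like a nestloop whose quals were pushed down into a parameterized inner index scan.

    #include "postgres.h"
    #include "nodes/plannodes.h"

    /* Walk the plan and warn about joins with no join clause (possible cartesian products). */
    static void
    warn_on_unqualified_joins(Plan *plan)
    {
        bool    has_clause = true;

        if (plan == NULL)
            return;

        if (IsA(plan, HashJoin))
            has_clause = (((HashJoin *) plan)->hashclauses != NIL ||
                          ((Join *) plan)->joinqual != NIL);
        else if (IsA(plan, MergeJoin))
            has_clause = (((MergeJoin *) plan)->mergeclauses != NIL ||
                          ((Join *) plan)->joinqual != NIL);
        else if (IsA(plan, NestLoop))
        {
            /*
             * Caution: a nestloop driven by a parameterized inner index scan
             * also has an empty joinqual, yet is not a cartesian product.
             */
            has_clause = (((Join *) plan)->joinqual != NIL);
        }

        if (!has_clause)
            ereport(WARNING,
                    (errmsg("plan contains a join with no join condition (possible cartesian product)")));

        warn_on_unqualified_joins(plan->lefttree);
        warn_on_unqualified_joins(plan->righttree);
    }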
Bruce Momjian <bruce@momjian.us> writes: > Ron Mayer wrote: >> Seems less fool-proof to me. > My point is that people should _know_ they are using a cartesian > product, and a warning would do that for users who have no need for a > cartesian product and want to be warned about a possible error. There are quite a lot of standard applications where small cartesian products make sense --- star schemas are the traditional example. I recall some discussions awhile back about how to persuade the planner to consider such cases, in fact. Right now it'll generally use a series of hash joins where maybe just one would be better. I concur with Ron that the only merit of this proposal is that it's (relatively) simple to implement. regards, tom lane
> My point is that people should _know_ they are using a cartesian > product, and a warning would do that for users who have no need for a > cartesian product and want to be warned about a possible error. I think Cartesian products are a red herring. Cartesian products are primarily bad if they generate bad performance, and bad performance can be generated without Cartesian products. I've certainly written them intentionally, from time to time. The bigger issue is - if you start critiquing people's query-writing, where will you stop? SELECT * FROM foo WHERE a = NULL WARNING: a = NULL is always false. Did you mean a IS NULL? SELECT * FROM foo LEFT JOIN bar ON foo.a = bar.a LEFT JOIN baz ON foo.b = bar.b AND foo.c = baz.c WARNING: Maybe you meant foo.b = baz.b instead of foo.b = bar.b? I'm sure there are a hundred others - these just happen to be a few of my old mistakes (the first one was generated by some buggy Perl code... the second by poor cut-and-paste skills). In any event, I don't think it's the job of the database to argue with you about whether you really want the data you asked for - its job is just to get you that data. Of course if doing so will take longer than the amount of time remaining before the heat death of the universe, a warning might be appropriate. ...Robert
Bruce, > How about a simpler approach that throws an error or warning for > cartesian products? That seems fool-proof. Well, throwing a warning is pretty useless for an unattended application. Also, it's perfectly possible to write queries which will never complete without a cartesian join. Basically, *unless* someone has a plan to integrate the greenplum resource management stuff sometime soon, I think we should take Hans' idea (pending patch quality, of course). There's an argument to be made that even if we took the greenplum resource controller, statement_cost_limit would be much simpler and worth having for the small-effort simple-application users. FYI, the concept behind the Greenplum RM is similar, except that it maintains a total "pool" of query costs on a per-role basis. -- --Josh Josh Berkus PostgreSQL San Francisco
this entire thing is not about cartesian products at all. it is about kicking out "expensive" queries before they even start to eat up tons of CPU. imagine a user asking for "give me all phone calls in the US within the past 10 years". you could kill the guy instantly because you know that this would take ages. in addition to that you know that in an OLTP context everything which is expected to take longer than X cannot be useful anyway. this has nothing to do with cartesian products or other bad things you can do in SQL. it is just a simple and heuristic check. many thanks, hans > My point is that people should _know_ they are using a cartesian > product, and a warning would do that for users who have no need for a > cartesian product and want to be warned about a possible error. > > -- > Bruce Momjian <bruce@momjian.us> http://momjian.us > EnterpriseDB http://enterprisedb.com > > + If your life is a hard drive, Christ can be your backup. + -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: www.postgresql-support.de