Re: BUG #5611: SQL Function STABLE promoting to VOLATILE - Mailing list pgsql-bugs

From Robert Haas
Subject Re: BUG #5611: SQL Function STABLE promoting to VOLATILE
Date
Msg-id AANLkTinCUYn-_i17a_GrGWyyc8x-j9QE6LqmYnfE22RW@mail.gmail.com
Whole thread Raw
In response to Re: BUG #5611: SQL Function STABLE promoting to VOLATILE  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On Thu, Aug 12, 2010 at 11:15 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm not exactly following this. =A0My guess is that the breakeven point
>> is going to be pretty low because I think Param nodes are pretty
>> cheap.
>
> If you have any significant number of executions of the expression, then
> of course converting it to an initplan is a win. =A0What I'm worried about
> is where you have just a small number (like maybe only one, if it gets
> converted to an indexqual for instance). =A0Then the added expense of
> setting up the initplan isn't going to be repaid. =A0As long as the
> expression is pretty expensive, the percentage overhead of wrapping it
> in an initplan will probably be tolerable anyway, but I'm not sure where
> the threshold of "pretty expensive" is for that.

Oh, I see.  It seems a lot more elegant if we can start by determining
whether the expression is in a context where it's likely to be
executed more than once.

*thinks*

I wonder if we could make this decision mostly based on node types.
Maybe it's reasonable to say that if we're doing say, a Seq Scan, we
always assume it's going to get evaluated more than once.  Yeah, the
table could have <2 rows in it, but mostly it won't, and I don't know
that it's wise to optimize for that case even if we *think* that's
what the statistics are telling us.  Similarly for a Function Scan,
CTE Scan, Worktable Scan, etc.  We *could* look at the row estimates,
but I bet it isn't necessary.  On the other hand, for an index qual,
it's probably pointless.  I guess the hard case is a filter qual on an
index scan... it's not too clear to me what the right thing to do is
in that case.

>> Well, that's certainly a good place to start, but I was thinking that
>> it would be nice to optimize things like this:
>
>> SELECT * FROM foo WHERE somecolumn =3D somefunc();
>
>> This is OK if we choose a straight index scan, but it's probably very
>> much worth optimizing if we end up doing anything else. =A0If that's too
>> hairy, then maybe not, but it's not obvious to me why it would be
>> expensive.
>
> Because you have to look at every subexpression of every subexpression
> to figure out if it's (a) stable and (b) expensive. =A0Each of those
> checks is un-cheap in itself, and if you blindly apply them at every
> node of an expression tree the cost will be exponential.

I think you'd need some kind of expression tree walker that builds up
a list of maximal stable subexpression trees.  It would be nice to
figure this out at some point in the process where we already have to
check volatility anyway.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

pgsql-bugs by date:

Previous
From: "Mariusz Majer"
Date:
Subject: BUG #5614: Varchar column (with DEFAULT NULL) stores 'UL' value instead of null
Next
From: Tom Lane
Date:
Subject: Re: BUG #5614: Varchar column (with DEFAULT NULL) stores 'UL' value instead of null