Thread: How embarrassing: optimization of a one-shot query doesn't work

How embarrassing: optimization of a one-shot query doesn't work

From
Tom Lane
Date:
While testing the changes I was making to Pavel's EXECUTE USING patch
to ensure that parameter values were being provided to the planner,
it became painfully obvious that the planner wasn't actually *doing*
anything with them.  For example

    execute 'select count(*) from foo where x like $1' into c using $1;

wouldn't generate an indexscan when $1 was of the form 'prefix%'.

Some investigation showed that the planner is using the passed values
for estimation purposes, but not for any purposes where the value *must*
be correct (not only this LIKE-optimization, but constraint exclusion,
for instance).  The reason is that the parameter values are made
available to estimate_expression_value but not to eval_const_expressions.
This is a thinko in a cleanup patch I made early in 8.3 development:
http://archives.postgresql.org/pgsql-committers/2007-02/msg00352.php
I said to myself "eval_const_expressions doesn't need any context,
because a constant expression's value must be independent of context,
so I can avoid changing its API".  Silly me.

The implication of this is that 8.3 is significantly worse than 8.2
in optimizing unnamed statements in the extended-Query protocol;
a feature that JDBC, at least, relies on.

The fix is simple: add PlannerInfo to eval_const_expressions's
parameter list, as was done for estimate_expression_value.  I am
slightly hesitant to do this in a stable branch, since it would break
any third-party code that might be calling that function.  I doubt there
is currently any production-grade code doing so, but if anyone out there
is actively using those planner hooks we put into 8.3, it's conceivable
this would affect them.

Still, the performance regression here is bad enough that I think there
is little choice.  Comments/objections?

            regards, tom lane

Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> The fix is simple: add PlannerInfo to eval_const_expressions's
> parameter list, as was done for estimate_expression_value.  I am
> slightly hesitant to do this in a stable branch, since it would break
> any third-party code that might be calling that function.  I doubt there
> is currently any production-grade code doing so, but if anyone out there
> is actively using those planner hooks we put into 8.3, it's conceivable
> this would affect them.
>
> Still, the performance regression here is bad enough that I think there
> is little choice.  Comments/objections?

I agree that we should update stable to fix this performance regression,
given the gravity of it.  I also feel that we need to do so ASAP, to
minimize the risk of people being affected by it.  The longer we wait on
it the more likely someone will write something which is affected.

The constraint-exclusion not being used will be a huge impact and
problem for people moving partitioned-data with dynamic pl/pgsql
generation functions to 8.3.

Robert, I'm suprised you weren't affected, or have you just not noticed
yet due to your fancy new-to-you hardware? ;)

    Stephen

Attachment
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Still, the performance regression here is bad enough that I think there
>> is little choice.  Comments/objections?

> I agree that we should update stable to fix this performance regression,
> given the gravity of it.  I also feel that we need to do so ASAP, to
> minimize the risk of people being affected by it.  The longer we wait on
> it the more likely someone will write something which is affected.

In the normal course of events I'd expect that we'd put out 8.3.2
in a month or so.  I'm not quite convinced that this issue is worth
speeding the cycle all by itself.  We've done that for data-loss
issues but never before for a mere performance problem.

The main reason I wanted to make some noise about this is to find out
if anyone is actually trying to call eval_const_expressions (or
relation_excluded_by_constraints, which it turned out needed to change
also) from 8.3 add-on code.  If anyone squawks we could think about a
faster update ...

            regards, tom lane

Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

From
Andrew Dunstan
Date:

Tom Lane wrote:
> The main reason I wanted to make some noise about this is to find out
> if anyone is actually trying to call eval_const_expressions (or
> relation_excluded_by_constraints, which it turned out needed to change
> also) from 8.3 add-on code.  If anyone squawks we could think about a
> faster update ...
>
>
>

That assumes that someone working on using the planner hooks will read
this thread - which might be reasonable - I guess they number of likely
users is fairly small. But if they might miss it then it would be best
to fix it ASAP, ISTM.

cheers

andrew

Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> If anyone squawks we could think about a
>> faster update ...

> That assumes that someone working on using the planner hooks will read
> this thread - which might be reasonable - I guess they number of likely
> users is fairly small. But if they might miss it then it would be best
> to fix it ASAP, ISTM.

Well, it's not like we have never before changed internal APIs in a
minor update.  (There have been security-related cases where we gave
*zero* notice of such changes.)  Nor am I willing to surrender the
option to do so again.  If there's somebody out there with a real
problem with this change, they need to speak up.

            regards, tom lane

Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

From
"Guillaume Smet"
Date:
On Tue, Apr 1, 2008 at 7:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>  Well, it's not like we have never before changed internal APIs in a
>  minor update.  (There have been security-related cases where we gave
>  *zero* notice of such changes.)  Nor am I willing to surrender the
>  option to do so again.  If there's somebody out there with a real
>  problem with this change, they need to speak up.

+1 to fix it ASAP. A lot of people usually wait for 8.x.1 before
considering a migration and people are usually quite cautious with the
8.3 migration because of all the cast errors in the existing app.

Another question is how we can be sure it doesn't happen again. The
easiest way to test this is probably to have a JDBC test testing this
exact feature in the future benchfarm. Any comment?

--
Guillaume

"Guillaume Smet" <guillaume.smet@gmail.com> writes:
> Another question is how we can be sure it doesn't happen again. The
> easiest way to test this is probably to have a JDBC test testing this
> exact feature in the future benchfarm. Any comment?

Yeah, the lack of any formal testing of the extended-Query protocol
is a real problem.  I'm not sure of a good fix, but it bears some
thinking about.  Not only do we not have an automated way to notice
if we broke functionality, but we don't really notice for either
extended or basic protocol if we hurt performance.

            regards, tom lane

Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

From
"Guillaume Smet"
Date:
On Tue, Apr 1, 2008 at 8:06 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>  Yeah, the lack of any formal testing of the extended-Query protocol
>  is a real problem.  I'm not sure of a good fix, but it bears some
>  thinking about.  Not only do we not have an automated way to notice
>  if we broke functionality, but we don't really notice for either
>  extended or basic protocol if we hurt performance.

I just posted something to -hackers about the availability of boxes
for QA purposes. It doesn't solve the problem by itself though.

A good answer is probably to plan optional JDBC benchmarks in the
benchfarm design - not all people want to run Java on their boxes but
we have servers of our own to do so. Andrew?

--
Guillaume

Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

From
Michael Paesold
Date:
Am 01.04.2008 um 01:26 schrieb Tom Lane:
> While testing the changes I was making to Pavel's EXECUTE USING patch
> to ensure that parameter values were being provided to the planner,
> it became painfully obvious that the planner wasn't actually *doing*
> anything with them.  For example
>
>     execute 'select count(*) from foo where x like $1' into c using $1;
>
> wouldn't generate an indexscan when $1 was of the form 'prefix%'.
...
> The implication of this is that 8.3 is significantly worse than 8.2
> in optimizing unnamed statements in the extended-Query protocol;
> a feature that JDBC, at least, relies on.
>
> The fix is simple: add PlannerInfo to eval_const_expressions's
> parameter list, as was done for estimate_expression_value.  I am
> slightly hesitant to do this in a stable branch, since it would break
> any third-party code that might be calling that function.  I doubt
> there
> is currently any production-grade code doing so, but if anyone out
> there
> is actively using those planner hooks we put into 8.3, it's
> conceivable
> this would affect them.
>
> Still, the performance regression here is bad enough that I think
> there
> is little choice.  Comments/objections?

Yeah, please fix this performance regression in the 8.3 branch. This
would affect most of the JDBC applications out there, I think.

Best Regards
Michael Paesold

On 1-Apr-08, at 6:25 AM, Michael Paesold wrote:

>
> Am 01.04.2008 um 01:26 schrieb Tom Lane:
>> While testing the changes I was making to Pavel's EXECUTE USING patch
>> to ensure that parameter values were being provided to the planner,
>> it became painfully obvious that the planner wasn't actually *doing*
>> anything with them.  For example
>>
>>     execute 'select count(*) from foo where x like $1' into c using $1;
>>
>> wouldn't generate an indexscan when $1 was of the form 'prefix%'.
> ...
>> The implication of this is that 8.3 is significantly worse than 8.2
>> in optimizing unnamed statements in the extended-Query protocol;
>> a feature that JDBC, at least, relies on.
>>
>> The fix is simple: add PlannerInfo to eval_const_expressions's
>> parameter list, as was done for estimate_expression_value.  I am
>> slightly hesitant to do this in a stable branch, since it would break
>> any third-party code that might be calling that function.  I doubt
>> there
>> is currently any production-grade code doing so, but if anyone out
>> there
>> is actively using those planner hooks we put into 8.3, it's
>> conceivable
>> this would affect them.
>>
>> Still, the performance regression here is bad enough that I think
>> there
>> is little choice.  Comments/objections?
>
> Yeah, please fix this performance regression in the 8.3 branch. This
> would affect most of the JDBC applications out there, I think.
>
Was the driver ever changed to take advantage of the above strategy?


Dave
> Best Regards
> Michael Paesold
>
> --
> Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-jdbc


Dave Cramer <pg@fastcrypt.com> writes:
> Was the driver ever changed to take advantage of the above strategy?

Well, it's automatic as long as you use the unnamed statement.  About
all that might need to be done on the client side is to use unnamed
statements more often in preference to named ones, and I believe that
something like that did get done in JDBC.

            regards, tom lane

Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

From
Michael Paesold
Date:
Am 01.04.2008 um 13:14 schrieb Dave Cramer:
>
> On 1-Apr-08, at 6:25 AM, Michael Paesold wrote:
>
>>
>> Am 01.04.2008 um 01:26 schrieb Tom Lane:
>>> While testing the changes I was making to Pavel's EXECUTE USING
>>> patch
>>> to ensure that parameter values were being provided to the planner,
>>> it became painfully obvious that the planner wasn't actually *doing*
>>> anything with them.  For example
>>>
>>>     execute 'select count(*) from foo where x like $1' into c using $1;
>>>
>>> wouldn't generate an indexscan when $1 was of the form 'prefix%'.
>> ...
>>> The implication of this is that 8.3 is significantly worse than 8.2
>>> in optimizing unnamed statements in the extended-Query protocol;
>>> a feature that JDBC, at least, relies on.
>>>
>>> The fix is simple: add PlannerInfo to eval_const_expressions's
>>> parameter list, as was done for estimate_expression_value.  I am
>>> slightly hesitant to do this in a stable branch, since it would
>>> break
>>> any third-party code that might be calling that function.  I doubt
>>> there
>>> is currently any production-grade code doing so, but if anyone out
>>> there
>>> is actively using those planner hooks we put into 8.3, it's
>>> conceivable
>>> this would affect them.
>>>
>>> Still, the performance regression here is bad enough that I think
>>> there
>>> is little choice.  Comments/objections?
>>
>> Yeah, please fix this performance regression in the 8.3 branch.
>> This would affect most of the JDBC applications out there, I think.
>>
> Was the driver ever changed to take advantage of the above strategy?

IIRC, it is used in most cases with the v3 protocol, as long as you
don't set a prepare-threshold.

Best Regards
Michael Paesold

So if I write

conn.prepareStatement("select col from table where col like ?")

then setString(1,'hello%')

The driver will do

prepare foo as select col from table where col like $1

and then

execute foo('hello%')

this will take advantage of the strategy automatically ?

If so this should be changed. The driver does this all the time.

Dave

On 1-Apr-08, at 10:06 AM, Tom Lane wrote:

> Dave Cramer <pg@fastcrypt.com> writes:
>> Was the driver ever changed to take advantage of the above strategy?
>
> Well, it's automatic as long as you use the unnamed statement.  About
> all that might need to be done on the client side is to use unnamed
> statements more often in preference to named ones, and I believe that
> something like that did get done in JDBC.
>
>             regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


On Tue, 01 Apr 2008 16:06:01 +0200, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Dave Cramer <pg@fastcrypt.com> writes:
>> Was the driver ever changed to take advantage of the above strategy?
>
> Well, it's automatic as long as you use the unnamed statement.  About
> all that might need to be done on the client side is to use unnamed
> statements more often in preference to named ones, and I believe that
> something like that did get done in JDBC.
>
>             regards, tom lane
>

PHP is also affected if you use pg_query_params...
Syntax : pg_query_params( "SQL with $ params", array( parameters )

Note that value is TEXT, indexed, there are 100K rows in table.

pg_query( "SELECT * FROM test WHERE id =12345" ); 1 rows in
0.15931844711304 ms
pg_query( "SELECT * FROM test WHERE value LIKE '1234%'" ); 11 rows in
0.26795864105225 ms

pg_query_params( "SELECT * FROM test WHERE id =$1", array( 12345 ) ); 1
rows in 0.16618013381958 ms
pg_query_params( "SELECT * FROM test WHERE value LIKE $1", array( '1234%'
)); 11 rows in 40.66633939743 ms

Last query does not use index.
However since noone uses pg_query_params in PHP (since PHP coders just
LOVE to manually escape their strings, or worse use magicquotes), noone
should notice ;)

On Tue, 1 Apr 2008, Guillaume Smet wrote:

> A good answer is probably to plan optional JDBC benchmarks in the
> benchfarm design - not all people want to run Java on their boxes but
> we have servers of our own to do so.

The original pgbench was actually based on an older test named JDBCbench.
That code is kind of old and buggy at this point.  But with some care and
cleanup it's possible to benchmark not only relative Java performance with
it, but you can compare it with pgbench running the same queries on the
same tables to see how much overhead going through Java is adding.

Original code at http://mmmysql.sourceforge.net/performance/ , there's
also some improved versions at
http://developer.mimer.com/features/feature_16.htm

I'm not sure if all of those changes are net positive for PostgreSQL
though, they weren't last time I played with this.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

From
"Guillaume Smet"
Date:
On Wed, Apr 2, 2008 at 2:05 AM, Greg Smith <gsmith@gregsmith.com> wrote:
>  I'm not sure if all of those changes are net positive for PostgreSQL
>  though, they weren't last time I played with this.

I fixed most of the bugs of JDBCBench I found when I benchmarked
Sequoia a long time ago. Totally forgot about it. I'll see if I can
find the sources somewhere.

--
Guillaume

Guillaume,

I for one would be very interested in the JDBCBench code.

Dave
On 1-Apr-08, at 8:35 PM, Guillaume Smet wrote:

> On Wed, Apr 2, 2008 at 2:05 AM, Greg Smith <gsmith@gregsmith.com>
> wrote:
>> I'm not sure if all of those changes are net positive for PostgreSQL
>> though, they weren't last time I played with this.
>
> I fixed most of the bugs of JDBCBench I found when I benchmarked
> Sequoia a long time ago. Totally forgot about it. I'll see if I can
> find the sources somewhere.
>
> --
> Guillaume
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

From
"Guillaume Smet"
Date:
On Wed, Apr 2, 2008 at 2:53 AM, Dave Cramer <pg@fastcrypt.com> wrote:
>  I for one would be very interested in the JDBCBench code.

OK, I didn't make anything fancy, I just fixed the problem I
encountered when profiling Sequoia (I mostly used it as an injector).

I'll post the code tomorrow if I can find it somewhere (I lost a
couple of disks and I don't remember the box I used to run the tests).

--
Guillaume

Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

From
Thomas Burdairon
Date:
Is there any patch available for this one?

I'm encountering troubles with some JDBC queries and I'd like to test
it before asking some help on the JDBC list.

Thanks.
Tom

It's pretty easy to test.

prepare the query
and
run explain analyze on the prepared statement.

Dave
On 10-Apr-08, at 5:47 AM, Thomas Burdairon wrote:

> Is there any patch available for this one?
>
> I'm encountering troubles with some JDBC queries and I'd like to
> test it before asking some help on the JDBC list.
>
> Thanks.
> Tom
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

From
Thomas Burdairon
Date:
On 10 avr. 08, at 12:44, Dave Cramer wrote:
> It's pretty easy to test.
>
> prepare the query
> and
> run explain analyze on the prepared statement.
>
> Dave
>
Thank you Dave for your answer, this helped me to find the problem.

1 - it's not a JDBC related problem (sorry for the noise) as i could
reproduce it within psql

2 - Comparing query plans between the prepared and unprepared queries
helped me to find a useless INNER JOIN in the query. Removing it made
the query much faster and I don't have the problem anymore.

3 - The problem was coming from my query (sorry for the noise, again).

Problem solved, at least for me.
Thanks again


Now, i'm still wondering if there could be a planner problem, because
the prepared query was running 10times slower than the unprepared
one. Or maybe a bad configuration on my side

Just for the story here are the different plans :

The prepared query :
http://rafb.net/p/Tn9g6X27.html

the same, unprepared :
http://rafb.net/p/lutugN55.html

the prepared query, fixed :
http://rafb.net/p/HmHetm36.html




Re: How embarrassing: optimization of a one-shot query doesn't work

From
Dave Cramer
Date:
Tom,

I believe this is pretty much a show stopper for anyone using jdbc to
upgrade to 8.3.x.

Any word on 8.3.2 ?

Dave
On 31-Mar-08, at 7:26 PM, Tom Lane wrote:

> While testing the changes I was making to Pavel's EXECUTE USING patch
> to ensure that parameter values were being provided to the planner,
> it became painfully obvious that the planner wasn't actually *doing*
> anything with them.  For example
>
>     execute 'select count(*) from foo where x like $1' into c using $1;
>
> wouldn't generate an indexscan when $1 was of the form 'prefix%'.
>
> Some investigation showed that the planner is using the passed values
> for estimation purposes, but not for any purposes where the value
> *must*
> be correct (not only this LIKE-optimization, but constraint exclusion,
> for instance).  The reason is that the parameter values are made
> available to estimate_expression_value but not to
> eval_const_expressions.
> This is a thinko in a cleanup patch I made early in 8.3 development:
> http://archives.postgresql.org/pgsql-committers/2007-02/msg00352.php
> I said to myself "eval_const_expressions doesn't need any context,
> because a constant expression's value must be independent of context,
> so I can avoid changing its API".  Silly me.
>
> The implication of this is that 8.3 is significantly worse than 8.2
> in optimizing unnamed statements in the extended-Query protocol;
> a feature that JDBC, at least, relies on.
>
> The fix is simple: add PlannerInfo to eval_const_expressions's
> parameter list, as was done for estimate_expression_value.  I am
> slightly hesitant to do this in a stable branch, since it would break
> any third-party code that might be calling that function.  I doubt
> there
> is currently any production-grade code doing so, but if anyone out
> there
> is actively using those planner hooks we put into 8.3, it's
> conceivable
> this would affect them.
>
> Still, the performance regression here is bad enough that I think
> there
> is little choice.  Comments/objections?
>
>             regards, tom lane
>
> --
> Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-jdbc


Re: How embarrassing: optimization of a one-shot query doesn't work

From
Tom Lane
Date:
Dave Cramer <pg@fastcrypt.com> writes:
> Any word on 8.3.2 ?

Obviously, nothing is happening during PGCon ;-)

There was some discussion a week or so back about scheduling a set of
releases in early June, but it's not formally decided.

            regards, tom lane

On 23-May-08, at 9:20 AM, Tom Lane wrote:

> Dave Cramer <pg@fastcrypt.com> writes:
>> Any word on 8.3.2 ?
>
> Obviously, nothing is happening during PGCon ;-)
>
> There was some discussion a week or so back about scheduling a set of
> releases in early June, but it's not formally decided.
>

Now that PGCon is over has there been any more discussion ?

Dave

Dave Cramer <pg@fastcrypt.com> writes:
> On 23-May-08, at 9:20 AM, Tom Lane wrote:
>> There was some discussion a week or so back about scheduling a set of
>> releases in early June, but it's not formally decided.

> Now that PGCon is over has there been any more discussion ?

Yeah, I just posted an announcement about it on -hackers.

            regards, tom lane

Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

From
"Jignesh K. Shah"
Date:
Are there any head fixes proposed for it?

I am seeing some scaling problems with EAStress which uses JDBC with
8.3.0 and this one could be the reason why I am seeing some problems.. I
will be happy to try it out and report on it.. The setup is ready right
now if someone can point me to a patch that I can try it out and
hopefully see if the patch fixes my problem.

-Jignesh


Dave Cramer wrote:
> It's pretty easy to test.
>
> prepare the query
> and
> run explain analyze on the prepared statement.
>
> Dave
> On 10-Apr-08, at 5:47 AM, Thomas Burdairon wrote:
>
>> Is there any patch available for this one?
>>
>> I'm encountering troubles with some JDBC queries and I'd like to test
>> it before asking some help on the JDBC list.
>>
>> Thanks.
>> Tom
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

From
"Jignesh K. Shah"
Date:

Tom Lane wrote:
> "Jignesh K. Shah" <J.K.Shah@Sun.COM> writes:
>
>> Are there any head fixes proposed for it?
>>
>
> It's been fixed in CVS for a month.  We just haven't pushed a release yet.
>

Let me try it out and see what I find out in my EAStress workload.

Regards,
Jignesh


"Jignesh K. Shah" <J.K.Shah@Sun.COM> writes:
> Are there any head fixes proposed for it?

It's been fixed in CVS for a month.  We just haven't pushed a release yet.

            regards, tom lane