Re: Trouble with plan statistics for behaviour for query. - Mailing list pgsql-performance

From Craig James
Subject Re: Trouble with plan statistics for behaviour for query.
Date
Msg-id CAFwQ8rdiCgdqjjgLNKirHTuZP_bWX_+aXrkqXyWPqg1+-0BNgQ@mail.gmail.com
Whole thread Raw
In response to Re: Trouble with plan statistics for behaviour for query.  (Trevor Campbell <tcampbell@atlassian.com>)
List pgsql-performance
On Thu, May 31, 2012 at 4:34 PM, Trevor Campbell <tcampbell@atlassian.com> wrote:
Thanks Craig, that certainly leads down the right path.

The following is all done in pgAdmin3:

Using an actual value we I get the plan I expect
explain analyze select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE, CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
  from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID = CI.GROUPID where CG.ISSUEID=10006 order by CG.CREATED asc, CI.ID asc

"Sort  (cost=106.18..106.22 rows=13 width=434) (actual time=0.115..0.115 rows=12 loops=1)"
"  Sort Key: cg.created, ci.id"
"  Sort Method: quicksort  Memory: 29kB"
"  ->  Nested Loop  (cost=0.00..105.94 rows=13 width=434) (actual time=0.019..0.067 rows=12 loops=1)"
"        ->  Index Scan using chggroup_issue on changegroup cg  (cost=0.00..19.73 rows=10 width=29) (actual time=0.009..0.013 rows=10 loops=1)"
"              Index Cond: (issueid = 10006::numeric)"
"        ->  Index Scan using chgitem_chggrp on changeitem ci  (cost=0.00..8.58 rows=3 width=411) (actual time=0.004..0.005 rows=1 loops=10)"
"              Index Cond: (groupid = cg.id)"
"Total runtime: 0.153 ms"

Using a prepared statement with a variable , I get a poor plan requiring a sequential scan
prepare t2(real) as
 select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE, CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
  from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID = CI.GROUPID where CG.ISSUEID=$1 order by CG.CREATED asc, CI.ID asc;

 explain analyze execute t2 (10006);

"Sort  (cost=126448.89..126481.10 rows=12886 width=434) (actual time=1335.615..1335.616 rows=12 loops=1)"
"  Sort Key: cg.created, ci.id"
"  Sort Method: quicksort  Memory: 29kB"
"  ->  Nested Loop  (cost=0.00..125569.19 rows=12886 width=434) (actual time=0.046..1335.556 rows=12 loops=1)"
"        ->  Seq Scan on changegroup cg  (cost=0.00..44709.26 rows=10001 width=29) (actual time=0.026..1335.460 rows=10 loops=1)"
"              Filter: ((issueid)::double precision = $1)"
"        ->  Index Scan using chgitem_chggrp on changeitem ci  (cost=0.00..8.05 rows=3 width=411) (actual time=0.007..0.008 rows=1 loops=10)"
"              Index Cond: (groupid = cg.id)"
"Total runtime: 1335.669 ms"

Using a prepared statement with a cast of the variable to the right type, I get the good plan back
prepare t2(real) as
 select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE, CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
  from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID = CI.GROUPID where CG.ISSUEID=cast($1 as numeric) order by CG.CREATED asc, CI.ID asc;

 explain analyze execute t2 (10006);

"Sort  (cost=106.19..106.22 rows=13 width=434) (actual time=0.155..0.156 rows=12 loops=1)"
"  Sort Key: cg.created, ci.id"
"  Sort Method: quicksort  Memory: 29kB"
"  ->  Nested Loop  (cost=0.00..105.95 rows=13 width=434) (actual time=0.048..0.111 rows=12 loops=1)"
"        ->  Index Scan using chggroup_issue on changegroup cg  (cost=0.00..19.73 rows=10 width=29) (actual time=0.031..0.042 rows=10 loops=1)"
"              Index Cond: (issueid = ($1)::numeric)"
"        ->  Index Scan using chgitem_chggrp on changeitem ci  (cost=0.00..8.58 rows=3 width=411) (actual time=0.006..0.006 rows=1 loops=10)"
"              Index Cond: (groupid = cg.id)"
"Total runtime: 0.203 ms"

Now the challenge is to get java/jdbc to get this done right.  We make a big effort to ensure we always use prepared statements and variable bindings to help protect from SQL injection vulnerabilities.

JDBC has some features that are supposed to be convenient (automatic preparing based on a number-of-executions threshold) that strike me as misguided.  It's one thing to hide irrelevant details from the app, and another thing entirely to cause a huge change in the exact SQL that's sent to the server ... which is what JDBC seems to do.

I think the trick is that if you use JDBC prepared statements, you have to understand how it's trying to be trickly and circumvent it so that you're always in full control of what it's doing.

Craig
 



On 01/06/12 09:08, Craig James wrote:
I use Perl, not JDBC, but this thread may be relevant to your problem.

http://postgresql.1045698.n5.nabble.com/Slow-statement-when-using-JDBC-td3368379.html



pgsql-performance by date:

Previous
From: Trevor Campbell
Date:
Subject: Re: Trouble with plan statistics for behaviour for query.
Next
From: Vitalii Tymchyshyn
Date:
Subject: Re: Trouble with plan statistics for behaviour for query.