Re: Trouble with plan statistics for behaviour for query. - Mailing list pgsql-performance
From | Craig James |
---|---|
Subject | Re: Trouble with plan statistics for behaviour for query. |
Date | |
Msg-id | CAFwQ8rdiCgdqjjgLNKirHTuZP_bWX_+aXrkqXyWPqg1+-0BNgQ@mail.gmail.com Whole thread Raw |
In response to | Re: Trouble with plan statistics for behaviour for query. (Trevor Campbell <tcampbell@atlassian.com>) |
List | pgsql-performance |
On Thu, May 31, 2012 at 4:34 PM, Trevor Campbell <tcampbell@atlassian.com> wrote:
JDBC has some features that are supposed to be convenient (automatic preparing based on a number-of-executions threshold) that strike me as misguided. It's one thing to hide irrelevant details from the app, and another thing entirely to cause a huge change in the exact SQL that's sent to the server ... which is what JDBC seems to do.
I think the trick is that if you use JDBC prepared statements, you have to understand how it's trying to be trickly and circumvent it so that you're always in full control of what it's doing.
Craig
Thanks Craig, that certainly leads down the right path.
The following is all done in pgAdmin3:
Using an actual value we I get the plan I expect
explain analyze select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE, CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID = CI.GROUPID where CG.ISSUEID=10006 order by CG.CREATED asc, CI.ID asc
"Sort (cost=106.18..106.22 rows=13 width=434) (actual time=0.115..0.115 rows=12 loops=1)"
" Sort Key: cg.created, ci.id"
" Sort Method: quicksort Memory: 29kB"
" -> Nested Loop (cost=0.00..105.94 rows=13 width=434) (actual time=0.019..0.067 rows=12 loops=1)"
" -> Index Scan using chggroup_issue on changegroup cg (cost=0.00..19.73 rows=10 width=29) (actual time=0.009..0.013 rows=10 loops=1)"
" Index Cond: (issueid = 10006::numeric)"
" -> Index Scan using chgitem_chggrp on changeitem ci (cost=0.00..8.58 rows=3 width=411) (actual time=0.004..0.005 rows=1 loops=10)"
" Index Cond: (groupid = cg.id)"
"Total runtime: 0.153 ms"
Using a prepared statement with a variable , I get a poor plan requiring a sequential scan
prepare t2(real) as
select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE, CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID = CI.GROUPID where CG.ISSUEID=$1 order by CG.CREATED asc, CI.ID asc;
explain analyze execute t2 (10006);
"Sort (cost=126448.89..126481.10 rows=12886 width=434) (actual time=1335.615..1335.616 rows=12 loops=1)"
" Sort Key: cg.created, ci.id"
" Sort Method: quicksort Memory: 29kB"
" -> Nested Loop (cost=0.00..125569.19 rows=12886 width=434) (actual time=0.046..1335.556 rows=12 loops=1)"
" -> Seq Scan on changegroup cg (cost=0.00..44709.26 rows=10001 width=29) (actual time=0.026..1335.460 rows=10 loops=1)"
" Filter: ((issueid)::double precision = $1)"
" -> Index Scan using chgitem_chggrp on changeitem ci (cost=0.00..8.05 rows=3 width=411) (actual time=0.007..0.008 rows=1 loops=10)"
" Index Cond: (groupid = cg.id)"
"Total runtime: 1335.669 ms"
Using a prepared statement with a cast of the variable to the right type, I get the good plan back
prepare t2(real) as
select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE, CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID = CI.GROUPID where CG.ISSUEID=cast($1 as numeric) order by CG.CREATED asc, CI.ID asc;
explain analyze execute t2 (10006);
"Sort (cost=106.19..106.22 rows=13 width=434) (actual time=0.155..0.156 rows=12 loops=1)"
" Sort Key: cg.created, ci.id"
" Sort Method: quicksort Memory: 29kB"
" -> Nested Loop (cost=0.00..105.95 rows=13 width=434) (actual time=0.048..0.111 rows=12 loops=1)"
" -> Index Scan using chggroup_issue on changegroup cg (cost=0.00..19.73 rows=10 width=29) (actual time=0.031..0.042 rows=10 loops=1)"
" Index Cond: (issueid = ($1)::numeric)"
" -> Index Scan using chgitem_chggrp on changeitem ci (cost=0.00..8.58 rows=3 width=411) (actual time=0.006..0.006 rows=1 loops=10)"
" Index Cond: (groupid = cg.id)"
"Total runtime: 0.203 ms"
Now the challenge is to get java/jdbc to get this done right. We make a big effort to ensure we always use prepared statements and variable bindings to help protect from SQL injection vulnerabilities.
JDBC has some features that are supposed to be convenient (automatic preparing based on a number-of-executions threshold) that strike me as misguided. It's one thing to hide irrelevant details from the app, and another thing entirely to cause a huge change in the exact SQL that's sent to the server ... which is what JDBC seems to do.
I think the trick is that if you use JDBC prepared statements, you have to understand how it's trying to be trickly and circumvent it so that you're always in full control of what it's doing.
Craig
On 01/06/12 09:08, Craig James wrote:I use Perl, not JDBC, but this thread may be relevant to your problem.
http://postgresql.1045698.n5.nabble.com/Slow-statement-when-using-JDBC-td3368379.html
pgsql-performance by date: