Thread: Slow query after upgrades

Slow query after upgrades

From
Tom Duffey
Date:
Hi All,

We recently upgraded both our PostgreSQL server version (8.1.10 to
8.2.5) and JDBC driver version (pg74.216.jdbc3, yeah, I know... to
8.2-507 JDBC4).  Our application has several components and all but
one are working great.  Unfortunately, one of them is running much
slower when I use the new JDBC driver.  If I leave everything else
exactly the same but roll back to the pg74.216.jdbc3 driver
performance improves quite a bit.  By quite a bit I mean a process
that takes about 15 minutes to run with the old JDBC driver takes over
an hour and a half with the new driver.

I'm leaving the old JDBC driver there for now but am wondering if
anyone has any ideas what I should look for to find a solution to
this?  I jProfiled the execution and it is indeed the JDBC query that
is taking significantly more time to execute with the newer driver in
place.

Best Regards,

Tom

Re: Slow query after upgrades

From
Oliver Jowett
Date:
Tom Duffey wrote:

> I'm leaving the old JDBC driver there for now but am wondering if anyone
> has any ideas what I should look for to find a solution to this?  I
> jProfiled the execution and it is indeed the JDBC query that is taking
> significantly more time to execute with the newer driver in place.

Identify the slow query and build a testcase for that particular query
and send it to the list?

Also, beware of counting blocking I/O time as CPU time when doing Java
profiling. Probably, the Java side is just blocking waiting on the server.

There are server-side logging options you might want to look at too.

The most likely cause is that more recent drivers ("recent" is very
relative here, I think it went in for the 8.0 driver?) use the extended
query protocol which can result in different plans for some queries.

-O

Re: Slow query after upgrades

From
Tom Duffey
Date:
On Dec 13, 2007, at 3:20 PM, Oliver Jowett wrote:

> Tom Duffey wrote:
>
>> I'm leaving the old JDBC driver there for now but am wondering if
>> anyone has any ideas what I should look for to find a solution to
>> this?  I jProfiled the execution and it is indeed the JDBC query
>> that is taking significantly more time to execute with the newer
>> driver in place.
>
> Identify the slow query and build a testcase for that particular
> query and send it to the list?
>
> Also, beware of counting blocking I/O time as CPU time when doing
> Java profiling. Probably, the Java side is just blocking waiting on
> the server.
>
> There are server-side logging options you might want to look at too.
>
> The most likely cause is that more recent drivers ("recent" is very
> relative here, I think it went in for the 8.0 driver?) use the
> extended query protocol which can result in different plans for some
> queries.

The odd thing is that the same query is used in other components of
the application and we have not noticed any issues.  This led me to
believe that something about this particular piece is causing the
problem, possibly external to the query or the JDBC driver.  Running
the query in psql, for example, doesn't seem slow at all.  So maybe
it's not a PostgreSQL problem at all but I thought someone might have
seen this and have some insight into where I should look.

Is there some logging I can use to compare what is happening on the
server side when this query is run by the app with the problem vs.
something else (psql or another of our applications)?

Tom

Re: Slow query after upgrades

From
"Guillaume Smet"
Date:
On Dec 13, 2007 10:28 PM, Tom Duffey <tduffey@techbydesign.com> wrote:
> The odd thing is that the same query is used in other components of
> the application and we have not noticed any issues.  This led me to
> believe that something about this particular piece is causing the
> problem, possibly external to the query or the JDBC driver.  Running
> the query in psql, for example, doesn't seem slow at all.  So maybe
> it's not a PostgreSQL problem at all but I thought someone might have
> seen this and have some insight into where I should look.

Usual questions: did you analyze your database after loading the dump?

One of the problem you can meet is that the new driver uses protocol
v3 by default, which means that your query are prepared and planned
without the value of the parameters as it was the case using v2 of the
protocol. This can lead in bad plans in specific cases.
You may consider using protocol v2 again if it's your problem by
adding protocolVersion=2 to your JDBC connection URL.

> Is there some logging I can use to compare what is happening on the
> server side when this query is run by the app with the problem vs.
> something else (psql or another of our applications)?

Use log_min_duration_statement in your postgresql.conf (0 if you want
to log every queries).

--
Guillaume

Re: Slow query after upgrades

From
Tom Duffey
Date:
On Dec 13, 2007, at 3:58 PM, Guillaume Smet wrote:

> On Dec 13, 2007 10:28 PM, Tom Duffey <tduffey@techbydesign.com> wrote:
>> The odd thing is that the same query is used in other components of
>> the application and we have not noticed any issues.  This led me to
>> believe that something about this particular piece is causing the
>> problem, possibly external to the query or the JDBC driver.  Running
>> the query in psql, for example, doesn't seem slow at all.  So maybe
>> it's not a PostgreSQL problem at all but I thought someone might have
>> seen this and have some insight into where I should look.
>
> Usual questions: did you analyze your database after loading the dump?

Yes.

> One of the problem you can meet is that the new driver uses protocol
> v3 by default, which means that your query are prepared and planned
> without the value of the parameters as it was the case using v2 of the
> protocol. This can lead in bad plans in specific cases.
> You may consider using protocol v2 again if it's your problem by
> adding protocolVersion=2 to your JDBC connection URL.
>
>> Is there some logging I can use to compare what is happening on the
>> server side when this query is run by the app with the problem vs.
>> something else (psql or another of our applications)?
>
> Use log_min_duration_statement in your postgresql.conf (0 if you want
> to log every queries).

This seems to be exactly what is happening.  The old driver executes
the query in 2.6ms while the new driver takes 63516.5ms.  It's worth
pointing out that this is a large table containing 310646676 rows and
growing.

Here is the log output when executing the query in psql:

LOG:  duration: 15.771 ms  statement: SELECT point_history.point_id,
point_history.timestamp, point_history.value, point_history.status
FROM point_history WHERE point_history.point_id = 21001 AND poi
nt_history.timestamp >= '2007-12-11 22:22:45' ORDER BY
point_history.timestamp DESC;

Then the pg74.216.jdbc3 driver:

LOG:  duration: 2.682 ms  statement:        SELECT
point_history.point_id, point_history.timestamp, point_history.value,
point_history.status FROM point_history WHERE point_history.point_id =
21001 AND                    point_history.timestamp >= '2007-12-11
22:27:41'

And the 8.2 JDBC4 driver:

LOG:  duration: 1.047 ms  parse <unnamed>:          SELECT
point_history.point_id, point_history.timestamp, point_history.value,
point_history.status FROM point_history WHERE point_history.point_id =
$1 AND point_history.timestamp >= $2 ORDER BY point_history.timestamp
DESC
LOG:  duration: 0.335 ms  bind <unnamed>:       SELECT
point_history.point_id, point_history.timestamp, point_history.value,
point_history.status FROM point_history WHERE point_history.point_id =
$1 AND                                     point_history.timestamp >=
$2 ORDER BY point_history.timestamp DESC
DETAIL:  parameters: $1 = '21001', $2 = '2007-12-11 22:22:45'
LOG:  duration: 63516.530 ms  execute <unnamed>:        SELECT
point_history.point_id, point_history.timestamp, point_history.value,
point_history.status FROM point_history WHERE point_history.point_id =
$1 AND                                     point_history.timestamp >=
$2 ORDER BY point_history.timestamp DESC
DETAIL:  parameters: $1 = '21001', $2 = '2007-12-11 22:22:45'

Here's what explain has to say about this query:

foo=> EXPLAIN SELECT point_history.point_id, point_history.timestamp,
point_history.value, point_history.status FROM point_history WHERE
point_history.point_id = 21001 AND point_history.timestamp >=
'2007-12-11 22:22:45' ORDER BY point_history.timestamp DESC;
                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------
  Index Scan Backward using point_history_pkey on point_history
(cost=0.00..264.60 rows=58 width=20)
    Index Cond: ((point_id = 21001) AND ("timestamp" >= '2007-12-11
22:22:45'::timestamp without time zone))
(2 rows)

Any suggestions?

Tom


Re: Slow query after upgrades

From
"Guillaume Smet"
Date:
On Dec 13, 2007 11:37 PM, Tom Duffey <tduffey@techbydesign.com> wrote:
> foo=> EXPLAIN SELECT point_history.point_id, point_history.timestamp,
> point_history.value, point_history.status FROM point_history WHERE
> point_history.point_id = 21001 AND point_history.timestamp >=
> '2007-12-11 22:22:45' ORDER BY point_history.timestamp DESC;
>                                                   QUERY PLAN
> ------------------------------------------------------------------------------------------------------------
>   Index Scan Backward using point_history_pkey on point_history
> (cost=0.00..264.60 rows=58 width=20)
>     Index Cond: ((point_id = 21001) AND ("timestamp" >= '2007-12-11
> 22:22:45'::timestamp without time zone))
> (2 rows)

It's not an EXPLAIN ANALYZE so we can't see if the query is slow or fast.

If it's fast using the parameters in psql, just try to add
protocolVersion=2 to your JDBC connection URL as I told you earlier.
It should solve your problem. You won't be able to use new features of
protocol v3 but if you used 7.4 before, it shouldn't be a problem for
you.

--
Guillaume

Re: Slow query after upgrades

From
Tom Duffey
Date:
On Dec 13, 2007, at 4:52 PM, Guillaume Smet wrote:

> It's not an EXPLAIN ANALYZE so we can't see if the query is slow or
> fast.


QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------
  Index Scan Backward using point_history_pkey on point_history
(cost=0.00..264.60 rows=58 width=20) (actual time=0.131..1.596
rows=192 loops=1)
    Index Cond: ((point_id = 21001) AND ("timestamp" >= '2007-12-11
22:22:45'::timestamp without time zone))
  Total runtime: 1.846 ms

> If it's fast using the parameters in psql, just try to add
> protocolVersion=2 to your JDBC connection URL as I told you earlier.
> It should solve your problem. You won't be able to use new features of
> protocol v3 but if you used 7.4 before, it shouldn't be a problem for
> you.

This makes a huge difference, thanks.  However, is there anything I
can do to help improve the performance of that query when using a
newer protocol?

Tom

Re: Slow query after upgrades

From
Oliver Jowett
Date:
Tom Duffey wrote:

> This makes a huge difference, thanks.  However, is there anything I can
> do to help improve the performance of that query when using a newer
> protocol?

We need to work out what is going wrong under the newer protocol first,
which unfortunately can be a bit tricky because psql doesn't use the
query protocol in the same way as the JDBC driver.

Can you write a little JDBC app that runs the query with EXPLAIN
ANALYZE, setting parameters in the same way that your real app does, and
dump out the resultset it gets back? (which should be the EXPLAIN
results rather than the real data, IIRC)

-O

Re: Slow query after upgrades

From
Tom Duffey
Date:
On Dec 13, 2007, at 5:05 PM, Oliver Jowett wrote:

> Tom Duffey wrote:
>
>> This makes a huge difference, thanks.  However, is there anything I
>> can do to help improve the performance of that query when using a
>> newer protocol?
>
> We need to work out what is going wrong under the newer protocol
> first, which unfortunately can be a bit tricky because psql doesn't
> use the query protocol in the same way as the JDBC driver.
>
> Can you write a little JDBC app that runs the query with EXPLAIN
> ANALYZE, setting parameters in the same way that your real app does,
> and dump out the resultset it gets back? (which should be the
> EXPLAIN results rather than the real data, IIRC)

QUERY PLAN=Sort  (cost=289075.73..289147.11 rows=28554 width=20)
(actual time=141222.112..141222.294 rows=192 loops=1)
QUERY PLAN= Sort Key: "timestamp"
QUERY PLAN=  ->  Bitmap Heap Scan on point_history
(cost=1847.73..286962.53 rows=28554 width=20) (actual
time=140998.758..141221.691 rows=192 loops=1)
QUERY PLAN=        Recheck Cond: (point_id = 21001)
QUERY PLAN=        Filter: (("timestamp")::text >= '2007-12-11
23:20:57'::text)
QUERY PLAN=        ->  Bitmap Index Scan on point_history_pkey
(cost=0.00..1840.59 rows=85663 width=0) (actual time=122.393..122.393
rows=65262 loops=1)
QUERY PLAN=              Index Cond: (point_id = 21001)
QUERY PLAN=Total runtime: 141222.791 ms

Cool, this makes it look like the problem is that the "timestamp"
parameter as a string instead of a timestamp.  What do you think?

Tom

Re: Slow query after upgrades

From
Tom Lane
Date:
Oliver Jowett <oliver@opencloud.com> writes:
> Tom Duffey wrote:
>> This makes a huge difference, thanks.  However, is there anything I can
>> do to help improve the performance of that query when using a newer
>> protocol?

> We need to work out what is going wrong under the newer protocol first,

It looks pretty obvious from the peanut gallery: in the parameterized
query, the planner daren't choose an indexscan, because for the vast
majority of the possible values of the parameter an indexscan would
suck.

Tom's apparently only interested in the case where the parameter is
close to the end of the range, so that only a few rows need to be
retrieved.  In this case the indexscan wins big, but the planner can't
count on that.

A possible hack is to put a reasonably small LIMIT on the query.

            regards, tom lane

Re: Slow query after upgrades

From
Oliver Jowett
Date:
Tom Lane wrote:

> It looks pretty obvious from the peanut gallery: in the parameterized
> query, the planner daren't choose an indexscan, because for the vast
> majority of the possible values of the parameter an indexscan would
> suck.

The JDBC driver is using an unnamed statement in this case, shouldn't
the planner end up using index selectivity estimates based on the actual
parameter values?

 From the explain output that just came through it looks like a type
mismatch problem on the timestamp parameter.

-O

Re: Slow query after upgrades

From
Oliver Jowett
Date:
Tom Duffey wrote:

> Cool, this makes it look like the problem is that the "timestamp"
> parameter as a string instead of a timestamp.  What do you think?

Yep, looks like it. How exactly are you setting that timestamp parameter?

-O


Re: Slow query after upgrades

From
Tom Duffey
Date:
On Dec 13, 2007, at 5:57 PM, Oliver Jowett wrote:

> Tom Duffey wrote:
>
>> Cool, this makes it look like the problem is that the "timestamp"
>> parameter as a string instead of a timestamp.  What do you think?
>
> Yep, looks like it. How exactly are you setting that timestamp
> parameter?

I'm using iBATIS so answering exactly how the parameters get set is
tricky, but if this is indeed the problem I'm pretty sure I can take
it from here.  Thank you for helping get to the bottom of it.

Tom

Re: Slow query after upgrades

From
"Guillaume Smet"
Date:
On Dec 14, 2007 12:54 AM, Oliver Jowett <oliver@opencloud.com> wrote:
> The JDBC driver is using an unnamed statement in this case, shouldn't
> the planner end up using index selectivity estimates based on the actual
> parameter values?

Tom fixed that during the 8.3 development cycle IIRC but it was not
the case in 8.2 and earlier.

--
Guillaume

Re: Slow query after upgrades

From
Tom Lane
Date:
Oliver Jowett <oliver@opencloud.com> writes:
> The JDBC driver is using an unnamed statement in this case, shouldn't
> the planner end up using index selectivity estimates based on the actual
> parameter values?

If he's using a recent enough backend, it should.

>  From the explain output that just came through it looks like a type
> mismatch problem on the timestamp parameter.

Yeah, no question.  Implicit casts to text strike again :-(.  He was
probably getting the wrong answers, not only a slow query.

            regards, tom lane

Re: Slow query after upgrades

From
Albert László-Róbert
Date:
Hi All,

i have the same problem like above, with a little difference. i read all
the messages posted on this topic, and the symptoms were the same.

On Thu, 2007-12-13 at 15:15 -0600, Tom Duffey wrote:
> Hi All,
>
> We recently upgraded both our PostgreSQL server version (8.1.10 to
> 8.2.5) and JDBC driver version (pg74.216.jdbc3, yeah, I know... to

firstly we upgraded from version 8.1.2 (8.1 [i don't know the build
number] JDBC3 Driver) to 8.2.4 (initially we used the same driver and we
upgraded to the 8.2-507 JDBC3 driver) and we noticed this problem that
the queries that were using outer joins were slow. we find out that in
8.2.4 there was a problem with the optimization and the developers
advise to use the 8.2.5 ... so we upgraded the database to 8.2.5.
i must notice that before the database was running on a dual CPU server
with 2GB of memory ... the upgraded database was installed on a 4 CPU
system with 6 GB of memory ... so we expected some speed improvements
but we were disappointed.

> 8.2-507 JDBC4).  Our application has several components and all but
>
> one are working great.  Unfortunately, one of them is running much
> slower when I use the new JDBC driver.  If I leave everything else
> exactly the same but roll back to the pg74.216.jdbc3 driver

we do not tried this because we desperately needed the case sensitive
and case insensitive improvements of the LIKE and ILIKE operators on the
Unicode character set (UTF8-hu).


> performance improves quite a bit.  By quite a bit I mean a process
> that takes about 15 minutes to run with the old JDBC driver takes over
> an hour and a half with the new driver.

the performance before were arount 1-2 minutes ... and after the upgrade
it was about 30-40 minutes and more ... but because our application is
running throw the internet using the CORBA technology in most of the
cases the database throws an org.postgresql.util.PSQLException(: An I/O
error occured while sending to the backend.) exception.

>
> I'm leaving the old JDBC driver there for now but am wondering if
> anyone has any ideas what I should look for to find a solution to
> this?  I jProfiled the execution and it is indeed the JDBC query that
> is taking significantly more time to execute with the newer driver in
> place.
>
> Best Regards,
>
> Tom
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq

i hope this will help in the solving of the problem and maybe to get a
better suggestion to solve our problem.

sorry for my english ... i'm not native :)

regards
--
László-Róbert, Albert
mobile: +40-(0)-742-874-854
office: +40-(0)-365-410-540
e-mail: albertlaszlorobert@gmail.com


Re: Slow query after upgrades

From
"Guillaume Smet"
Date:
Albert,

On Dec 14, 2007 9:55 AM, Albert László-Róbert
<albertlaszlorobert@gmail.com> wrote:
> the performance before were arount 1-2 minutes ... and after the upgrade
> it was about 30-40 minutes and more ... but because our application is
> running throw the internet using the CORBA technology in most of the
> cases the database throws an org.postgresql.util.PSQLException(: An I/O
> error occured while sending to the backend.) exception.

Did you analyze your database after upgrade?

Can you enable log_min_duration_statement to see which queries are
slower than before? You can use pgFouine to obtain an analysis of the
log file.

--
Guillaume

Re: Slow query after upgrades

From
Kris Jurka
Date:

On Fri, 14 Dec 2007, Albert L�szl�-R�bert wrote:

> the performance before were arount 1-2 minutes ... and after the upgrade
> it was about 30-40 minutes and more ... but because our application is
> running throw the internet using the CORBA technology in most of the
> cases the database throws an org.postgresql.util.PSQLException(: An I/O
> error occured while sending to the backend.) exception.
>

I don't understand why running slower results in IOExceptions.  Is the
driver actively broken, or is your app just slower?

Kris Jurka

Re: Slow query after upgrades

From
Albert László-Róbert
Date:
On Fri, 2007-12-14 at 05:24 -0500, Kris Jurka wrote:
>
> On Fri, 14 Dec 2007, Albert Lszl-Rbert wrote:
>
> > the performance before were arount 1-2 minutes ... and after the upgrade
> > it was about 30-40 minutes and more ... but because our application is
> > running throw the internet using the CORBA technology in most of the
> > cases the database throws an org.postgresql.util.PSQLException(: An I/O
> > error occured while sending to the backend.) exception.
> >
>
> I don't understand why running slower results in IOExceptions.  Is the
> driver actively broken, or is your app just slower?
>

Generally the use of the driver is OK, but with some of the queries (for
the moment we've got feedback for two heavily used queries) we had this
kind of error. When we ran the query from the console we've got the same
result (it was faster than through the JDBC driver).

> Kris Jurka
--
László-Róbert, Albert
mobile: +40-(0)-742-874-854
office: +40-(0)-365-410-540
e-mail: albertlaszlorobert@gmail.com


Re: Slow query after upgrades

From
Albert László-Róbert
Date:
On Fri, 2007-12-14 at 10:15 +0100, Guillaume Smet wrote:
> Albert,
>
> On Dec 14, 2007 9:55 AM, Albert László-Róbert
> <albertlaszlorobert@gmail.com> wrote:
> > the performance before were arount 1-2 minutes ... and after the upgrade
> > it was about 30-40 minutes and more ... but because our application is
> > running throw the internet using the CORBA technology in most of the
> > cases the database throws an org.postgresql.util.PSQLException(: An I/O
> > error occured while sending to the backend.) exception.
>
> Did you analyze your database after upgrade?

yes

>
> Can you enable log_min_duration_statement to see which queries are
> slower than before? You can use pgFouine to obtain an analysis of the
> log file.
>
> --
> Guillaume

We would try that if we hadn't to concentrate on solving the problem for
the most resource-demanding queries which currently are not working at
all.

We presume, though, that problem is around optimizing queries with large
result sub-selects joined with other ones.

--
László-Róbert, Albert
mobile: +40-(0)-742-874-854
office: +40-(0)-365-410-540
e-mail: albertlaszlorobert@gmail.com


Re: Slow query after upgrades

From
Tom Duffey
Date:
On Dec 13, 2007, at 7:03 PM, Tom Lane wrote:

> Oliver Jowett <oliver@opencloud.com> writes:
>> The JDBC driver is using an unnamed statement in this case, shouldn't
>> the planner end up using index selectivity estimates based on the
>> actual
>> parameter values?
>
> If he's using a recent enough backend, it should.
>
>> From the explain output that just came through it looks like a type
>> mismatch problem on the timestamp parameter.
>
> Yeah, no question.  Implicit casts to text strike again :-(.  He was
> probably getting the wrong answers, not only a slow query.

What do you think about adding type information to the log?  Something
like:

DETAIL:  parameters: $1 = '21001'::integer, $2 = '2007-12-11
22:22:45'::text

or similar would have revealed the source of my problem earlier.

Tom

Re: Slow query after upgrades

From
Tom Lane
Date:
Tom Duffey <tduffey@techbydesign.com> writes:
> On Dec 13, 2007, at 7:03 PM, Tom Lane wrote:
>> Yeah, no question.  Implicit casts to text strike again :-(.  He was
>> probably getting the wrong answers, not only a slow query.

> What do you think about adding type information to the log?  Something
> like:

> DETAIL:  parameters: $1 = '21001'::integer, $2 = '2007-12-11
> 22:22:45'::text

> or similar would have revealed the source of my problem earlier.

8.3 will already reveal your problem quicker ;-)

regression=# select now() < '2007-12-14 11:15:02.284223-05'::text;
ERROR:  operator does not exist: timestamp with time zone < text
LINE 1: select now() < '2007-12-14 11:15:02.284223-05'::text;
                     ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

            regards, tom lane