Thread: Bound parameters on Linux are extremely slow (compared to unbound and/or Windows)

Bound parameters on Linux are extremely slow (compared to unbound and/or Windows)

From
"Reuven M. Lerner"
Date:
Hi, everyone.  A project I'm assisting is connecting to PostgreSQL (9.2)
via JDBC.    We're experiencing extremely slow query execution, for
precisely the same queries, depending on (a) operating system and (b)
whether we use bind parameters.

We're using two different Red Hat Linux systems (one running RHEL 5, and
the other RHEL 6).  The machines have 16 GB and 64 GB RAM, respectively,
with 4+ GB allocated to shared buffers on each one.  However, given that
the performance difference is between psql and JDBC, I doubt that these
are significant factors.

We have two different Java programs (which do 1,000 inserts), each of
which we've run 10 times on each of Windows and Linux.   The average
time needed for running each program is:

Windows: 232 ms with binding, 152 ms without binding
Linux: 2,233 ms (yes, 2.233 *seconds*) with binding, 176 ms without binding

We've also run the same 1,000 inserts via psql on a remote machine, and
the timing was quite fast (i.e., like what we saw on the Windows machine).

The combination of binding + Linux seems to be the problematic one here,
but I'm at a loss for why.

I can provide the test programs if they'll help, but I'm first wondering
where the problem might lie.

Any suggestions or ideas?  What additional information (e.g., the test
programs) would help to track this problem down?

Thanks in advance,

Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
http://lerner.co.il * Skype/AIM: reuvenlerner


The problem lies in the fact that JDBC uses an unnamed statement to do the query. The planner can't plan these very well as the parameters are not available at plan time.

This does not explain the slowness on linux however. Is postgresql tuned on linux, or windows ?

In order to test this in psql you would have to use a prepared statement such as 

Prepare foo as select * from bar where col = $1
execute foo('blah')


You can also do explain analyze on execute foo ('blah') to see the difference in the plans

Dave

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


On Sat, May 25, 2013 at 6:52 PM, Reuven M. Lerner <reuven@lerner.co.il> wrote:
Hi, everyone.  A project I'm assisting is connecting to PostgreSQL (9.2) via JDBC.    We're experiencing extremely slow query execution, for precisely the same queries, depending on (a) operating system and (b) whether we use bind parameters.

We're using two different Red Hat Linux systems (one running RHEL 5, and the other RHEL 6).  The machines have 16 GB and 64 GB RAM, respectively, with 4+ GB allocated to shared buffers on each one.  However, given that the performance difference is between psql and JDBC, I doubt that these are significant factors.

We have two different Java programs (which do 1,000 inserts), each of which we've run 10 times on each of Windows and Linux.   The average time needed for running each program is:

Windows: 232 ms with binding, 152 ms without binding
Linux: 2,233 ms (yes, 2.233 *seconds*) with binding, 176 ms without binding

We've also run the same 1,000 inserts via psql on a remote machine, and the timing was quite fast (i.e., like what we saw on the Windows machine).

The combination of binding + Linux seems to be the problematic one here, but I'm at a loss for why.

I can provide the test programs if they'll help, but I'm first wondering where the problem might lie.

Any suggestions or ideas?  What additional information (e.g., the test programs) would help to track this problem down?

Thanks in advance,

Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
http://lerner.co.il * Skype/AIM: reuvenlerner


--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Hi, Dave.  Thanks for the response.  You wrote:

> The problem lies in the fact that JDBC uses an unnamed statement to do
> the query. The planner can't plan these very well as the parameters are
> not available at plan time.

Is there any way to avoid this, and to give the planner more hints?  Is
there an option in JDBC that can give us a more direct connection?
>
> This does not explain the slowness on linux however. Is postgresql tuned
> on linux, or windows ?

The tuning suggestions that I've given were pretty plain overall, and
mostly involved changing the amounts given to shared_buffers and
effective_cache_size.  I know that I gave them suggestions for the Linux
box, so if any tuning was done, it was actually there.  I was quite
surprised to hear that they were even trying it on Windows, let alone
that it gave significantly better performance.

On the Linux side, they've switched from ext3 to (non-journaled) ext2,
to improve performance.  Execution is excellent on Linux, except via
JDBC and bound parameters.

Any other suggestions regarding the differences between Windows and
Linux?  As you can imagine, this is really puzzling me (er, us).

Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
http://lerner.co.il * Skype/AIM: reuvenlerner




Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


On Sat, May 25, 2013 at 7:21 PM, Reuven M. Lerner <reuven@lerner.co.il> wrote:
Hi, Dave.  Thanks for the response.  You wrote:

The problem lies in the fact that JDBC uses an unnamed statement to do
the query. The planner can't plan these very well as the parameters are
not available at plan time.

Is there any way to avoid this, and to give the planner more hints?

No, not really, pg doesn't have hints.
 
 Is there an option in JDBC that can give us a more direct connection?

Yes but you will have no protection from sql injection. You can use protocol 2  add protocolVersion=2 to the url

This does not explain the slowness on linux however. Is postgresql tuned
on linux, or windows ?

The tuning suggestions that I've given were pretty plain overall, and mostly involved changing the amounts given to shared_buffers and effective_cache_size.  I know that I gave them suggestions for the Linux box, so if any tuning was done, it was actually there.  I was quite surprised to hear that they were even trying it on Windows, let alone that it gave significantly better performance.

OK, so 25% memory to shared memory and 75%effective cache, you should also lower randam page cost as this is very conservative. 

On the Linux side, they've switched from ext3 to (non-journaled) ext2, to improve performance.  Execution is excellent on Linux, except via JDBC and bound parameters.

This is not advisable, the difference is minimal the risk is great. They would be better with ex4.

Can you get the result of explain analyze for prepare/execute and the non prepared statement from psql?


Any other suggestions regarding the differences between Windows and Linux?  As you can imagine, this is really puzzling me (er, us).

 Are the boxes really the same hardware wise ?

Dave


Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
http://lerner.co.il * Skype/AIM: reuvenlerner

Thanks for the suggestions, Dave; I'm traveling, but will get the added
data and post to the list ASAP.  I appreciate your time and ideas.

One of your answers has piqued my curiosity, though:

>       Is there an option in JDBC that can give us a more direct connection?
>
> Yes but you will have no protection from sql injection. You can use protocol 2  add protocolVersion=2 to the url
>

We don't need to worry about SQL injection in this particular
application, so far as I know.  (I'll obviously check with the
developers, to make sure that it's really a zero chance, but I'm 95%
sure that it's an internal app without any external data sources.)

What else would we lose by switching to protocol version 2?  Does this
version not use parameter binding?  I'm just wondering what will and
won't work in protocol 2, and why it might be faster for our purposes.

Reuven




On Sat, May 25, 2013 at 7:47 PM, Reuven M. Lerner <reuven@lerner.co.il> wrote:
Thanks for the suggestions, Dave; I'm traveling, but will get the added data and post to the list ASAP.  I appreciate your time and ideas.

One of your answers has piqued my curiosity, though:


      Is there an option in JDBC that can give us a more direct connection?

Yes but you will have no protection from sql injection. You can use protocol 2  add protocolVersion=2 to the url


We don't need to worry about SQL injection in this particular application, so far as I know.  (I'll obviously check with the developers, to make sure that it's really a zero chance, but I'm 95% sure that it's an internal app without any external data sources.)

What else would we lose by switching to protocol version 2?  Does this version not use parameter binding?  I'm just wondering what will and won't work in protocol 2, and why it might be faster for our purposes.


Well actually at pgcon this week the developers talked about deprecating protocol 2. It does not use binding it builds the query into a string and does param replacing before executing the query.

I'd really prefer to understand this issue.

Thanks!

Dave
Humm... may be out of topic - if so, please forgive...

Are you sure that your Linux box is able to resolve network names?
Try to put all involved computers ip address in hosts file for both
computers, and give a try.
I use Linux and Windows and the slowness network I detected has been
related to name resolution.

Regards,

Edson


Em 25/05/2013 19:52, Reuven M. Lerner escreveu:
> Hi, everyone.  A project I'm assisting is connecting to PostgreSQL
> (9.2) via JDBC.    We're experiencing extremely slow query execution,
> for precisely the same queries, depending on (a) operating system and
> (b) whether we use bind parameters.
>
> We're using two different Red Hat Linux systems (one running RHEL 5,
> and the other RHEL 6).  The machines have 16 GB and 64 GB RAM,
> respectively, with 4+ GB allocated to shared buffers on each one.
> However, given that the performance difference is between psql and
> JDBC, I doubt that these are significant factors.
>
> We have two different Java programs (which do 1,000 inserts), each of
> which we've run 10 times on each of Windows and Linux.   The average
> time needed for running each program is:
>
> Windows: 232 ms with binding, 152 ms without binding
> Linux: 2,233 ms (yes, 2.233 *seconds*) with binding, 176 ms without
> binding
>
> We've also run the same 1,000 inserts via psql on a remote machine,
> and the timing was quite fast (i.e., like what we saw on the Windows
> machine).
>
> The combination of binding + Linux seems to be the problematic one
> here, but I'm at a loss for why.
>
> I can provide the test programs if they'll help, but I'm first
> wondering where the problem might lie.
>
> Any suggestions or ideas?  What additional information (e.g., the test
> programs) would help to track this problem down?
>
> Thanks in advance,
>
> Reuven
>



Just to make certain that we understand the setup. :-)

Machine:
A: PG-server OS: ?
B: Application client. Linux
C: Application client. Windows xx

--
Kim


On 26-05-2013 00:52, Reuven M. Lerner wrote:
> Hi, everyone.  A project I'm assisting is connecting to PostgreSQL
> (9.2) via JDBC.    We're experiencing extremely slow query execution,
> for precisely the same queries, depending on (a) operating system and
> (b) whether we use bind parameters.
>
> We're using two different Red Hat Linux systems (one running RHEL 5,
> and the other RHEL 6).  The machines have 16 GB and 64 GB RAM,
> respectively, with 4+ GB allocated to shared buffers on each one.
> However, given that the performance difference is between psql and
> JDBC, I doubt that these are significant factors.
>
> We have two different Java programs (which do 1,000 inserts), each of
> which we've run 10 times on each of Windows and Linux.   The average
> time needed for running each program is:
>
> Windows: 232 ms with binding, 152 ms without binding
> Linux: 2,233 ms (yes, 2.233 *seconds*) with binding, 176 ms without
> binding
>
> We've also run the same 1,000 inserts via psql on a remote machine,
> and the timing was quite fast (i.e., like what we saw on the Windows
> machine).
>
> The combination of binding + Linux seems to be the problematic one
> here, but I'm at a loss for why.
>
> I can provide the test programs if they'll help, but I'm first
> wondering where the problem might lie.
>
> Any suggestions or ideas?  What additional information (e.g., the test
> programs) would help to track this problem down?
>
> Thanks in advance,
>
> Reuven
>



Hi, everyone.  About a month ago, I e-mailed this list about a project
I'm helping which is having hugely different performance results
depending on whether they use Linux or Windows.  Several list members
were nice enough to offer some answers, but then I disappeared for a
while, thanks to travel + illness + family + crazy schedules.

Just to recap: We have an application written in Java.  When we build
the SQL query manually, using a StringBuilder, we get similar results on
Windows and Linux.  But when we use the JDBC driver's parameter
bindings, we find wildly different performance.  Specifically:

With binding:
    Linux: 2,233 ms (yes, more than 2 seconds)
    Windows: 232 ms

Without binding:
    Linux: 176 ms
    Windows: 152 ms

The two test pieces of code, which we used to benchmark performance, and
which work on the table we want to use in production, are at:

https://gist.github.com/reuven/5858095  # without bindings
https://gist.github.com/reuven/5858092  # with bindings

I believe that the tests were run on localhost, such that the client and
server were both executing on the same computer.

When we execute the query manually, via psql, we get fast results, on
both Windows and Linux.  Thus, it would seem that something in the
parameter-binding code in the JDBC driver is somehow interacting poorly
with Linux.

I'm far from a Java expert, and even less of a Windows guy, so I'm not
sure what is going on here.  Any suggestions as to what we should be
looking for, or tuning in the PostgreSQL and/or Windows configuration to
avoid such problems?

Thanks again for any help that you can offer.

Reuven


Reuven,

Postgresql version < 9.2 suffers from a problem with bound parameters where the planner does not choose a good plan with bound parameters. If I were a betting man I would bet that your linux box is not running version > 9.2

select version() will confirm

Cheers,

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


On Tue, Jun 25, 2013 at 8:43 AM, Reuven M. Lerner <reuven@lerner.co.il> wrote:
Hi, everyone.  About a month ago, I e-mailed this list about a project I'm helping which is having hugely different performance results depending on whether they use Linux or Windows.  Several list members were nice enough to offer some answers, but then I disappeared for a while, thanks to travel + illness + family + crazy schedules.

Just to recap: We have an application written in Java.  When we build the SQL query manually, using a StringBuilder, we get similar results on Windows and Linux.  But when we use the JDBC driver's parameter bindings, we find wildly different performance.  Specifically:

With binding:
        Linux: 2,233 ms (yes, more than 2 seconds)
        Windows: 232 ms

Without binding:
        Linux: 176 ms
        Windows: 152 ms

The two test pieces of code, which we used to benchmark performance, and which work on the table we want to use in production, are at:

https://gist.github.com/reuven/5858095  # without bindings
https://gist.github.com/reuven/5858092  # with bindings

I believe that the tests were run on localhost, such that the client and server were both executing on the same computer.

When we execute the query manually, via psql, we get fast results, on both Windows and Linux.  Thus, it would seem that something in the parameter-binding code in the JDBC driver is somehow interacting poorly with Linux.

I'm far from a Java expert, and even less of a Windows guy, so I'm not sure what is going on here.  Any suggestions as to what we should be looking for, or tuning in the PostgreSQL and/or Windows configuration to avoid such problems?

Thanks again for any help that you can offer.

Reuven

Hi, Dave. You wrote:
>
> Postgresql version < 9.2 suffers from a problem with bound parameters
> where the planner does not choose a good plan with bound parameters. If
> I were a betting man I would bet that your linux box is not running
> version > 9.2
>
> select version() will confirm

We're actually running 9.2.  But your note only talked about < 9.2 and >
9.2, so I'm not sure where that leaves us...

Should we try the 9.3 beta?

Also, would that account for the differences between Windows and Linux?

Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
http://lerner.co.il * Skype/AIM: reuvenlerner


Reuven,

Sorry, my mistake. >= 9.2

And no that should not account for the differences

Any chance I can login to the box ? Failing that you will need to try to get the query plan

Bound parameters can be replicated using prepare and execute


prepare the query then run 

explain analyze on the execute to see the differences.



Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


On Tue, Jun 25, 2013 at 8:55 AM, Reuven M. Lerner <reuven@lerner.co.il> wrote:
Hi, Dave. You wrote:

Postgresql version < 9.2 suffers from a problem with bound parameters
where the planner does not choose a good plan with bound parameters. If
I were a betting man I would bet that your linux box is not running
version > 9.2

select version() will confirm

We're actually running 9.2.  But your note only talked about < 9.2 and > 9.2, so I'm not sure where that leaves us...

Should we try the 9.3 beta?

Also, would that account for the differences between Windows and Linux?


Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
http://lerner.co.il * Skype/AIM: reuvenlerner

Hi, Dave.  No, there's no way for you to log into the box -- heck,
there's no way for me to log in there either, unless I'm on-site.

I'm hoping that the on-site folks can run PREPARE and EXPLAIN ANALYZE on
EXECUTE to figure out what's going on here.  As soon as I get that from
them, I'll pass it along.

Thanks again for your suggestions and help.

Reuven


Not surprising

Let me know the results

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


On Tue, Jun 25, 2013 at 9:12 AM, Reuven M. Lerner <reuven@lerner.co.il> wrote:
Hi, Dave.  No, there's no way for you to log into the box -- heck, there's no way for me to log in there either, unless I'm on-site.

I'm hoping that the on-site folks can run PREPARE and EXPLAIN ANALYZE on EXECUTE to figure out what's going on here.  As soon as I get that from them, I'll pass it along.

Thanks again for your suggestions and help.

Reuven