Thread: odd jdbc driver synchronization issue

odd jdbc driver synchronization issue

From
"George Lessmann"
Date:
Hi,

I'm trying to move my database platform from MSSQL Server to PostgreSQL.
I have a load simulation program I wrote which I'm running against a
test PostgreSQL instance. The application creates a number of client
threads, each with their own database connection, and turns them loose
on a workload. Currently, it's lagging the SQLServer test. I'm still
tuning the postgresql.conf file, but I also noticed something very odd
wrt the postmaster process. I see something like the following:

3 threads started, 0 done,
    pid 1        99%
    pid 2        0%
    pid 3        0%

2 threads started, 1 done,
    pid 1        0%
    pid 2        99%
    pid 3        0%

1 threads started, 2 done,
    pid 1        0%
    pid 2        0%
    pid 3        99%

Even though all the treads are runnable, with their own connection, and
their own postmaster process on the DB, they are still serializing their
writes. If I look in the table where the data is being generated, I see
rows inserted in order similar to what top is showing (Client3's data,
then Client6's data, then Client1's data, etc).

As this application is written in java and using the PostgreSQL jdbc
driver I'm starting here, if you feel I should post elsewhere, let me
know.

Thanks,

George

--Details--

Client machine:
 Windows XP, Java 1.4.2 (build 1.4.2-b28)
 PostgreSQL JDBC driver 7.4, 7.4.1 (no difference)

Database machine:
 Debian Linux 2.6.0-test11 #4 SMP i686 GNU/Linux
 2x 2.4GHz Xeon w/HT, 1GB RAM
 PostgreSQL 7.4




Re: odd jdbc driver synchronization issue

From
"George Lessmann"
Date:
Kris,

Thanks for the test program. It helped a lot, and I believe I've found
my problem. The main issue seems to be the transaction isolation and
locking behavior of PostgreSQL as compared to MSSQL Server, and the way
I was writing my code (specifically, commits) between all my different
tests. My workload uses two stored procedures; each one issues a few
selects, possibly updates, and then an insert. The update branch is what
caused the issues, as a table could be concurrently updated by any
number of clients (basically, something like balance = balance +
amount).

Here's my work section, modeled after your example:

    public void execute(int num) throws SQLException {
        for (int i=0; i<num; i++) {

            parentCall.execute();
            // #1
            for (int j=0; j<9; j++) {
                childCall.execute();
                // #1
            }
            // #2
        }
        // #3
        conn.commit();
    }

#1 a conn.commit() placed here allowed multiple clients/postmasters to
work. No postmaster process went above 8% cpu.
#2 a conn.commit() placed here allowed multiple clients/postmasters to
work. Faster than #1, naturally. No postmaster process went above 25%
cpu.
#3 a conn.commit() placed here will serialize all clients/postmasters
(if you have updates in your functions). 1 postmaster process always at
99% cpu. Although it finished with the highest throughput, it's not the
correct isolation level for the workload.


Here's some comparison data for those who are interested,

Postgres    #1    39.38 tps
Postgres    #2    108.50 tps
Postgres    #3    563.76 tps (near pgAdmin levels?)
MSSQL        #1    389.81 tps
MSSQL        #2    459.18 tps
MSSQL        #3    767.23 tps

Thanks to Dave, Oliver, and Kris for the help.

George

-----Original Message-----
From: pgsql-jdbc-owner@postgresql.org
[mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Kris Jurka
Sent: Monday, January 05, 2004 4:42 PM
To: George Lessmann
Cc: pgsql-jdbc@postgresql.org
Subject: Re: [JDBC] odd jdbc driver synchronization issue



On Mon, 5 Jan 2004, George Lessmann wrote:

>
> The original design of my test application allocated one Connection
and
> two CallableStatements per client (one for the parent insert and one
for
> the child insert). As I reported earlier, each client would then be
> serialized, even though they each existed in their own
> thread/connection/callablestatement.
>
> I then refactored the application to allocate a connection per client,
> and prepareCall() before every execute(). This change resulted in each
> client being able to concurrently access the database, even though, in
> my opinion, it violates the whole reason behind a prepareCall() which
is
> to reuse it often without recompilation.

I've attached a program I wrote to try and verify your claims.  I see no
serialization despite a single prepareCall and many executes.  This
program runs with

java CallTest <num clients> <num runs per client>

each client does a loop over numRuns, calling a SELECT stored
procedure and then inserts that value into a table via another stored
procedure.

Let me know if you see a problem with this test or how your application
differs.

Kris Jurka