Thread: odd jdbc driver synchronization issue

odd jdbc driver synchronization issue

From
"George Lessmann"
Date:
Hi,

I'm trying to move my database platform from MSSQL Server to PostgreSQL.
I have a load simulation program I wrote which I'm running against a
test PostgreSQL instance. The application creates a number of client
threads, each with their own database connection, and turns them loose
on a workload. Currently, it's lagging the SQLServer test. I'm still
tuning the postgresql.conf file, but I also noticed something very odd
wrt the postmaster process. I see something like the following:

3 threads started, 0 done,
    pid 1        99%
    pid 2        0%
    pid 3        0%

2 threads started, 1 done,
    pid 1        0%
    pid 2        99%
    pid 3        0%

1 threads started, 2 done,
    pid 1        0%
    pid 2        0%
    pid 3        99%

Even though all the treads are runnable, with their own connection, and
their own postmaster process on the DB, they are still serializing their
writes. If I look in the table where the data is being generated, I see
rows inserted in order similar to what top is showing (Client3's data,
then Client6's data, then Client1's data, etc).

As this application is written in java and using the PostgreSQL jdbc
driver I'm starting here, if you feel I should post elsewhere, let me
know.

Thanks,

George

--Details--

Client machine:
 Windows XP, Java 1.4.2 (build 1.4.2-b28)
 PostgreSQL JDBC driver 7.4, 7.4.1 (no difference)

Database machine:
 Debian Linux 2.6.0-test11 #4 SMP i686 GNU/Linux
 2x 2.4GHz Xeon w/HT, 1GB RAM
 PostgreSQL 7.4

Re: odd jdbc driver synchronization issue

From
Dave Cramer
Date:
One thing that would help is the code?

Dave
On Wed, 2003-12-31 at 15:06, George Lessmann wrote:
> Hi,
>
> I'm trying to move my database platform from MSSQL Server to PostgreSQL.
> I have a load simulation program I wrote which I'm running against a
> test PostgreSQL instance. The application creates a number of client
> threads, each with their own database connection, and turns them loose
> on a workload. Currently, it's lagging the SQLServer test. I'm still
> tuning the postgresql.conf file, but I also noticed something very odd
> wrt the postmaster process. I see something like the following:
>
> 3 threads started, 0 done,
>     pid 1        99%
>     pid 2        0%
>     pid 3        0%
>
> 2 threads started, 1 done,
>     pid 1        0%
>     pid 2        99%
>     pid 3        0%
>
> 1 threads started, 2 done,
>     pid 1        0%
>     pid 2        0%
>     pid 3        99%
>
> Even though all the treads are runnable, with their own connection, and
> their own postmaster process on the DB, they are still serializing their
> writes. If I look in the table where the data is being generated, I see
> rows inserted in order similar to what top is showing (Client3's data,
> then Client6's data, then Client1's data, etc).
>
> As this application is written in java and using the PostgreSQL jdbc
> driver I'm starting here, if you feel I should post elsewhere, let me
> know.
>
> Thanks,
>
> George
>
> --Details--
>
> Client machine:
>  Windows XP, Java 1.4.2 (build 1.4.2-b28)
>  PostgreSQL JDBC driver 7.4, 7.4.1 (no difference)
>
> Database machine:
>  Debian Linux 2.6.0-test11 #4 SMP i686 GNU/Linux
>  2x 2.4GHz Xeon w/HT, 1GB RAM
>  PostgreSQL 7.4
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
>
--
Dave Cramer
519 939 0336
ICQ # 1467551


Re: odd jdbc driver synchronization issue

From
"George Lessmann"
Date:
Dave, all,

Sorry, I thought this was such an obvious thing to do; it would be some
setting or something I missed (does the jdbc driver have any debug
options?). Some of the code is proprietary. Hopefully, this will give
you an idea of what is happening. Basically, create a number of clients,
load them with parents, and then run() them. While running, a client
inserts one parent and a number of children.


package jdbhammer;
/**
 * @author George Lessmann
 * copyright 2003, George Lessmann
 * code may not be reused without permission
  */
import java.util.ArrayList;

import org.apache.commons.dbcp.DriverManagerConnectionFactory;
import org.apache.commons.dbcp.PoolableConnectionFactory;
import org.apache.commons.dbcp.PoolingDriver;
import org.apache.commons.pool.impl.GenericObjectPool;

public final class Main {
    private static String resultsLocation = null;

    private static  String username = "postgres";
    private static  String password = "";

    public static void main(String[] args) {

        if (args == null | "?".equals(args[0])) {
            System.out.println("Usage: java -jar
jDBHammer.jar [#clients] [#dbconnections] [#parents] [#children per
parent] [server] [username] [password]");
            return;
        }// if

        if (args.length >= 5) {
            server = args[4];
        }// if

        if (args.length >= 6) {
            username = args[5];
        }// if

        if (args.length >= 7) {
            password = args[6];
        }// if

        resultsLocation =
"c:\\jdbhammer"+"-"+args[0]+"-"+args[1]+"-"+args[2]+"-"+args[3]+".txt";

        System.out.println("Creating DB Pool.");


        try {
            Class.forName("org.postgresql.Driver");
        } catch (Exception e) {
            System.out.println("Error! Unable to load JDBC
Driver! Error was:" + e.getMessage());
            return;
        }// try

        // create the pool
        GenericObjectPool connectionPool = new
GenericObjectPool(null, Integer.parseInt(args[1]),
GenericObjectPool.WHEN_EXHAUSTED_GROW, 0);
        DriverManagerConnectionFactory connectionFactory = new
DriverManagerConnectionFactory("jdbc:postgresql://theserver/thedb",
username, password);

        try {
            PoolableConnectionFactory
poolableConnectionFactory = new
PoolableConnectionFactory(connectionFactory,connectionPool,null,null,fal
se,true);
        } catch (Exception e) {
            System.out.println("Error! Unable to create
PoolableConnectionFactory! Error was:" + e.getMessage());
            return;
        }// try

        PoolingDriver driver = new PoolingDriver();
        driver.registerPool("jdbhammer",connectionPool);

        ArrayList clients = new ArrayList();
        DbClient client = null;

        System.out.println("Creating Clients.");

        // load them
        for (int i=0, j=Integer.parseInt(args[0]); i < j; i++) {

            client = new DbClient("DbClient" +
Integer.toString(i), resultsLocation);
            client.loadParents(Integer.parseInt(args[2]),
Integer.parseInt(args[3]));
            System.out.println("Creating Clients: "
+"DbClient" + Integer.toString(i));
            clients.add(client);
        }// for

        System.out.println("Starting Threads.");
        Thread[] t = new Thread[clients.size()];

        for (int i=0, j=clients.size(); i < j; i++) {
            client = (DbClient) clients.get(i);
            t[i] = new Thread(client, client.getClientId());
            t[i].start();
        }// for

    }// main

}// Main

// DBClient
    private ArrayList workload = new ArrayList(10000);
    private String clientId = null;
    private long startTime = 0L;
    private long endTime = 0L;
    private boolean isComplete = Boolean.FALSE.booleanValue();

    private static final String procInsertParent = "{ call
proc_insert_parent( ?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?
) }";
    private static final String procInsertChild = "{ call
proc_insert_child( ?,?,?,?,?,?,?,?,?,?,?,?,? ) }";
    private String resultsLocation = null;

    private Connection conn = null;
    private CallableStatement parentCall = null;
    private CallableStatement childCall = null;

    public DbClient(String clientId, String resultsLocation) {
        this.clientId = clientId;
        this.resultsLocation = resultsLocation;

        try {
            System.out.println("grabbing connection");
            conn =
DriverManager.getConnection("jdbc:apache:commons:dbcp:jdbhammer");
            conn.setAutoCommit(false);
            parentCall = conn.prepareCall(procInsertParent);
            childCall = conn.prepareCall(procInsertChild);
        } catch (SQLException sql) {
            System.out.println("Error! There was an error
creating the connection. Error was: " + sql.getMessage());
        } // try
    } // DbClient

    public final void run() {

        Parent p = null;
        Child c = null;
        ArrayList children = null;

        startTime = System.currentTimeMillis();
        try {
            for (int i = 0, j = workload.size(); i < j; i++)
{
                p = (Parent) workload.get(i);

                try {

p.setStartTime(System.currentTimeMillis());
                    p.insert(parentCall);

p.setEndTime(System.currentTimeMillis());
                } catch (SQLException sql) {
                    System.out.println("Error! " +
sql.getMessage());
                    continue; // skip the children
                } // try

                children = p.getChildren();

                for (int k = 0, m = children.size(); k <
m; k++) {
                    child = (Child) children.get(k);

                    try {

child.setStartTime(System.currentTimeMillis());
                        child.insert(childCall);

child.setEndTime(System.currentTimeMillis());
                    } catch (SQLException sql) {

System.out.println("Error! " + sql.getMessage());
                        // do nothing, keep on
running
                    } // try
                } // for

                if ((i + 1 == j)) {
                    conn.commit();
                } // if

            } // for
        } catch (SQLException sql) {
            System.out.println("Error! " +
sql.getMessage());
            try {
                conn.rollback();
            } catch (SQLException sqle) {
            }
        } finally {
            try {
                parentCall.close();
                childCall.close();
                conn.close();
            } catch (SQLException sql) {
            }
        }
        endTime = System.currentTimeMillis();
        isComplete = Boolean.TRUE.booleanValue();

        // done
        System.out.println(Thread.currentThread().getName() + "
finished!");

        writeOutput();
    } // run

The parent and child insert() method sets a number of parameters one
each callable statement, and then execute() a function on the PostgreSQL
server, resulting in a row being inserted into a table.

Let me know what other information I can provide,

George

-----Original Message-----
From: Dave Cramer [mailto:pg@fastcrypt.com]
Sent: Wednesday, December 31, 2003 2:28 PM
To: George Lessmann
Cc: pgsql-jdbc@postgresql.org
Subject: Re: [JDBC] odd jdbc driver synchronization issue

One thing that would help is the code?

Dave
On Wed, 2003-12-31 at 15:06, George Lessmann wrote:
> Hi,
>
> I'm trying to move my database platform from MSSQL Server to
PostgreSQL.
> I have a load simulation program I wrote which I'm running against a
> test PostgreSQL instance. The application creates a number of client
> threads, each with their own database connection, and turns them loose
> on a workload. Currently, it's lagging the SQLServer test. I'm still
> tuning the postgresql.conf file, but I also noticed something very odd
> wrt the postmaster process. I see something like the following:
>
> 3 threads started, 0 done,
>     pid 1        99%
>     pid 2        0%
>     pid 3        0%
>
> 2 threads started, 1 done,
>     pid 1        0%
>     pid 2        99%
>     pid 3        0%
>
> 1 threads started, 2 done,
>     pid 1        0%
>     pid 2        0%
>     pid 3        99%
>
> Even though all the treads are runnable, with their own connection,
and
> their own postmaster process on the DB, they are still serializing
their
> writes. If I look in the table where the data is being generated, I
see
> rows inserted in order similar to what top is showing (Client3's data,
> then Client6's data, then Client1's data, etc).
>
> As this application is written in java and using the PostgreSQL jdbc
> driver I'm starting here, if you feel I should post elsewhere, let me
> know.
>
> Thanks,
>
> George
>
> --Details--
>
> Client machine:
>  Windows XP, Java 1.4.2 (build 1.4.2-b28)
>  PostgreSQL JDBC driver 7.4, 7.4.1 (no difference)
>
> Database machine:
>  Debian Linux 2.6.0-test11 #4 SMP i686 GNU/Linux
>  2x 2.4GHz Xeon w/HT, 1GB RAM
>  PostgreSQL 7.4
>
> ---------------------------(end of
broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that
your
>       message can get through to the mailing list cleanly
>
--
Dave Cramer
519 939 0336
ICQ # 1467551


Re: odd jdbc driver synchronization issue

From
Oliver Jowett
Date:
George Lessmann wrote:

> Dave, all,
>
> Sorry, I thought this was such an obvious thing to do; it would be some
> setting or something I missed (does the jdbc driver have any debug
> options?).

I have multiple connections from a single JVM accessing the same DB
concurrently with no problems; it's likely to be a quirk of the exact
query you run or how you're using the JDBC API.

 > Some of the code is proprietary. Hopefully, this will give
> you an idea of what is happening. Basically, create a number of clients,
> load them with parents, and then run() them. While running, a client
> inserts one parent and a number of children.

Ideally we need a selfcontained, compilable, testcase that demonstrates
the problem. Otherwise it's just guesswork.

Some things you could try to narrow down the problem:

  - Run a couple of instances of your test in separate JVMs, see if you
get any concurrency between them.

  - Get thread dumps from the JVM while your test is running (under
solaris & linux sending SIGQUIT to the JVM does this) and see where your
client threads are blocked.

  - Turn on statement logging (in postgresql.conf) and see exactly when
the queries are being submitted.

  - Run the same queries by hand via psql and see if you get the same
behaviour.

-O


Re: odd jdbc driver synchronization issue

From
"George Lessmann"
Date:
Oliver, all,

The original design of my test application allocated one Connection and
two CallableStatements per client (one for the parent insert and one for
the child insert). As I reported earlier, each client would then be
serialized, even though they each existed in their own
thread/connection/callablestatement.

I then refactored the application to allocate a connection per client,
and prepareCall() before every execute(). This change resulted in each
client being able to concurrently access the database, even though, in
my opinion, it violates the whole reason behind a prepareCall() which is
to reuse it often without recompilation.

So, now I have a test application which I can run against my test
Postgres server and my dev MSSQL Server box. With 5 client connections,
each inserting 1100 rows from 1100 function calls, the Postgres/Java
test is generating around 42 executions/second. MSSQL Server/Java is
generating around 368 executions/second. I then created a sql file with
the 1100 function calls, and it ran at about 580 executions/second in a
pgAdmin III query window. That's where I would like the Postgres/JDBC
test to be at, but obviously something is amiss with the Postgres JDBC
driver. I will work on getting this test application shareable, but does
anyone else have a cross-db loadtest app using JDBC?

Any help would be appreciated; I'm off to look at the ODBC driver...

George

ps--

The Postgres test DB server is a 2x2.4GHz Xeon, 1 GB ram, 10k SCSI
mirror, debian 2.6.0-test11
The MSSQL DB server is a 1x2.8Ghz Celeron, 1 GB ram, 7200RPM IDE mirror,
ms2k adv srvr

-----Original Message-----
From: pgsql-jdbc-owner@postgresql.org
[mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Oliver Jowett
Sent: Wednesday, December 31, 2003 6:58 PM
To: George Lessmann
Cc: pg@fastcrypt.com; pgsql-jdbc@postgresql.org
Subject: Re: [JDBC] odd jdbc driver synchronization issue

George Lessmann wrote:

> Dave, all,
>
> Sorry, I thought this was such an obvious thing to do; it would be
some
> setting or something I missed (does the jdbc driver have any debug
> options?).

I have multiple connections from a single JVM accessing the same DB
concurrently with no problems; it's likely to be a quirk of the exact
query you run or how you're using the JDBC API.

 > Some of the code is proprietary. Hopefully, this will give
> you an idea of what is happening. Basically, create a number of
clients,
> load them with parents, and then run() them. While running, a client
> inserts one parent and a number of children.

Ideally we need a selfcontained, compilable, testcase that demonstrates
the problem. Otherwise it's just guesswork.

Some things you could try to narrow down the problem:

  - Run a couple of instances of your test in separate JVMs, see if you
get any concurrency between them.

  - Get thread dumps from the JVM while your test is running (under
solaris & linux sending SIGQUIT to the JVM does this) and see where your

client threads are blocked.

  - Turn on statement logging (in postgresql.conf) and see exactly when
the queries are being submitted.

  - Run the same queries by hand via psql and see if you get the same
behaviour.

-O


---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

               http://archives.postgresql.org

Re: odd jdbc driver synchronization issue

From
Kris Jurka
Date:

On Mon, 5 Jan 2004, George Lessmann wrote:

>
> The original design of my test application allocated one Connection and
> two CallableStatements per client (one for the parent insert and one for
> the child insert). As I reported earlier, each client would then be
> serialized, even though they each existed in their own
> thread/connection/callablestatement.
>
> I then refactored the application to allocate a connection per client,
> and prepareCall() before every execute(). This change resulted in each
> client being able to concurrently access the database, even though, in
> my opinion, it violates the whole reason behind a prepareCall() which is
> to reuse it often without recompilation.

I've attached a program I wrote to try and verify your claims.  I see no
serialization despite a single prepareCall and many executes.  This
program runs with

java CallTest <num clients> <num runs per client>

each client does a loop over numRuns, calling a SELECT stored
procedure and then inserts that value into a table via another stored
procedure.

Let me know if you see a problem with this test or how your application
differs.

Kris Jurka

Attachment

Re: odd jdbc driver synchronization issue

From
Oliver Jowett
Date:
On Mon, Jan 05, 2004 at 03:01:16PM -0600, George Lessmann wrote:
> Oliver, all,
>
> The original design of my test application allocated one Connection and
> two CallableStatements per client (one for the parent insert and one for
> the child insert). As I reported earlier, each client would then be
> serialized, even though they each existed in their own
> thread/connection/callablestatement.

This shouldn't happen. We really need a standalone testcase to locate the
problem down, though.

> I then refactored the application to allocate a connection per client,
> and prepareCall() before every execute(). This change resulted in each
> client being able to concurrently access the database, even though, in
> my opinion, it violates the whole reason behind a prepareCall() which is
> to reuse it often without recompilation.

As you say, this shouldn't be necessary.

One possibility is that your original code was (accidentally)
using CallableStatements that originated from the same Connection for
different clients. In this case, execution of those statements will
serialize (as they contend for the same underlying physical connection).

If the CallableStatements are indeed from different Connections, I can't see
how they'd serialize on the Java side .. the two Connection objects are
entirely independent and don't hold references to each other at all.

IIRC you were using a connection pool -- is it possible the contention is
happening in the pool implementation? Some pool implementations provide
their own proxying Connection and *Statement implementations; perhaps the
pool's CallableStatement is causing the problem.

Did you try any of my suggestions below? If you can produce a thread dump
while some of the clients are blocked we can probably narrow the problem
down without a testcase.

> Ideally we need a selfcontained, compilable, testcase that demonstrates
> the problem. Otherwise it's just guesswork.
>
> Some things you could try to narrow down the problem:
>
>   - Run a couple of instances of your test in separate JVMs, see if you
> get any concurrency between them.
>
>   - Get thread dumps from the JVM while your test is running (under
> solaris & linux sending SIGQUIT to the JVM does this) and see where your
>
> client threads are blocked.
>
>   - Turn on statement logging (in postgresql.conf) and see exactly when
> the queries are being submitted.
>
>   - Run the same queries by hand via psql and see if you get the same
> behaviour.
>
> -O

-O

Re: odd jdbc driver synchronization issue

From
"George Lessmann"
Date:
Kris,

Thanks for the test program. It helped a lot, and I believe I've found
my problem. The main issue seems to be the transaction isolation and
locking behavior of PostgreSQL as compared to MSSQL Server, and the way
I was writing my code (specifically, commits) between all my different
tests. My workload uses two stored procedures; each one issues a few
selects, possibly updates, and then an insert. The update branch is what
caused the issues, as a table could be concurrently updated by any
number of clients (basically, something like balance = balance +
amount).

Here's my work section, modeled after your example:

    public void execute(int num) throws SQLException {
        for (int i=0; i<num; i++) {

            parentCall.execute();
            // #1
            for (int j=0; j<9; j++) {
                childCall.execute();
                // #1
            }
            // #2
        }
        // #3
        conn.commit();
    }

#1 a conn.commit() placed here allowed multiple clients/postmasters to
work. No postmaster process went above 8% cpu.
#2 a conn.commit() placed here allowed multiple clients/postmasters to
work. Faster than #1, naturally. No postmaster process went above 25%
cpu.
#3 a conn.commit() placed here will serialize all clients/postmasters
(if you have updates in your functions). 1 postmaster process always at
99% cpu. Although it finished with the highest throughput, it's not the
correct isolation level for the workload.

Here's some comparison data for those who are interested,

Postgres    #1    39.38 tps
Postgres    #2    108.50 tps
Postgres    #3    563.76 tps (near pgAdmin levels?)
MSSQL        #1    389.81 tps
MSSQL        #2    459.18 tps
MSSQL        #3    767.23 tps

Thanks to Dave, Oliver, and Kris for the help.

George

-----Original Message-----
From: pgsql-jdbc-owner@postgresql.org
[mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Kris Jurka
Sent: Monday, January 05, 2004 4:42 PM
To: George Lessmann
Cc: pgsql-jdbc@postgresql.org
Subject: Re: [JDBC] odd jdbc driver synchronization issue



On Mon, 5 Jan 2004, George Lessmann wrote:

>
> The original design of my test application allocated one Connection
and
> two CallableStatements per client (one for the parent insert and one
for
> the child insert). As I reported earlier, each client would then be
> serialized, even though they each existed in their own
> thread/connection/callablestatement.
>
> I then refactored the application to allocate a connection per client,
> and prepareCall() before every execute(). This change resulted in each
> client being able to concurrently access the database, even though, in
> my opinion, it violates the whole reason behind a prepareCall() which
is
> to reuse it often without recompilation.

I've attached a program I wrote to try and verify your claims.  I see no
serialization despite a single prepareCall and many executes.  This
program runs with

java CallTest <num clients> <num runs per client>

each client does a loop over numRuns, calling a SELECT stored
procedure and then inserts that value into a table via another stored
procedure.

Let me know if you see a problem with this test or how your application
differs.

Kris Jurka

Re: odd jdbc driver synchronization issue

From
Kris Jurka
Date:

> Here's my work section, modeled after your example:
>
>     public void execute(int num) throws SQLException {
>         for (int i=0; i<num; i++) {
>
>             parentCall.execute();
>             // #1
>             for (int j=0; j<9; j++) {
>                 childCall.execute();
>                 // #1
>             }
>             // #2
>         }
>         // #3
>         conn.commit();
>     }

In this loop are you using i or j to indicate what rows to operate on in
the stored procedure calls?  If so there will be huge contention as each
thread will try to do the exact same things.  Using clientNumber*num+i or
something like that would give a more realistic test if you're supposed to
be dividing up work between these threads.

Kris Jurka