Thread: Re: [QUESTIONS] JDBC updates are slow

Re: [QUESTIONS] JDBC updates are slow

From: Peter T Mount
On Fri, 9 Jan 1998, Justin Wells wrote:
> Hi,
>
> I'm using JDBC to store persistent data. It seems to be fast enough
> loading data from the database, but it is too slow writing it back.
>
> The difference seems to be that I can get all the data from the DB in
> one or two result sets when I access it, but I am required to make a
> separate connection to the DB for each tuple I want to add or update.

You shouldn't be opening a new connection just to update a tuple. As long
as you don't transfer more than one thing at a time between your app and
the backend, all should be fine.
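
For example, something like this opens one connection and keeps it for the
life of the app (URL, user and password are only placeholders, and assumes
"import java.sql.*;" at the top of the file):

  // Load the driver once, then reuse a single connection for everything.
  Class.forName("postgresql.Driver");
  Connection db = DriverManager.getConnection(
      "jdbc:postgresql://localhost/mydb", "user", "password");

  // ... all queries and updates go through 'db' ...

  db.close();    // only when the application shuts down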

> I'm using PreparedStatement, but that doesn't change things.

> What I would really like to do is submit a whole batch of updates in a
> single request, to cut the database access time. I'm adding them all to
> the same table, so I'm hoping to do something like "insert into ... from mytable"
> where mytable is a set of tuples local to my application.
>
> Is there any way to do this? I didn't see anything like this in the JDBC
> documentation. I'm really suffering for performance over this, does anyone
> have any ideas?

In JDBC, there is no real way of doing batch updates.

If your table has indices, you could try wrapping the updates within a
transaction:

  myconnection.setAutoCommit(false);    // Start using transactions
  ... all updates here
  myconnection.commit();        // Commit the updates

  ... later, when transactions are no longer required
  myconnection.setAutoCommit(true);    // Stop using transactions
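
Putting that together with a PreparedStatement, a whole batch of inserts on
a single connection looks roughly like this ('mytable', its columns, and
the 'rows' array are placeholders for whatever your app actually has):

  PreparedStatement ps = myconnection.prepareStatement(
      "insert into mytable (a, b, c, d) values (?, ?, ?, ?)");

  myconnection.setAutoCommit(false);    // Start using transactions
  for (int i = 0; i < rows.length; i++) {
      ps.setInt(1, rows[i].a);          // Bind this tuple's values
      ps.setInt(2, rows[i].b);
      ps.setInt(3, rows[i].c);
      ps.setInt(4, rows[i].d);
      ps.executeUpdate();               // Runs inside the open transaction
  }
  myconnection.commit();                // Write them all out at once
  ps.close();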

> For the time being my workaround is to cache all the data in the application
> and have a separate thread post the updates. I am unhappy about this because
> it defeats proper locking and makes for really dirty reads.

Thread safety is on the cards for the driver, but it won't be there for
6.3 (I'm trying to get as much of the api implemented as possible).

Having said that, there are very few JDBC drivers out there (for any
database) that are thread safe. On the JDBC list, it was recommended that
any program wanting to use multiple drivers should use them from one
thread only.

> Any ideas would be appreciated!

As a non-JDBC extension to the driver, I'm looking at providing a way to
use postgresql's copy method for large updates to tables.

Also, you say you are using the database to store persistent data. Is this
data that could be stored as a Serializable object? If so, you could have
a few columns in the table hold key data for searching, and an oid to
point to a Large Object holding the serialized object. This now works in
6.3.
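
The serialization half of that is plain java.io; a rough sketch, with the
driver-specific Large Object calls left out and MyData being an invented
example class:

  import java.io.*;

  // Any class implementing Serializable can be flattened to bytes.
  class MyData implements Serializable {
      int key;          // would also be stored in a searchable column
      String payload;
  }

  // Flatten an object to a byte array; those bytes would then be written
  // into a Large Object, whose oid is stored in the table row.
  byte[] flatten(Object obj) throws IOException {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      ObjectOutputStream out = new ObjectOutputStream(buf);
      out.writeObject(obj);
      out.close();
      return buf.toByteArray();
  }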

--
Peter T Mount  petermount@earthling.net or pmount@maidast.demon.co.uk
Main Homepage: http://www.demon.co.uk/finder
Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk


Re: [QUESTIONS] JDBC updates are slow

From: Justin Wells
> You shouldn't be opening a new connection just to update a tuple. As long
> as you don't transfer more than one thing at a time between your app and
> the backend, all should be fine.

You're right. What I think is happening is that the per-statement overhead,
plus the fact that the DB performs a separate write operation for each
tuple, adds up to too much time.

> If your table has indices, you could try wrapping the updates within a
> transaction:

I tried this, and it did shave a bit of time off the operation, but not
very much. I also tried writing the data to a new, unindexed empty table
with the idea that if I could get it in there efficiently, I could
subsequently merge that table into the main table. I also simplified the
data I was posting down to four int4's -- but doing all that, I still didn't
get very good performance out of it.
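
For reference, the merge step I mean is just an INSERT ... SELECT issued
through a plain Statement (both table names are placeholders):

  // Load into the unindexed staging table first, then merge in one shot.
  Statement st = myconnection.createStatement();
  st.executeUpdate("insert into main_table select * from staging_table");
  st.executeUpdate("delete from staging_table");    // Clear for next batch
  st.close();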

I'm going to try a few different DBs, and in the case of postgres I'm
going to upgrade to the latest version and try it on a few different
platforms.

> > For the time being my workaround is to cache all the data in the application
> > and have a separate thread post the updates. I am unhappy about this because
> > it defeats proper locking and makes for really dirty reads.
>
> Thread safety is on the cards for the driver, but it won't be there for
> 6.3 (I'm trying to get as much of the api implemented as possible).

I didn't realize it wasn't there. What I'll do is lock on some other object
so that only one thread can access it. I'll let a lower priority background
thread grab the lock and post my updates while nothing is going on, and I'll
arrange for it to yield frequently.
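
In outline it looks like this (the lock object and helper method are
invented names; every JDBC call goes inside a synchronized block on the
same lock):

  final Object dbLock = new Object();   // One lock guards all DB access

  Thread writer = new Thread(new Runnable() {
      public void run() {
          while (true) {
              synchronized (dbLock) {
                  postOneUpdate();      // invented helper: flush one queued update
              }
              Thread.yield();           // let foreground threads run
          }
      }
  });
  writer.setPriority(Thread.MIN_PRIORITY);  // Post updates when idle
  writer.start();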

> As a non-JDBC extension to the driver, I'm looking at providing a way to
> use postgresql's copy method for large updates to tables.

That would be great. Since there is no standard JDBC way of doing it, I'd
be glad to have driver-dependent solutions. I would just rewrite that
section for each DB I wanted to support, and breathe a big sigh of relief
that it worked :)

My background-thread workaround improves apparent performance, but obviously
it can't take a sustained data feed.

> Also, you say you are using the database to store persistent data. Is this
> data that could be stored as a Serializable object? If so, you could have
> a few columns in the table hold key data for searching, and an oid to
> point to a Large Object holding the serialized object. This now works in
> 6.3.

I'm still thinking about this one. Right now I have a lot of small objects
but it's possible I could dream up a way to combine them into bigger objects
where this would make more sense. I'd have to give up on the ability to
search on some of them, though. Still, that might work. (On average I have
objects made up of four integers and a string, sometimes three integers,
sometimes two strings.)

Justin