Thread: jdbc batch performance problem

jdbc batch performance problem

From
pginfo
Date:
Hi,

I am using pg + jdbc for j2ee development from a long time.
The main problem I found is the very bad insert performance.
I am using addBatch and executeBatch for minimum network trafik,
but it is also very slow.
For example if I try to insert 1000 rows ( in one transaction) into very
simple table
it takes ~ 3000 ms. If i use insert into myTable select ... it takes
only 10 ms.

Exist any basic performance problem for pg jdbc or I make mistake?
Also I testet the same example on oracle 8i and mysql.
It is working much much better.
I love pg but it will be great if I can make insert faster.

Many thanks,
ivan.


Re: jdbc batch performance problem

From
Dave Cramer
Date:
Hi,

You can try dropping the index before doing the inserts and rebuilding
it afterwards, that may help.
On Sat, 2003-07-19 at 02:52, pginfo wrote:
> Hi,
>
> I am using pg + jdbc for j2ee development from a long time.
> The main problem I found is the very bad insert performance.
> I am using addBatch and executeBatch for minimum network trafik,
> but it is also very slow.
> For example if I try to insert 1000 rows ( in one transaction) into very
> simple table
> it takes ~ 3000 ms. If i use insert into myTable select ... it takes
> only 10 ms.
this is not comparing apples to apples, the insert into ... does
everything on the backend, jdbc has to get the data to the server.
>
> Exist any basic performance problem for pg jdbc or I make mistake?
> Also I testet the same example on oracle 8i and mysql.

How was this test done? For instance did you use transactions with
mysql? did you use their jdbc interface, was the connection to the
database the same speed in each case?


> It is working much much better.
> I love pg but it will be great if I can make insert faster.
>
> Many thanks,
> ivan.
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>
--
Dave Cramer <Dave@micro-automation.net>


Re: jdbc batch performance problem

From
pginfo
Date:
Hi Dave,
 thanks for the answer.

Dave Cramer wrote:

> Hi,
>
> You can try dropping the index before doing the inserts and rebuilding
> it afterwards, that may help.

I make this test. It is not better.As I wrote if I use insert into mytable
select ... (with index) the time is 10 ms.
Of this reason I think the problem is only by jdbc implementation, not by data
base.

> On Sat, 2003-07-19 at 02:52, pginfo wrote:
> > Hi,
> >
> > I am using pg + jdbc for j2ee development from a long time.
> > The main problem I found is the very bad insert performance.
> > I am using addBatch and executeBatch for minimum network trafik,
> > but it is also very slow.
> > For example if I try to insert 1000 rows ( in one transaction) into very
> > simple table
> > it takes ~ 3000 ms. If i use insert into myTable select ... it takes
> > only 10 ms.
> this is not comparing apples to apples, the insert into ... does
> everything on the backend, jdbc has to get the data to the server.
> >
> > Exist any basic performance problem for pg jdbc or I make mistake?
> > Also I testet the same example on oracle 8i and mysql.
>
> How was this test done? For instance did you use transactions with
> mysql? did you use their jdbc interface, was the connection to the
> database the same speed in each case?
>

I am not mysql expert. I used the standart setup (with transactions). All the
tests I make on the same computer:dual PIII 1 GHz, 4 GB RAM, linux 7.3.
I make this tests only to be sure that the problem is not into JVM or my app.
server.
I expected to reach 2000 (or 3000) inserts/sec. This is the result by oracle
and mysql.
With pg I got ~ 300 inserts/sec.
It is 10 time slower and I do not think it is normal.
I know pg relativ good and think that th db performance is aprox the same as
oracle.
And for me the problem is into the jdbc driver.

Any idea?
ivan.

> > It is working much much better.
> > I love pg but it will be great if I can make insert faster.
> >
> > Many thanks,
> > ivan.
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 8: explain analyze is your friend
> >
> --
> Dave Cramer <Dave@micro-automation.net>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)




Re: jdbc batch performance problem

From
Paul Thomas
Date:
On 19/07/2003 07:52 pginfo wrote:
> Hi,
>
> I am using pg + jdbc for j2ee development from a long time.
> The main problem I found is the very bad insert performance.
> I am using addBatch and executeBatch for minimum network trafik,
> but it is also very slow.
> For example if I try to insert 1000 rows ( in one transaction) into very
> simple table
> it takes ~ 3000 ms. If i use insert into myTable select ... it takes
> only 10 ms.
>
> Exist any basic performance problem for pg jdbc or I make mistake?
> Also I testet the same example on oracle 8i and mysql.
> It is working much much better.
> I love pg but it will be great if I can make insert faster.
>

You don't say which version of PG you're using or what your hardware/OS
setup is. Anyway, I've been doing some testing of my own as well doing
some digging into the JDBC source. Here is my set up

1GZ Athlon
512MB RAm
30MB ATA66 IDE drive
RedHat Linux 7.2
PostgreSQL 7.3.3
postgresql.conf changes:
   shared)buffer = 1000
   max_connections = 100
   sort_memory = 1024

My test program generates approx 1.2 million records into a table with a
single index. I didn't use addBatch()/executeBatch() but tried variations
of Statement and PreparedStatement in a transaction, commiting
approximately every 100 records. If you look at the source you will see
that executeBatch performs a series of exeucte()'s - one for each
statement added with addBatch() so I don't think there would be any less
network traffic compared to executing each query individually. The results
I got were as follows

1) Using Statement: 290 inserts per second

2) Using PreparedStatement: 420 inserts per second

3) Using PreparedStatement with server prepared statement: 1030 inserts
per second

HTH

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller
Business |
| Computer Consultants         |
http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+

Re: jdbc batch performance problem

From
pginfo
Date:
Hi,
my configuration:
dual PIII 1 GHz, 4 GB RAM, linux red hat 7.3,
40 GB HDD ATA100,
PG 7.3.1
max_connections 256
shared_buffers = 512
sort_mem = 10000

I tested before some time the executeUpdate and executeBatch and found big
diference
by performance.

If as you say it do not exist any diference I think it will be possible to improve

executeBatch.
For example for it's implementation to use the pg copy command.
Exist at the moment any work for improving the batch commands?

regards,
ivan.

Paul Thomas wrote:

> On 19/07/2003 07:52 pginfo wrote:
> > Hi,
> >
> > I am using pg + jdbc for j2ee development from a long time.
> > The main problem I found is the very bad insert performance.
> > I am using addBatch and executeBatch for minimum network trafik,
> > but it is also very slow.
> > For example if I try to insert 1000 rows ( in one transaction) into very
> > simple table
> > it takes ~ 3000 ms. If i use insert into myTable select ... it takes
> > only 10 ms.
> >
> > Exist any basic performance problem for pg jdbc or I make mistake?
> > Also I testet the same example on oracle 8i and mysql.
> > It is working much much better.
> > I love pg but it will be great if I can make insert faster.
> >
>
> You don't say which version of PG you're using or what your hardware/OS
> setup is. Anyway, I've been doing some testing of my own as well doing
> some digging into the JDBC source. Here is my set up
>
> 1GZ Athlon
> 512MB RAm
> 30MB ATA66 IDE drive
> RedHat Linux 7.2
> PostgreSQL 7.3.3
> postgresql.conf changes:
>    shared)buffer = 1000
>    max_connections = 100
>    sort_memory = 1024
>
> My test program generates approx 1.2 million records into a table with a
> single index. I didn't use addBatch()/executeBatch() but tried variations
> of Statement and PreparedStatement in a transaction, commiting
> approximately every 100 records. If you look at the source you will see
> that executeBatch performs a series of exeucte()'s - one for each
> statement added with addBatch() so I don't think there would be any less
> network traffic compared to executing each query individually. The results
> I got were as follows
>
> 1) Using Statement: 290 inserts per second
>
> 2) Using PreparedStatement: 420 inserts per second
>
> 3) Using PreparedStatement with server prepared statement: 1030 inserts
> per second
>
> HTH
>
> --
> Paul Thomas
> +------------------------------+---------------------------------------------+
> | Thomas Micro Systems Limited | Software Solutions for the Smaller
> Business |
> | Computer Consultants         |
> http://www.thomas-micro-systems-ltd.co.uk   |
> +------------------------------+---------------------------------------------+
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)




Re: jdbc batch performance problem

From
Kris Jurka
Date:

> If as you say it do not exist any diference I think it will be
> possible to improve executeBatch.
> For example for it's implementation to use the pg copy command.
> Exist at the moment any work for improving the batch commands?
>

To use COPY instead of insert during executeBatch would require some
complicated parsing of the statement to determine if it was really a
simple insert that could be translated into COPY.  If your statement had a
function call or a subselect it could not use COPY.  Also I'm not sure
what translating an INSERT into a COPY would do if you we're inserting
into a table that had rules on it.

There was a patch around that provided an API to use COPY explicitly which
might suit your needs.

Kris Jurka


Re: jdbc batch performance problem

From
Barry Lind
Date:
Currently there is no difference between inserts done serially or done
via the batch interface.  In both cases there is one round trip per
insert.  This is an area of the code that would welcome contributions.

--Barry

pginfo wrote:
> Hi,
>
> I am using pg + jdbc for j2ee development from a long time.
> The main problem I found is the very bad insert performance.
> I am using addBatch and executeBatch for minimum network trafik,
> but it is also very slow.
> For example if I try to insert 1000 rows ( in one transaction) into very
> simple table
> it takes ~ 3000 ms. If i use insert into myTable select ... it takes
> only 10 ms.
>
> Exist any basic performance problem for pg jdbc or I make mistake?
> Also I testet the same example on oracle 8i and mysql.
> It is working much much better.
> I love pg but it will be great if I can make insert faster.
>
> Many thanks,
> ivan.
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>



Re: jdbc batch performance problem

From
pginfo
Date:
Hi,
Ok it is clear.
I will try to found a little time for this problem into the near future.

I am very familiar with pg internal and if some one have made tests with "C"
inserts it will
be great to send some info.
For example how many records can I insert with "C" code.
In this case we will know what to expect as performance.
As I wrote the performance by mysql and oracle is realy great and I am ready
to give my time for improving pg.

regards,
ivan.

Barry Lind wrote:

> Currently there is no difference between inserts done serially or done
> via the batch interface.  In both cases there is one round trip per
> insert.  This is an area of the code that would welcome contributions.
>
> --Barry
>
> pginfo wrote:
> > Hi,
> >
> > I am using pg + jdbc for j2ee development from a long time.
> > The main problem I found is the very bad insert performance.
> > I am using addBatch and executeBatch for minimum network trafik,
> > but it is also very slow.
> > For example if I try to insert 1000 rows ( in one transaction) into very
> > simple table
> > it takes ~ 3000 ms. If i use insert into myTable select ... it takes
> > only 10 ms.
> >
> > Exist any basic performance problem for pg jdbc or I make mistake?
> > Also I testet the same example on oracle 8i and mysql.
> > It is working much much better.
> > I love pg but it will be great if I can make insert faster.
> >
> > Many thanks,
> > ivan.
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 8: explain analyze is your friend
> >
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html