Thread: jdbc batch performance problem
Hi, I am using pg + jdbc for j2ee development from a long time. The main problem I found is the very bad insert performance. I am using addBatch and executeBatch for minimum network trafik, but it is also very slow. For example if I try to insert 1000 rows ( in one transaction) into very simple table it takes ~ 3000 ms. If i use insert into myTable select ... it takes only 10 ms. Exist any basic performance problem for pg jdbc or I make mistake? Also I testet the same example on oracle 8i and mysql. It is working much much better. I love pg but it will be great if I can make insert faster. Many thanks, ivan.
Hi, You can try dropping the index before doing the inserts and rebuilding it afterwards, that may help. On Sat, 2003-07-19 at 02:52, pginfo wrote: > Hi, > > I am using pg + jdbc for j2ee development from a long time. > The main problem I found is the very bad insert performance. > I am using addBatch and executeBatch for minimum network trafik, > but it is also very slow. > For example if I try to insert 1000 rows ( in one transaction) into very > simple table > it takes ~ 3000 ms. If i use insert into myTable select ... it takes > only 10 ms. this is not comparing apples to apples, the insert into ... does everything on the backend, jdbc has to get the data to the server. > > Exist any basic performance problem for pg jdbc or I make mistake? > Also I testet the same example on oracle 8i and mysql. How was this test done? For instance did you use transactions with mysql? did you use their jdbc interface, was the connection to the database the same speed in each case? > It is working much much better. > I love pg but it will be great if I can make insert faster. > > Many thanks, > ivan. > > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend > -- Dave Cramer <Dave@micro-automation.net>
Hi Dave, thanks for the answer. Dave Cramer wrote: > Hi, > > You can try dropping the index before doing the inserts and rebuilding > it afterwards, that may help. I make this test. It is not better.As I wrote if I use insert into mytable select ... (with index) the time is 10 ms. Of this reason I think the problem is only by jdbc implementation, not by data base. > On Sat, 2003-07-19 at 02:52, pginfo wrote: > > Hi, > > > > I am using pg + jdbc for j2ee development from a long time. > > The main problem I found is the very bad insert performance. > > I am using addBatch and executeBatch for minimum network trafik, > > but it is also very slow. > > For example if I try to insert 1000 rows ( in one transaction) into very > > simple table > > it takes ~ 3000 ms. If i use insert into myTable select ... it takes > > only 10 ms. > this is not comparing apples to apples, the insert into ... does > everything on the backend, jdbc has to get the data to the server. > > > > Exist any basic performance problem for pg jdbc or I make mistake? > > Also I testet the same example on oracle 8i and mysql. > > How was this test done? For instance did you use transactions with > mysql? did you use their jdbc interface, was the connection to the > database the same speed in each case? > I am not mysql expert. I used the standart setup (with transactions). All the tests I make on the same computer:dual PIII 1 GHz, 4 GB RAM, linux 7.3. I make this tests only to be sure that the problem is not into JVM or my app. server. I expected to reach 2000 (or 3000) inserts/sec. This is the result by oracle and mysql. With pg I got ~ 300 inserts/sec. It is 10 time slower and I do not think it is normal. I know pg relativ good and think that th db performance is aprox the same as oracle. And for me the problem is into the jdbc driver. Any idea? ivan. > > It is working much much better. > > I love pg but it will be great if I can make insert faster. > > > > Many thanks, > > ivan. > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 8: explain analyze is your friend > > > -- > Dave Cramer <Dave@micro-automation.net> > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
On 19/07/2003 07:52 pginfo wrote: > Hi, > > I am using pg + jdbc for j2ee development from a long time. > The main problem I found is the very bad insert performance. > I am using addBatch and executeBatch for minimum network trafik, > but it is also very slow. > For example if I try to insert 1000 rows ( in one transaction) into very > simple table > it takes ~ 3000 ms. If i use insert into myTable select ... it takes > only 10 ms. > > Exist any basic performance problem for pg jdbc or I make mistake? > Also I testet the same example on oracle 8i and mysql. > It is working much much better. > I love pg but it will be great if I can make insert faster. > You don't say which version of PG you're using or what your hardware/OS setup is. Anyway, I've been doing some testing of my own as well doing some digging into the JDBC source. Here is my set up 1GZ Athlon 512MB RAm 30MB ATA66 IDE drive RedHat Linux 7.2 PostgreSQL 7.3.3 postgresql.conf changes: shared)buffer = 1000 max_connections = 100 sort_memory = 1024 My test program generates approx 1.2 million records into a table with a single index. I didn't use addBatch()/executeBatch() but tried variations of Statement and PreparedStatement in a transaction, commiting approximately every 100 records. If you look at the source you will see that executeBatch performs a series of exeucte()'s - one for each statement added with addBatch() so I don't think there would be any less network traffic compared to executing each query individually. The results I got were as follows 1) Using Statement: 290 inserts per second 2) Using PreparedStatement: 420 inserts per second 3) Using PreparedStatement with server prepared statement: 1030 inserts per second HTH -- Paul Thomas +------------------------------+---------------------------------------------+ | Thomas Micro Systems Limited | Software Solutions for the Smaller Business | | Computer Consultants | http://www.thomas-micro-systems-ltd.co.uk | +------------------------------+---------------------------------------------+
Hi, my configuration: dual PIII 1 GHz, 4 GB RAM, linux red hat 7.3, 40 GB HDD ATA100, PG 7.3.1 max_connections 256 shared_buffers = 512 sort_mem = 10000 I tested before some time the executeUpdate and executeBatch and found big diference by performance. If as you say it do not exist any diference I think it will be possible to improve executeBatch. For example for it's implementation to use the pg copy command. Exist at the moment any work for improving the batch commands? regards, ivan. Paul Thomas wrote: > On 19/07/2003 07:52 pginfo wrote: > > Hi, > > > > I am using pg + jdbc for j2ee development from a long time. > > The main problem I found is the very bad insert performance. > > I am using addBatch and executeBatch for minimum network trafik, > > but it is also very slow. > > For example if I try to insert 1000 rows ( in one transaction) into very > > simple table > > it takes ~ 3000 ms. If i use insert into myTable select ... it takes > > only 10 ms. > > > > Exist any basic performance problem for pg jdbc or I make mistake? > > Also I testet the same example on oracle 8i and mysql. > > It is working much much better. > > I love pg but it will be great if I can make insert faster. > > > > You don't say which version of PG you're using or what your hardware/OS > setup is. Anyway, I've been doing some testing of my own as well doing > some digging into the JDBC source. Here is my set up > > 1GZ Athlon > 512MB RAm > 30MB ATA66 IDE drive > RedHat Linux 7.2 > PostgreSQL 7.3.3 > postgresql.conf changes: > shared)buffer = 1000 > max_connections = 100 > sort_memory = 1024 > > My test program generates approx 1.2 million records into a table with a > single index. I didn't use addBatch()/executeBatch() but tried variations > of Statement and PreparedStatement in a transaction, commiting > approximately every 100 records. If you look at the source you will see > that executeBatch performs a series of exeucte()'s - one for each > statement added with addBatch() so I don't think there would be any less > network traffic compared to executing each query individually. The results > I got were as follows > > 1) Using Statement: 290 inserts per second > > 2) Using PreparedStatement: 420 inserts per second > > 3) Using PreparedStatement with server prepared statement: 1030 inserts > per second > > HTH > > -- > Paul Thomas > +------------------------------+---------------------------------------------+ > | Thomas Micro Systems Limited | Software Solutions for the Smaller > Business | > | Computer Consultants | > http://www.thomas-micro-systems-ltd.co.uk | > +------------------------------+---------------------------------------------+ > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> If as you say it do not exist any diference I think it will be > possible to improve executeBatch. > For example for it's implementation to use the pg copy command. > Exist at the moment any work for improving the batch commands? > To use COPY instead of insert during executeBatch would require some complicated parsing of the statement to determine if it was really a simple insert that could be translated into COPY. If your statement had a function call or a subselect it could not use COPY. Also I'm not sure what translating an INSERT into a COPY would do if you we're inserting into a table that had rules on it. There was a patch around that provided an API to use COPY explicitly which might suit your needs. Kris Jurka
Currently there is no difference between inserts done serially or done via the batch interface. In both cases there is one round trip per insert. This is an area of the code that would welcome contributions. --Barry pginfo wrote: > Hi, > > I am using pg + jdbc for j2ee development from a long time. > The main problem I found is the very bad insert performance. > I am using addBatch and executeBatch for minimum network trafik, > but it is also very slow. > For example if I try to insert 1000 rows ( in one transaction) into very > simple table > it takes ~ 3000 ms. If i use insert into myTable select ... it takes > only 10 ms. > > Exist any basic performance problem for pg jdbc or I make mistake? > Also I testet the same example on oracle 8i and mysql. > It is working much much better. > I love pg but it will be great if I can make insert faster. > > Many thanks, > ivan. > > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend >
Hi, Ok it is clear. I will try to found a little time for this problem into the near future. I am very familiar with pg internal and if some one have made tests with "C" inserts it will be great to send some info. For example how many records can I insert with "C" code. In this case we will know what to expect as performance. As I wrote the performance by mysql and oracle is realy great and I am ready to give my time for improving pg. regards, ivan. Barry Lind wrote: > Currently there is no difference between inserts done serially or done > via the batch interface. In both cases there is one round trip per > insert. This is an area of the code that would welcome contributions. > > --Barry > > pginfo wrote: > > Hi, > > > > I am using pg + jdbc for j2ee development from a long time. > > The main problem I found is the very bad insert performance. > > I am using addBatch and executeBatch for minimum network trafik, > > but it is also very slow. > > For example if I try to insert 1000 rows ( in one transaction) into very > > simple table > > it takes ~ 3000 ms. If i use insert into myTable select ... it takes > > only 10 ms. > > > > Exist any basic performance problem for pg jdbc or I make mistake? > > Also I testet the same example on oracle 8i and mysql. > > It is working much much better. > > I love pg but it will be great if I can make insert faster. > > > > Many thanks, > > ivan. > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 8: explain analyze is your friend > > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html