Re: Performance comparison to psql. - Mailing list pgsql-jdbc

From Michael Nacos
Subject Re: Performance comparison to psql.
Date
Msg-id 407fa4640809250901p38ed31d4u13258aa8212dab85@mail.gmail.com
Whole thread Raw
In response to Performance comparison to psql.  (Arie Ozarov <aozarov@hi5.com>)
Responses Re: Performance comparison to psql.  (Maciek Sakrejda <msakrejda@truviso.com>)
List pgsql-jdbc
branching off from http://archives.postgresql.org/pgsql-jdbc/2008-09/msg00074.php and joining this thread

Dear Kris,

thanks for the link in regards to COPY versus JDBC batch. From the numbers in that email, I am tempted to assume that, if your tables have indices, you would expect something like 1x speed of COPY = 6x speed of JDBC batch = 36x speed of single-statement JDBC
Comparison table (records inserted per millisecond)
COPY JDBC JDBC batch
WITHOUT INDEXES: 198 1.5 14
WITH 2 INDEXES: 45 1.5 10
In the last few weeks, I have been trying to improve a bulk update process which involves arbitrary SQL statements, so a COPY command has not been an option. I cannot really compare what I've been doing to the performance of a COPY command, cause the specs differ, but I can compare pgBee's performance to psql's processing text files of SQL statements.

For the same input file, psql took 112m, whereas pgBee finished in 21m. Both were run on the same database server (localhost), pgBee was batching groups of 100 statements at a time and a real data file was used, with 1131753 SQL statements in total (511335 DELETEs and 567577 INSERTs). pgBee seems to be doing 898 operations/millisecond, or ~450 INSERTSs/millisecond ! Is that possible? psql seems to manage 168 operations/millisecond - half of those are INSERTs.

I suppose I am sort of doing what a JDBC COPY command would do (but with arbitrary SQL statements): pgBee handles the parsing of files, takes care of encoding issues, escapes some characters and find and tries to execute all statements in the most efficient way. Somehow, reading lines from a text file into a String buffer, splitting the statements up and executing them as a batch gives you a real performance boost. Does psql implicitly handle each statetement as a separate transaction? This could account for pgBee's better performance.

Michael

pgsql-jdbc by date:

Previous
From: Craig Ringer
Date:
Subject: Re: COPY support in JDBC driver?
Next
From: Maciek Sakrejda
Date:
Subject: Re: Performance comparison to psql.