Ceki Gulcu wrote:
> Hello,
>
> Thanks for all the replies received so far. I really
> appreciate it.
>
> One important point I failed to mention in my original
> email is that we use one prepared statement to perform
> the initial batch of inserts via executeBatch, followed
> by a second prepared statement with its own executeBatch,
> and then a third prepared statement+executeBatch.
>
> Your suggestions imply the use of regular statements
> rather than prepared statements. I don't know whether
> the performance boost comes from the prepared statements
> themselves or more from the "batching", that is, from
> grouping the SQL statements into one transaction.
I initially thought you could batch the inserts in the approach I
suggested, but of course addBatch and executeBatch live on a particular
PreparedStatement, so you can't do that.

OK, how about something like this instead:
CREATE SEQUENCE id INCREMENT BY 100;
Then limit batch size to 100, and for each batch do:
SELECT nextval('id')
-- let's assume this returns 1000
Next, do the actual batch inserts, grouped by statement, computing the
ID values yourself from the returned nextval plus an offset:
INSERT INTO logging_event VALUES (1000, ...)
INSERT INTO logging_event VALUES (1001, ...)
INSERT INTO logging_event VALUES (1002, ...)
-- ...
INSERT INTO logging_event VALUES (1099, ...)
INSERT INTO logging_event_property VALUES (1000, 'foo', ...)
INSERT INTO logging_event_property VALUES (1000, 'bar', ...)
INSERT INTO logging_event_property VALUES (1001, 'foo', ...)
INSERT INTO logging_event_property VALUES (1001, 'bar', ...)
-- etc
INSERT INTO logging_event_exception VALUES (1000, 'line 1', ...)
INSERT INTO logging_event_exception VALUES (1000, 'line 2', ...)
INSERT INTO logging_event_exception VALUES (1001, 'line 1', ...)
INSERT INTO logging_event_exception VALUES (1001, 'line 2', ...)
-- etc
Because of the INCREMENT BY clause, the generated IDs won't collide with
those of another concurrent inserter. This should let you use both
PreparedStatement and batch execution, I think.
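
A rough Java sketch of what I mean (the two-column table shape and the
idFor helper are invented for illustration; adjust to your real schema):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class BatchInsertSketch {
    // Must match the INCREMENT BY value of the sequence.
    public static final int BATCH_SIZE = 100;

    // Reserve a block of 100 IDs with a single round trip; because the
    // sequence increments by 100, [base, base + 99] belongs to us alone
    // even with concurrent inserters.
    public static long reserveIdBlock(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT nextval('id')")) {
            rs.next();
            return rs.getLong(1);
        }
    }

    // Client-side ID computation: the base from nextval plus a row offset.
    public static long idFor(long base, int offset) {
        return base + offset;
    }

    // Hypothetical event insert, just to show the shape of the batching;
    // logging_event_property and logging_event_exception follow the same
    // pattern, reusing idFor(base, i) as the foreign key value.
    public static void insertEvents(Connection conn, String[] messages)
            throws SQLException {
        long base = reserveIdBlock(conn);
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO logging_event VALUES (?, ?)")) {
            for (int i = 0; i < messages.length && i < BATCH_SIZE; i++) {
                ps.setLong(1, idFor(base, i));
                ps.setString(2, messages[i]);
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }
}
```

So you pay one extra SELECT per batch, but keep one PreparedStatement
per table and a single executeBatch on each.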
There is a tradeoff to be made between the rate of ID consumption and
the maximum batch size, but if you're using int8 for IDs, ID consumption
is unlikely to be an issue.
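
To put a number on that, here's a back-of-the-envelope check; the
million-batches-per-second rate is an invented, deliberately generous
worst case:

```java
public class IdHeadroom {
    // Years until an int8 sequence is exhausted, assuming every batch
    // burns a whole block of idsPerBatch IDs at batchesPerSecond.
    public static long yearsOfHeadroom(long idsPerBatch, long batchesPerSecond) {
        long batches = Long.MAX_VALUE / idsPerBatch;  // blocks available
        long seconds = batches / batchesPerSecond;    // time to use them all
        return seconds / (365L * 24 * 3600);          // convert to years
    }

    public static void main(String[] args) {
        // 100 IDs per batch at a million batches per second still gives
        // roughly 2,900 years before wraparound.
        System.out.println(yearsOfHeadroom(100, 1_000_000) + " years");
    }
}
```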
-O