Re: [BUGS] BUG #1347: Bulk Import stopps after a while (
| From | Csaba Nagy |
|---|---|
| Subject | Re: [BUGS] BUG #1347: Bulk Import stopps after a while ( |
| Date | |
| Msg-id | 1102931834.3498.23.camel@localhost.localdomain |
| In response to | Re: [BUGS] BUG #1347: Bulk Import stopps after a while ( 8.0.0. (Kris Jurka <books@ejurka.com>) |
| Responses | Re: [BUGS] BUG #1347: Bulk Import stopps after a while ( |
| List | pgsql-jdbc |
Hi all,

I'm just shooting in the dark, but wouldn't using the non-blocking IO facilities of JDK 1.4+ solve this problem? Of course, that would no longer be compatible with older JDKs... and not that I would have time to contribute :-(

Cheers,
Csaba.

On Mon, 2004-12-13 at 08:46, Kris Jurka wrote:
> On Mon, 13 Dec 2004, PostgreSQL Bugs List wrote:
>
> > The following bug has been logged online:
> >
> > Bug reference: 1347
> > PostgreSQL version: 8.0 Beta
> > Operating system: Windows XP
> > Description: Bulk Import stopps after a while ( 8.0.0. RC1)
> >
> > I have written a Java program to transfer data from SQL Server 2000 to
> > PostgreSQL 8.0.0 RC1, updating the data in batches. With a batch size
> > of 1000/2000 records at a time, this works fine. But if I change the
> > size to, say, 20,000, it finishes only one loop and then stays idle.
> > CPU usage drops to 10%, from the 100% it was at while applying the
> > first batch of 20,000 records.
> >
> > The program halts at:
> > int n[] = stmt.batchUpdate();
>
> This may be a problem with the JDBC driver deadlocking, as described in
> the code comment below. When it was originally written, I asked Oliver
> about the estimate of MAX_BUFFERED_QUERIES and he felt confident in that
> number. It would be good to know if lowering this number fixes your
> problem.
>
> Kris Jurka
>
> // Deadlock avoidance:
> //
> // It's possible for the send and receive streams to get
> // "deadlocked" against each other since we do not have a separate
> // thread. The scenario is this: we have two streams:
> //
> //   driver -> TCP buffering -> server
> //   server -> TCP buffering -> driver
> //
> // The server behaviour is roughly:
> //   while true:
> //     read message
> //     execute message
> //     write results
> //
> // If the server -> driver stream has a full buffer, the write will
> // block. If the driver is still writing when this happens, and the
> // driver -> server stream also fills up, we deadlock: the driver is
> // blocked on write() waiting for the server to read some more data,
> // and the server is blocked on write() waiting for the driver to read
> // some more data.
> //
> // To avoid this, we guess at how many queries we can send before the
> // server -> driver stream's buffer is full (MAX_BUFFERED_QUERIES).
> // This is the point where the server blocks on write and stops
> // reading data. If we reach this point, we force a Sync message and
> // read pending data from the server until ReadyForQuery,
> // then go back to writing more queries unless we saw an error.
> //
> // This is not 100% reliable -- it's only done in the batch-query case
> // and only at a reasonably high level (per query, not per message),
> // and it's only an estimate -- so it might break. To do it correctly
> // in all cases would seem to require a separate send or receive
> // thread as we can only do the Sync-and-read-results operation at
> // particular points, and also as we don't really know how much data
> // the server is sending.
>
> // Assume 64k server->client buffering and 250 bytes response per
> // query (conservative).
> private static final int MAX_BUFFERED_QUERIES = (64000 / 250);
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
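The avoidance strategy in Kris's quoted comment boils down to a counter: the driver tallies queries sent since the last Sync, and when the tally hits MAX_BUFFERED_QUERIES (64000 / 250 = 256) it stops writing, sends a Sync, and drains responses until ReadyForQuery. A minimal sketch of that counter logic, with hypothetical class and method names (the real driver interleaves this with protocol message encoding, which is omitted here):

```java
// Hypothetical sketch of the batch-send drain counter described in the
// driver comment; BatchSender, sendQuery(), and drain() are illustrative
// names, not the actual driver API.
public class BatchSender {
    // Assume 64k server->client buffering and 250 bytes response per
    // query, matching the comment's conservative estimate.
    static final int MAX_BUFFERED_QUERIES = 64000 / 250; // = 256

    private int buffered = 0; // queries sent since the last drain
    private int drains = 0;   // forced Sync-and-drain cycles so far

    /** Queue one query; if the estimate says the server's outbound
     *  buffer may be about to fill, force a Sync-and-drain first. */
    void sendQuery() {
        buffered++;
        if (buffered >= MAX_BUFFERED_QUERIES) {
            drain();
        }
    }

    /** Stand-in for "send Sync, read responses until ReadyForQuery". */
    void drain() {
        buffered = 0;
        drains++;
    }

    int drainCount() { return drains; }
    int pending()    { return buffered; }

    public static void main(String[] args) {
        BatchSender s = new BatchSender();
        for (int i = 0; i < 1000; i++) {
            s.sendQuery();
        }
        // A 1000-query batch crosses the 256-query ceiling three times
        // (at 256, 512, and 768), leaving 232 responses still pending.
        System.out.println("drains=" + s.drainCount()
                + " pending=" + s.pending());
    }
}
```

This also makes Kris's debugging suggestion concrete: lowering MAX_BUFFERED_QUERIES simply makes the driver drain more often, trading throughput for a wider safety margin against the estimate being wrong for large result rows.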