I'm having a really hard time coming up with theories about the cause
or things to check.
We ran the test again with logging to disk, and it didn't happen in an
hour of testing. The logging boosted the average run time of the
series of database modifications we attempt as a single transaction
from 44 ms to 58 ms. We logged to a dummy PrintWriter, which just
returned to driver code without doing anything, and the time went to
50 ms. We got our first error after 9 minutes with that configuration.
The only thing running on the server is the postgres back end. It would
be hard to imagine something outside of the postgres software itself
which would be able to send the signal only when a rollback occurred.
Can you think of anything which could be coming through the protocol
stream which would cause this signal during the commit after a rollback?
About the only other thing I can think to do is to try to come up with
a RAM-based PrintWriter to keep a rolling buffer of JDBC logging
which it would dump when we get the error. Since a PrintWriter which
did absolutely nothing was right on the edge of blocking the problem,
I'm skeptical that adding even that much will allow the problem to show.
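For reference, a minimal sketch of what such a RAM-based rolling buffer might look like. The class name RollingLogWriter, the snapshot() method, and the line limit are all hypothetical, and this assumes the driver's log output can be pointed at an arbitrary PrintWriter (for example via java.sql.DriverManager.setLogWriter). It does no I/O, but it still copies characters and takes a lock on each write, so it may well perturb the timing enough to hide the problem.

```java
import java.io.Writer;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical in-memory log sink that keeps only the most recent lines. */
class RollingLogWriter extends Writer {
    private final int maxLines;
    private final Deque<String> lines = new ArrayDeque<>();
    private final StringBuilder current = new StringBuilder();

    RollingLogWriter(int maxLines) {
        if (maxLines < 1) {
            throw new IllegalArgumentException("maxLines must be >= 1");
        }
        this.maxLines = maxLines;
    }

    /** Accumulates characters; a '\n' completes a line and may evict the oldest one. */
    @Override
    public synchronized void write(char[] buf, int off, int len) {
        for (int i = off; i < off + len; i++) {
            char c = buf[i];
            if (c == '\n') {
                if (lines.size() == maxLines) {
                    lines.removeFirst();
                }
                lines.addLast(current.toString());
                current.setLength(0);
            } else if (c != '\r') {
                current.append(c);
            }
        }
    }

    /** Nothing to flush: everything stays in memory. */
    @Override
    public void flush() {
    }

    /** Nothing to release. */
    @Override
    public void close() {
    }

    /** Returns a copy of the retained lines, oldest first, plus any unfinished line. */
    synchronized List<String> snapshot() {
        List<String> copy = new ArrayList<>(lines);
        if (current.length() > 0) {
            copy.add(current.toString());
        }
        return copy;
    }
}
```

The idea would be to wrap an instance in a PrintWriter, hand that to the driver as its log writer, and print the result of snapshot() from the code that catches the error.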
I welcome all suggestions on what to try or what to monitor.
-Kevin
>>> Tom Lane <tgl@sss.pgh.pa.us> 09/13/05 11:31 AM >>>
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> One more thought -- I keep coming back to the fact that when we turn
> on logging in the JDBC driver on the client side, the problem does not
> occur. The only possible reason I can see for this having any effect
> on the problem is the small delay introduced by the synchronous
> logging. Since this is only showing up on commit of a database
> transaction which follows close on the heels of a rollback on the same
> connection, is there any chance that there is some very small
> "settling time" needed for a rollback, and we're sometimes getting in
> ahead of this?
(1) No.
(2) Even if we posit the existence of such a serious bug as that, it
wouldn't explain how control gets to the SIGINT response code.
regards, tom lane