Re: Serialization errors on single threaded request - Mailing list pgsql-bugs
From | Kevin Grittner |
---|---|
Subject | Re: Serialization errors on single threaded request |
Date | |
Msg-id | s30f18df.021@gwmta.wicourts.gov |
Responses | Re: Serialization errors on single threaded request |
List | pgsql-bugs |
Unfortunately, the original test environment has been blown away in favor of testing the 8.1 beta release. I can confirm that the problem exists on a build of the 8.1 beta. If it would be helpful I could set it up again on 8.0.3 to confirm. I THINK it was actually the tip of the 8.0 stable branch as opposed to the 8.0.3 release proper.

We have a little more information about the failure pattern -- when we get these, it is always after there has been a rollback on the thread which eventually generates the serialization error. So I think the pattern is:

ConnectionA:
  - A series of insert/update/deletes (on tables OTHER than the progress table).
  - Update the progress table.
  - Commit the transaction.
ConnectionB:
  - A series of insert/update/deletes (on tables OTHER than the progress table) fails.
  - Rollback the transaction.
  - Attempt each insert/update/delete individually. Commit or rollback each as we go.
  - Attempt to update the progress table -- fail on serialization error.

To avoid any ambiguity in my former posts -- introducing even a very small delay between the operations on ConnectionA and ConnectionB makes the serialization error very infrequent; introducing a larger delay seems to make it go away. I hate to consider that as a solution, however.

I'm afraid I'm not familiar with a good way to capture the stream of communications with the database server. If you could point me in the right direction, I'll give it my best shot.

I did just have a thought, though -- is there any chance that the JDBC Connection.commit is returning once the command is written to the TCP buffer, and I'm getting hurt by some network latency issues -- the Nagle algorithm or some such? (I assume that the driver is waiting for a response from the server before returning, so this shouldn't be the issue.) At the point that the commit confirmation is sent by the server, I assume the shared memory changes are visible to the other processes?

-Kevin


>>> Tom Lane <tgl@sss.pgh.pa.us> 08/26/05 12:16 PM >>>
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> What happens if the timestamp of the commit is an exact match for the
> timestamp of the next transaction start? What is the resolution of
> the time sampling?

It's not done via timestamps: rather, each transaction takes a census of the transaction XIDs that are running in other backends when it starts (there is an array in shared memory that lets it get this information cheaply). Reliability of the system clock is not a factor.

Are you sure the server is 8.0.3? There was a bug in prior releases that might possibly be related:

2005-05-07 17:22  tgl

    * src/backend/utils/time/: tqual.c (REL7_3_STABLE), tqual.c (REL7_4_STABLE), tqual.c (REL7_2_STABLE), tqual.c (REL8_0_STABLE), tqual.c: Adjust time qual checking code so that we always check TransactionIdIsInProgress before we check commit/abort status. Formerly this was done in some paths but not all, with the result that a transaction might be considered committed for some purposes before it became committed for others. Per example found by Jan Wieck.

My recollection though is that this only affected applications that were using SELECT FOR UPDATE. In any case, it's pretty hard to see how this would affect an application that is in fact waiting for the backend to report commit-done before it launches the next transaction; the race-condition window we were concerned about no longer exists by the time the backend sends CommandComplete.

So my suspicion remains fixed on that point. Do you have any way of sniffing the network traffic of the middle-tier to confirm that it's doing what it's supposed to?

regards, tom lane
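
Editor's illustration: the census-based visibility rule Tom describes can be sketched as a toy in-memory model. This is only a sketch under stated assumptions, not PostgreSQL's actual code: the names SnapshotSketch, begin, commit, abort, takeSnapshot and sees are hypothetical, and the shared-memory array is stood in for by plain Java sets. It shows why clock resolution plays no part: a snapshot taken after a commit has completed does not list that XID as running, so the commit is visible to it.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of snapshot visibility; hypothetical names, not PostgreSQL source. */
class SnapshotSketch {
    // Stand-ins for shared state: XIDs still running, and XIDs known committed.
    static final Set<Integer> inProgress = new HashSet<>();
    static final Set<Integer> committed = new HashSet<>();
    static int nextXid = 1;

    /** A snapshot is a census of the XIDs that were running when it was taken. */
    static final class Snapshot {
        final int xmax;              // first XID not yet assigned at snapshot time
        final Set<Integer> running;  // copy of the in-progress set at snapshot time

        Snapshot(int xmax, Set<Integer> running) {
            this.xmax = xmax;
            this.running = running;
        }

        /** True if the work of transaction xid is visible to this snapshot. */
        boolean sees(int xid) {
            if (xid >= xmax) return false;            // started after the snapshot
            if (running.contains(xid)) return false;  // was still running at snapshot time
            return committed.contains(xid);           // finished: visible only if committed
        }
    }

    static int begin() {
        int xid = nextXid++;
        inProgress.add(xid);
        return xid;
    }

    static void commit(int xid) {
        // Record the commit before leaving the in-progress set, so a reader that
        // checks "in progress" first never finds the XID in neither state.
        committed.add(xid);
        inProgress.remove(xid);
    }

    static void abort(int xid) {
        inProgress.remove(xid);
    }

    static Snapshot takeSnapshot() {
        return new Snapshot(nextXid, new HashSet<>(inProgress));
    }

    public static void main(String[] args) {
        int a = begin();
        Snapshot before = takeSnapshot();   // taken while a is still running
        commit(a);
        Snapshot after = takeSnapshot();    // taken after a has committed
        System.out.println(before.sees(a)); // prints false
        System.out.println(after.sees(a));  // prints true
    }
}
```

In this model the only way for the second snapshot to miss the commit is for it to be taken before commit(a) has finished, which matches the suspicion in the message above that the next transaction may be starting before the previous commit has actually completed.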