Re: temporal support patch - Mailing list pgsql-hackers
From | Kevin Grittner
---|---
Subject | Re: temporal support patch
Date |
Msg-id | 503603350200002500049A3E@gw.wicourts.gov
In response to | Re: temporal support patch (Jeff Davis <pgsql@j-davis.com>)
List | pgsql-hackers
Jeff Davis <pgsql@j-davis.com> wrote:
> On Tue, 2012-08-21 at 17:07 -0500, Kevin Grittner wrote:
>> The fact that it has an unknown sequence number or timestamp for
>> purposes of ordering visibility of transactions doesn't mean you
>> can't show that it completed in an audit log. In other words, I
>> think the needs for a temporal database are significantly
>> different from the needs of an auditing system.
>
> ...
>
>> I would assume an audit log would have very different needs from
>> tracking changes for a temporal database view. It even seems
>> possible that you might want to see what people *looked* at,
>> versus just changes. You might want to see transactions which
>> were rolled back, which are irrelevant for a temporal view. If
>> we're talking about an auditing system, we're talking about an
>> almost completely different animal from a temporal view of the
>> database.
>
> OK, I think I see what you're saying now. Basically, an audit log
> means different things to different people, so I think it confused
> the issue.

Probably. When I think of an audit log, I tend to think of viewing "who did what when", without necessarily caring a lot about viewing interim visible database states.

> But "temporal" is fairly vague, too.

Yeah, but in this context I have taken it to mean that someone wants to run a query such that it sees the database state "as of" some previous point in time. Even with a read-only transaction, if you want to avoid seeing states of the database which are inconsistent with business rules enforced through serializable transactions, you need to deal with some tricky problems.

> I am most interested in the topic you brought up about
> serializability and system time (transaction time), because it
> would be a fundamental piece upon which we can build a lot of
> these other things (including what could be called an audit log).

[brain dump follows -- remember, you *said* you were interested]

If you think it matters for what you are calling an audit log, then I probably have an incomplete or inaccurate understanding of what you mean by audit log. Perhaps you could sketch that out a bit? (Or point back to where it was described, if I've missed or forgotten something that went before.)

The reason it's tricky is that while SSI fully complies with the requirement that the behavior of a set of concurrent serializable transactions running in a database is consistent with some serial (one-at-a-time) execution of those transactions, it does not share certain properties with other types of serializable implementations, so people may be assuming those additional properties where they don't actually exist.

The two most common alternatives to SSI are S2PL and OCC. Under both of these techniques, the apparent order of execution (the order in which the transactions could have run one at a time to produce the same results) is the commit order. In S2PL this is accomplished by having reads block writes until commit time and writes block everything until commit time. In OCC this is accomplished by checking the read set of a transaction at commit time and rolling back that transaction if a single write by another transaction conflicts with the predicate locks of the read set (i.e., there is a single read-write conflict out from the transaction being committed). SSI dodges the blocking and the high rollback rate, but the technique has these characteristics which may be surprising:

- The apparent order of execution is not always the commit order.
  If two transactions are concurrent, and T1 reads something which would look different if it could see the work of T2 (but it *can't*, because the transactions are concurrent), then T1 *appears* to have executed before T2. T2 might actually *start* first and *commit* first, but if there was overlap and the rw-conflict, then T1 ran first *logically*. SSI prevents cycles in this ordering by canceling a transaction when a possible cycle is detected.

- A read-only transaction can cause an anomaly where there would otherwise not be one.

  This is because a transaction which "appeared" to commit after another transaction based on rw-conflicts may have actually committed first, and would be visible to the read-only transaction while the work of the "earlier" transaction would not show up for it. If no transaction observes that state, then the problem goes away when the "logically earlier" transaction later commits. If the state is observed, even by a read-only transaction, then the "earlier" transaction logically "can't have happened" -- so it must be rolled back with a serialization failure.

Within one database, this is tracked and handled by SSI. My concern is that the transactions might both commit, and then a "time traveler" goes back and sees the state "that never happened."

One of the features added with SSI was DEFERRABLE transactions. The point of this is that when a snapshot is generated, it can often be determined (either immediately or after other transactions have completed) that the snapshot cannot see any such anomalous database states. A SERIALIZABLE READ ONLY DEFERRABLE transaction waits until it can acquire such a snapshot before running its first query. We have talked about the possibility of supporting SERIALIZABLE transactions on hot standby by including some minimal information in the WAL stream to allow the standby to identify safe snapshots and use only those for SERIALIZABLE transactions. I have a harder time seeing that working with temporal views of a database or with audit logs.

What I've been on about lately is a half-baked idea about how to construct snapshots, more recent than the last "safe" snapshot under the existing rules, that will still give a coherent view of the database from a read-only replica or in this temporal view of things. There are unresolved issues with the idea, and I'm not really sure that, once those are ironed out, it will be any different from just remembering the last safe snapshot and using it for serializable transaction requests, but I still have hope of doing better than that.

The problems come in when you consider a mix of SERIALIZABLE and other transactions. If *all* DML transactions are SERIALIZABLE, an ordering of the transactions can be created, and you can pick any point in that list and construct a view of the database which will be free of serialization anomalies. The problem is how to interleave that list with non-serializable transactions and still build a coherent snapshot. And the serializable "apparent order of execution" probably doesn't matter for non-serializable transactions reading the database, because in real time they see modifications based on commit order. They can see serialization anomalies, but by virtue of running at a less strict isolation level they are saying that they don't care about that.

-Kevin
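To make the first surprise concrete -- the apparent order of execution differing from the commit order -- here is a minimal two-session sketch. The account table and the interleaving are illustrative assumptions, not something from the message above, and it assumes a PostgreSQL release with SSI (9.1 or later):

    -- Hypothetical setup shared by both sessions.
    CREATE TABLE account (name text PRIMARY KEY, balance int NOT NULL);
    INSERT INTO account VALUES ('checking', 0), ('savings', 0);

    -- Session B (T2): starts first.
    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    UPDATE account SET balance = balance + 20 WHERE name = 'savings';

    -- Session A (T1): starts while T2 is still open.
    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    SELECT sum(balance) FROM account;  -- returns 0; T2's update is invisible

    -- Session B (T2): commits first.
    COMMIT;

    -- Session A (T1): commits second and succeeds; a single rw-conflict
    -- out (T1 -> T2) is not by itself a dangerous structure.
    COMMIT;

    -- Commit order: T2 then T1.  Apparent order of execution: T1 then T2,
    -- because T1 read a state that does not include T2's work.

Both transactions commit, yet any equivalent serial schedule has to place T1 before T2.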
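And a sketch of the read-only-transaction surprise together with the DEFERRABLE remedy described above, reusing the same hypothetical account table with both balances back at zero. The overdraft-fee scenario is the standard example from the SSI literature, not from this thread, and exactly which transaction draws the serialization failure depends on timing:

    -- Session A (T1): a deposit to savings.
    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    UPDATE account SET balance = balance + 20 WHERE name = 'savings';

    -- Session B (T2): a withdrawal that checks the combined balance under
    -- its snapshot (both rows still 0) and charges an overdraft fee.
    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    SELECT sum(balance) FROM account;                         -- sees 0
    UPDATE account SET balance = balance - 11 WHERE name = 'checking';

    -- Session A commits.  Because T2 did not see the deposit, T2 appears
    -- to run before T1 even though T1 commits first.
    COMMIT;

    -- Session C (T3): a plain read-only report taken now sees the deposit
    -- but not the withdrawal, a state consistent only with T1 running
    -- before T2.  Once that state is observed, SSI must roll back one of
    -- the transactions.
    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY;
    SELECT name, balance FROM account;
    COMMIT;

    -- Session B (T2): in this interleaving the failure typically lands
    -- here, e.g. "could not serialize access due to read/write
    -- dependencies among transactions".
    COMMIT;

    -- The remedy mentioned above: a DEFERRABLE report never observes such
    -- a state; it waits for a snapshot known to be safe before running
    -- its first query.
    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
    SELECT name, balance FROM account;
    COMMIT;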