Re: Audit Logs WAS: temporal support patch - Mailing list pgsql-hackers
From | Christopher Browne |
---|---|
Subject | Re: Audit Logs WAS: temporal support patch |
Date | |
Msg-id | CAFNqd5VvgJKgtBP0hU9Tb0sM0qr+z7+FTLdrb2xLHweeErdeKA@mail.gmail.com Whole thread Raw |
In response to | Re: Audit Logs WAS: temporal support patch (Jim Nasby <jim@nasby.net>) |
List | pgsql-hackers |
On Tue, Aug 28, 2012 at 5:06 PM, Jim Nasby <jim@nasby.net> wrote: > On 8/28/12 2:51 PM, Pavel Stehule wrote: >>> >>> >The thing I don't like about this is it assumes that time is the best >>> > way to >>> >refer to when things changed in a system. Not only is that a bad >>> > assumption, >>> >it also means that relating things to history becomes messy. >> >> On second hand I don't have a problem with some optional counter, >> although I think so database system time is very useful and other >> counters for versioning are not necessary - because in one time I can >> have only one version - it doesn't do versions from rollbacked >> transactions. > > What happens if the system clock runs backwards? > > What happens if two transactions start in the same microsecond? (And I know > for a fact that's possible, because I've seen it). > > More importantly, I believe using time to handle recording a versioned > history of something is flawed to begin with. You might care about what time > a new version was created; but what's far more important is recording the > correct ordering of things, and time isn't actually a great way to do that. > > Note that no version control systems use time as their primary attribute. At one point, I designed a configuration system that used the DNS concept of "serial numbers" to indicate the temporality of the configuration. NULL was treated as 'infinitely in the future'. The notion would be that a new piece of configuration would initially be assigned (latest_serial+1, NULL) as its temporal visibility, and, upon bumping the serial number, that would become "live" configuration. One would mark a piece of configuration as "about to die" by assigning (X, latest_serial+1) as its temporal visibility, and, again, upon bumping the serial number, that terminates its visibility. In that approach to things, it's an "expensive serialization event" to bump the serial number. Not that the action is super-expensive, it's just that that's not something you ought to do in a distributed fashion. ONE process should bump the serial number. I think you're right that for these sorts of cases, the use of time as the source of versioning information is pretty dangerous. In SCM systems, we discovered that it was pretty horrible to try to assign serial numbers in a 'consistent' fashion; you'd get, in CVS, that files would get assigned version '1.2.1.5.3.2.7', and things would get worse from there. It turned out that using a hash code like SHA-* as a version number was more satisfactory; just attach labels to those versions that you care to identify. It's possible that, for a "versioned history," that assigning a hash/UUID as the version ID is pretty satisfactory, even though it's not ordered. Using a sequence to assign a version scales a bit less well, but is probably usually OK. One of the other things discovered in that "config system design" was that it was mighty useful to assign transactions at will: create table config_txns ( tx_id serial primary key, tx_at timestamptz default now() ); , associating data with the tx_id value, and then associate additional information to the "config_txns" table if needful. For instance, sometimes you need a label (akin to an SCM "branch" or "tag" label) create table config_labels ( tx_id integer primary key references config_txns(tx_id) on delete cascade, label text not nullunique ); -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
pgsql-hackers by date: