Thread: FW: bitemporal functionality for PostgreSQL
All<br /> <br /> Is there an interest in developing bitemporal functionality in PostgreSQL<br /> <br /> Regards<br /> <br /> Luke <br /><br />> From: bruce@momjian.us<br />> Subject: Re: bitemporal functionality for PostgreSQL<br/>> To: luke_porter@hotmail.com<br />> Date: Fri, 1 Feb 2008 10:08:03 -0500<br />> <br />> LukePorter wrote:<br />> > <br />> > Bruce<br />> > <br />> > I have a compelling solution to providea database with bitemporal<br />> > functionality. It is a comprehensive spec. Is there any interest<br />>> in PostgreSQL providing this functionality. The area is sometimes<br />> > referred to as "temporal agility".It has recently been covered<br />> > as an emerging requirement by Garther and C J Date has recently<br />>> lectured on the area (My solution was demo'd at the same event<br />> > he spoke at). It is a transformingapproach to data management<br />> > in that it completely removes the need for the developer to<br />>> address the time dimension - the database does it all.<br />> <br />> Yes, I think there would be interest.I would ask on the hackers email<br />> list.<br />> <br />> --<br />> Bruce Momjian <bruce@momjian.us>http://momjian.us<br />> EnterpriseDB http://postgres.enterprisedb.com<br />> <br />> +If your life is a hard drive, Christ can be your backup. +<br /><br /><br /><hr />Messenger on the move. <a href="http://mobile.uk.msn.com/pc/messenger.aspx" target="_new">Text MSN to 63463 now!</a>
Luke Porter <luke_porter@hotmail.com> writes: > Is there an interest in developing bitemporal functionality in PostgreSQL Is that anything like the late, mostly unlamented "time travel" feature? regards, tom lane PS: in general, defining what you want with one word and assuming everyone will grok the full meaning and implications of that is a good way to get ignored on this list. Assuming we have never heard of "bitemporal" is a better starting point.
On Feb 1, 2008, at 10:42 AM, Luke Porter wrote: > All > > Is there an interest in developing bitemporal functionality in > PostgreSQL > > Regards > > Luke I can only speak for myself, but- definitely! Based on the googling I did on "bitemporal database", I kind of do this already with PostgreSQL. Some of my tables are insert-only and each row includes a committed time timestamp. That way, I don't need a separate audit log table, and "fixing" someone's mistake is as simple as copying old rows. The downside to this is that I need a view to represent the current "truth" and calculating the truth is more expensive than a simple table would be. Can you explain in more detail or provide references to how PostgreSQL could potentially handle temporal data better? One idea I had would be to blow the transaction ID up to 128 bits (no more wrapping!) and have it represent the nanoseconds since the epoch. Cheers, M
On Fri, 2008-02-01 at 15:42 +0000, Luke Porter wrote: > All > > Is there an interest in developing bitemporal functionality in > PostgreSQL > I am very interested in this topic, and I maintain the pgsql-temporal project at: http://pgfoundry.org/projects/temporal/ It's missing good docs and a few other things that I'd like, but it provides a good time interval type, including lots of useful operators, and GiST index support functions. For instance, you can do queries like: SELECT att1 FROM mytable WHERE during @> '2001-05-11 01:01:01'::timestamptz; which is a simple way to get all records where "during" contains the point in time '2001-05-11 01:01:01'. It's also indexable with GiST, meaning that query will perform well in a variety of situations. I'm going to spend some time getting the docs up to speed so people can actually use it. Regards,Jeff Davis
Jeff Davis wrote: > On Fri, 2008-02-01 at 15:42 +0000, Luke Porter wrote: >> All >> >> Is there an interest in developing bitemporal functionality in >> PostgreSQL >> > > I am very interested in this topic, and I maintain the pgsql-temporal > project at: > > http://pgfoundry.org/projects/temporal/ > > It's missing good docs and a few other things that I'd like, but it > provides a good time interval type, including lots of useful operators, > and GiST index support functions. > > For instance, you can do queries like: > > SELECT att1 FROM mytable WHERE during @> '2001-05-11 > 01:01:01'::timestamptz; > > which is a simple way to get all records where "during" contains the > point in time '2001-05-11 01:01:01'. It's also indexable with GiST, > meaning that query will perform well in a variety of situations. I don't know what "bitemporal" is all about, but to me, the pgsql-temporal approach is clearly the most flexible and attractive. Good interval handling is not limited to just time, BTW, there's other applications that deal with intervals of other types like floats, though time intervals are the most common. We can get pretty far with a pgfoundry project, providing a good interval data type with operators, but there's a few things that would need backend support. For example, queries like: SELECT * FROM mytable1; timeatt ------- 10 - 20 30 - 40 SELECT * FROM mytable2, timeatt ------- 15 - 35 SELECT * FROM mytable1 INTERSECT SELECT * FROM mytable2; timeatt ------- 15 - 20 30 - 35 as well as declaring uniqueness constraints, so that there's no rows with overlapping intervals, foreign key references on intervals etc. I would suggest a book called "Temporal Data and the Relational Model", by C. J. Date, Hugh Darwen and Nikos A Lorentzos to anyone who's interested in temporal issues. It presents a very elegant solution to temporal issues, fully compatible with the relational model. Unlike "time travel" kind of approaches. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Sun, 2008-02-03 at 10:48 +0000, Heikki Linnakangas wrote: > I don't know what "bitemporal" is all about, but to me, the > pgsql-temporal approach is clearly the most flexible and attractive. I think what he means by "bitemporal" is what CJ Date, et al., refer to as "fully temporal" (as opposed to semi-temporal), that is, dealing with time intervals rather than time points. > SELECT * FROM mytable1; > timeatt > ------- > 10 - 20 > 30 - 40 > > SELECT * FROM mytable2, > timeatt > ------- > 15 - 35 > > SELECT * FROM mytable1 > INTERSECT > SELECT * FROM mytable2; > timeatt > ------- > 15 - 20 > 30 - 35 > > as well as declaring uniqueness constraints, so that there's no rows > with overlapping intervals, foreign key references on intervals etc. I'd like to see those things as well. I think it would be relatively straightforward to define some kind of generalized join operator (of which INTERSECT can be a special case), but optimizing it properly would be difficult. We could extend type definitions to optionally include the operators necessary to perform the generalized join. We could do similar things for other relational operators. I think for the key constraint (which is not the same as uniqueness), we'd need to make a new index access method (or perhaps extend GiST somehow). If you have any ideas, I'd like to hear them. Obviously, my ideas are still far away from a feature proposal. > I would suggest a book called "Temporal Data and the Relational Model", > by C. J. Date, Hugh Darwen and Nikos A Lorentzos to anyone who's > interested in temporal issues. It presents a very elegant solution to > temporal issues, fully compatible with the relational model. Unlike > "time travel" kind of approaches. I second this statement. That's an excellent book, both analytical and practical. I think my time interval type is fairly faithful to that approach (although as I said, I really need to document it properly). Regards,Jeff Davis
On Sun, 3 Feb 2008, Jeff Davis wrote: > I think what he means by "bitemporal" is what CJ Date, et al., refer to > as "fully temporal" (as opposed to semi-temporal), that is, dealing with > time intervals rather than time points. I think fully temporal is a step of complexity above how some people use bitemporal but this terminology is slippery. Bitemporal databases are ones where every transaction gets two timestamps: one at transaction commit and a second that encodes what time that transaction is valid as of. A classic example uses a bitemporal employee promotion table to track the company org chart. With the "valid as of" timestamp in there, if you structure the query right you can generate a report as of any particular point in time and find out what the structure of the organization was at that point. There's all sorts of applications where being able to track the state not just of the current table but of where it was at a previous/future time can be handy, and bitemporal approaches can be easier to handle than something like a history table. While there's useful syntax to add, I believe one challenge of bitemporal databases from the -hackers perspective involves how to accelerate the typical queries people run. Here's some research that looks into things like how to index the data usefully, and that gives more references into the bitemp literature: http://www.cs.ucr.edu/~tsotras/temporal.html >> I would suggest a book called "Temporal Data and the Relational Model", >> by C. J. Date, Hugh Darwen and Nikos A Lorentzos to anyone who's >> interested in temporal issues. I think you need to be familiar with the work set down in both that one and the Snodgrass/Jensen "Developing Time-Oriented Database Applications in SQL" before you can even start do anything that's actually new in this area. Bitemporal tables show up early in that book (P44 of the PDF http://www.cs.arizona.edu/people/rts/tdbbook.pdf ) The way they use them, the valid and transaction times are both intervals rather than points, which I think makes the implementation there look more like Date's fully-temporal tables. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Le dimanche 03 février 2008, Greg Smith a écrit : > On Sun, 3 Feb 2008, Jeff Davis wrote: > > I think what he means by "bitemporal" is what CJ Date, et al., refer to > > as "fully temporal" (as opposed to semi-temporal), that is, dealing with > > time intervals rather than time points. > > >> I would suggest a book called "Temporal Data and the Relational Model", > >> by C. J. Date, Hugh Darwen and Nikos A Lorentzos to anyone who's > >> interested in temporal issues. > > I think you need to be familiar with the work set down in both that one > and the Snodgrass/Jensen "Developing Time-Oriented Database Applications > in SQL" before you can even start do anything that's actually new in this > area. Bitemporal tables show up early in that book (P44 of the PDF > http://www.cs.arizona.edu/people/rts/tdbbook.pdf ) I found the following document quite useful to grasp the concepts involved, it allowed me to decide whether I needed bitemporal feature or not (was not) :) http://rueping.info/doc/Andreas Rüping -- 2DHistory.pdf Hope this helps, regards, -- dim
On Sun, 2008-02-03 at 15:56 -0500, Greg Smith wrote: > On Sun, 3 Feb 2008, Jeff Davis wrote: > > > I think what he means by "bitemporal" is what CJ Date, et al., refer to > > as "fully temporal" (as opposed to semi-temporal), that is, dealing with > > time intervals rather than time points. > > Bitemporal databases are ones where every transaction gets two timestamps: > one at transaction commit and a second that encodes what time that > transaction is valid as of. Ah, so "bitemporal" means both valid and transactional time stored in the relation. I believe the definition of "fully temporal" I was referring to was that it used intervals rather than time points. It's interesting that you're using time points to represent that data. I would think time intervals would be much more useful. > I think you need to be familiar with the work set down in both that one > and the Snodgrass/Jensen "Developing Time-Oriented Database Applications > in SQL" before you can even start do anything that's actually new in this > area. Bitemporal tables show up early in that book (P44 of the PDF > http://www.cs.arizona.edu/people/rts/tdbbook.pdf ) The way they use them, > the valid and transaction times are both intervals rather than points, > which I think makes the implementation there look more like Date's > fully-temporal tables. Although I haven't read the whole Snodgrass book, out of what I did read it seemed much less helpful. They mix a lot of the analysis of time data management with SQL idiosyncrasies (and specific implementations), which is not nearly so useful in something like PostgreSQL where we can make our own data types. Also, they use "period" to mean interval, and "interval" to mean duration -- which is wrong, in my opinion; interval already has a well- defined mathematical meaning. Regards,Jeff Davis
Jeff Davis wrote: > Also, they use "period" to mean interval, and "interval" to mean > duration -- which is wrong, in my opinion; interval already has a well- > defined mathematical meaning. Agreed, but that mistake actually originates from the SQL standard. The SQL INTERVAL data type is really a duration, so I guess they had to come up with a new term for interval in the mathematical meaning. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote: > Jeff Davis wrote: >> Also, they use "period" to mean interval, and "interval" to mean >> duration -- which is wrong, in my opinion; interval already has a well- >> defined mathematical meaning. > > Agreed, but that mistake actually originates from the SQL standard. The And plenty of other ISO standards for around the same time, such as ISO 8601 which defines ways to specify both intervals that have Start and End times as well as Duration Only intervals.
On Mon, 2008-02-04 at 20:50 +0000, Heikki Linnakangas wrote: > Jeff Davis wrote: > > Also, they use "period" to mean interval, and "interval" to mean > > duration -- which is wrong, in my opinion; interval already has a well- > > defined mathematical meaning. > > Agreed, but that mistake actually originates from the SQL standard. The > SQL INTERVAL data type is really a duration, so I guess they had to come > up with a new term for interval in the mathematical meaning. Fair enough. However, that is just an example of the SQL idiosyncrasies that I was referring to. I think that kind of thing gets in the way a lot more than it helps, which is why I found Temporal Data and the Relational Model so refreshing. Regards,Jeff Davis