Thread: Periods
SQL:2011 introduced the concept of a "period". It takes two existing columns and basically does the same thing as our range types except there is no new storage. I believe if Jeff Davis had given us range types a few years later than he did, it would have been using periods.
Attached is a WIP patch that I have been working on. The only thing left is completing periods on inherited tables, and finishing up pg_dump. I'm posting this now just to make sure my basic foundation is sound, and to let people know that I'm working on this.
The patch itself doesn't have any functionality, it just allows periods to be defined. With that, there are several things that we can do: SYSTEM_TIME periods, which are explicitly not allowed by this patch, will allow us to do SQL standard versioned tables, and also allows some time travel functionality. Application periods can be used in PRIMARY/UNIQUE keys, foreign keys, and give nice new features to UPDATE and DELETE. They also allow "period predicates" which are the same kind of operations we already have for range types. All of that is for future patches that build on the infrastructure presented in this patch.
The SQL standard restricts period columns to dates or timestamps, but I'm allowing anything that has a btree operator class, as is the PostgreSQL way. System periods, once allowed, will only be timestamptz though. Unfortunately, I had to fully reserve the word PERIOD for this.
I'm looking for comments on everything except the pg_dump stuff, keeping in mind that inheritance is not finished either.
Thanks!
This is patch is based off of 71b349aef4.
Attachment
2018-05-26 22:56 GMT+02:00 Vik Fearing <vik.fearing@protonmail.com>:
SQL:2011 introduced the concept of a "period". It takes two existing columns and basically does the same thing as our range types except there is no new storage. I believe if Jeff Davis had given us range types a few years later than he did, it would have been using periods.Attached is a WIP patch that I have been working on. The only thing left is completing periods on inherited tables, and finishing up pg_dump. I'm posting this now just to make sure my basic foundation is sound, and to let people know that I'm working on this.The patch itself doesn't have any functionality, it just allows periods to be defined. With that, there are several things that we can do: SYSTEM_TIME periods, which are explicitly not allowed by this patch, will allow us to do SQL standard versioned tables, and also allows some time travel functionality. Application periods can be used in PRIMARY/UNIQUE keys, foreign keys, and give nice new features to UPDATE and DELETE. They also allow "period predicates" which are the same kind of operations we already have for range types. All of that is for future patches that build on the infrastructure presented in this patch.The SQL standard restricts period columns to dates or timestamps, but I'm allowing anything that has a btree operator class, as is the PostgreSQL way. System periods, once allowed, will only be timestamptz though. Unfortunately, I had to fully reserve the word PERIOD for this.I'm looking for comments on everything except the pg_dump stuff, keeping in mind that inheritance is not finished either.Thanks!
looks interesting
Regards
Pavel
This is patch is based off of 71b349aef4.
On Sat, May 26, 2018 at 1:56 PM, Vik Fearing <vik.fearing@protonmail.com> wrote: > SQL:2011 introduced the concept of a "period". It takes two existing columns > and basically does the same thing as our range types except there is no new > storage. I believe if Jeff Davis had given us range types a few years later > than he did, it would have been using periods. Hi Vik, I'm really glad to see someone working on temporal features! I've only dabbled in Postgres hacking, but I've been following temporal database research for several years, so I hope you won't mind my comments. I already shared some thoughts on this other thread: http://www.postgresql-archive.org/SQL-2011-Valid-Time-Support-td6020221.html I would love to see Postgres support the standard but *also* let people use range types. I'm not sure I agree that Jeff Davis would have preferred the SQL:2011 period idea, which is an extra-relational concept. Since it is attached to a table, it doesn't carry through cleanly to a result set, so what happens if you want to apply temporal features to a view, subquery, CTE, or set-returning function? A range on the other hand is just a type, so as long as temporal operators support that type, everything still composes nicely and just works. The Date/Darwen/Lorenztos book has more to say about that, and I think it's worth reading. They are unrealistically extreme in their purism, but here I think they have some good points---points they also raised against an earlier proposed temporal-SQL standard circa 1998. By the way here are some thoughts Jeff shared with me about that book, which he says inspired range types: https://news.ycombinator.com/item?id=14738655 I understand that your patch is just to allow defining periods, but I thought I'd share my own hopes earlier rather than later, in case you are doing more work on this. Also, it might be nice if Postgres let you also define periods from a single range column, so that people who want to use intervals can still stick closer to the standard---I dunno, just an idea. Also, this may not be very helpful, but I started an extension to support temporal foreign keys here: https://github.com/pjungwir/time_for_keys It uses intervals, not periods, but maybe you can steal some ideas. :-) I have a half-finished branch porting it from plpgsql to C, so that I could give them more catalog integration, and also I have hopes of defining temporal primary keys, although that isn't implemented yet. Anyway, I mention it because you said, "Application periods can be used in PRIMARY/UNIQUE keys, foreign keys," but feel free to ignore it. :-) In general, I would love Postgres to have some lower-level primitives like range types and the Dingös operators, and then build the SQL:2011 support on top of those. I'm happy to contribute work to help make that happen, although I'd probably need to work with someone with more Postgres hacking experience to get it done. Yours, Paul
On Tue, Jun 5, 2018 at 12:47 PM, Paul A Jungwirth <pj@illuminatedcomputing.com> wrote: > Also, this may not be very helpful, but I started an extension to > support temporal foreign keys here: > > https://github.com/pjungwir/time_for_keys > > It uses intervals, not periods, but maybe you can steal some ideas. Sorry for two emails but I wanted to add: the more stealable thing are the tests, which are pretty thorough and took a lot of hours to write. They are yours if you want them. :-) Paul
On Sat, May 26, 2018 at 1:56 PM, Vik Fearing vik.fearing@protonmail.com wrote:SQL:2011 introduced the concept of a "period". It takes two existing columnsand basically does the same thing as our range types except there is no newstorage. I believe if Jeff Davis had given us range types a few years laterthan he did, it would have been using periods.Hi Vik, I'm really glad to see someone working on temporal features!I've only dabbled in Postgres hacking, but I've been followingtemporal database research for several years, so I hope you won't mindmy comments. I already shared some thoughts on this other thread:
Hi! No, of course I don't mind your comments; I welcome them. I had not seen that thread so I'll go take a look at it.
I would love to see Postgres support the standard but also letpeople use range types. I'm not sure I agree that Jeff Davis wouldhave preferred the SQL:2011 period idea, which is an extra-relationalconcept. Since it is attached to a table, it doesn't carry throughcleanly to a result set, so what happens if you want to apply temporalfeatures to a view, subquery, CTE, or set-returning function?
As far as I can see, the standard doesn't say what should happen if you select a period, or even if that's possible. It does however define how to create a period not attached to a table (PERIOD <left paren> <period start value> <comma> <period end value> <right paren>) so it would be possible to use that for views, subqueries, and the rest of your examples.
A range on the other hand is just a type, so as long as temporal operatorssupport that type, everything still composes nicely and just works.The Date/Darwen/Lorenztos book has more to say about that, and I thinkit's worth reading. They are unrealistically extreme in their purism,but here I think they have some good points---points they also raisedagainst an earlier proposed temporal-SQL standard circa 1998. By theway here are some thoughts Jeff shared with me about that book, whichhe says inspired range types:
Thanks, I will read this, too.
I understand that your patch is just to allow defining periods, but Ithought I'd share my own hopes earlier rather than later, in case youare doing more work on this.
Yes, I plan on doing much more work on this. My goal is to implement (by myself or with help from other) the entire SQL:2016 spec on periods and system versioned tables. This current patch is just infrastructure.
Also, it might be nice if Postgres letyou also define periods from a single range column, so that people whowant to use intervals can still stick closer to the standard---Idunno, just an idea.
That's a nice idea, but I'm not sure how I'd fit it into the pg_period catalog which expects two columns.
Also, this may not be very helpful, but I started an extension tosupport temporal foreign keys here:It uses intervals, not periods, but maybe you can steal some ideas.:-) I have a half-finished branch porting it from plpgsql to C, sothat I could give them more catalog integration, and also I have hopesof defining temporal primary keys, although that isn't implementedyet. Anyway, I mention it because you said, "Application periods canbe used in PRIMARY/UNIQUE keys, foreign keys," but feel free to ignoreit. :-)
While I'm waiting for comments on how best to do inheritance and other aspects of my patch, I'm working on getting PRIMARY/UNIQUE keys with periods. That's far from finished though as it is touching parts of the code that I have never looked at before.
In general, I would love Postgres to have some lower-level primitiveslike range types and the Dingös operators, and then build theSQL:2011 support on top of those. I'm happy to contribute work to helpmake that happen, although I'd probably need to work with someone withmore Postgres hacking experience to get it done.
Any help you can give me (or that I could give you) is greatly appreciated. I'm hoping we can get *something* in v12 with periods.
Hello Vik, On 2018-Jun-05, Vik Fearing wrote: > > I understand that your patch is just to allow defining periods, but I > > thought I'd share my own hopes earlier rather than later, in case you > > are doing more work on this. > > Yes, I plan on doing much more work on this. My goal is to implement > (by myself or with help from other) the entire SQL:2016 spec on > periods and system versioned tables. This current patch is just > infrastructure. Have you had the chance to work on this? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello Vik,On 2018-Jun-05, Vik Fearing wrote:I understand that your patch is just to allow defining periods, but Ithought I'd share my own hopes earlier rather than later, in case youare doing more work on this.Yes, I plan on doing much more work on this. My goal is to implement(by myself or with help from other) the entire SQL:2016 spec onperiods and system versioned tables. This current patch is justinfrastructure.Have you had the chance to work on this?
Hi Alvaro,
I've been working on this as an extension instead. It allows me to do some stuff in plpgsql which is much easier for me.
I would love to have all this in core (especially since MariaDB 10.4 just became the first open source database to get all of this functionality), but something is better than nothing.
--
Vik Fearing
Hello Vik, Paul, On 2019-Jul-04, Vik Fearing wrote: > I've been working on this as an extension instead. It allows me to do some stuff in plpgsql which is much easier for me. > > https://github.com/xocolatl/periods/ Hmm, okay. > I would love to have all this in core (especially since MariaDB 10.4 > just became the first open source database to get all of this > functionality), but something is better than nothing. Agreed on all accounts. I think that the functionality in your patch is already integrated in Paul's patch for temporal PK/FK elsewhere ... is that correct, or is this patch orthogonal to that work? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Jul 4, 2019 at 11:44 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > I think that the functionality in your patch is already integrated in > Paul's patch for temporal PK/FK elsewhere ... is that correct, or is > this patch orthogonal to that work? Hi Vik, I wasn't aware that you had moved your work over to an extension. It looks like you've made a lot of progress! I'd be eager to work with you on getting this into core. Alvaro: it's a mix of orthogonal and not orthogonal. :-) My work lets you use range types directly in SQL:2011 constructs, whereas Vik's work lets you use SQL:2011 PERIODs. I would like to support *both*, so I've been intending to add Vik's original patch letting you define PERIODs to my own work. There is no conflict in supporting both ranges and PERIODs because you can't name a PERIOD the same as an existing column. Behind the scenes the easiest way to implement PERIODs is with range types, so there would be very little duplication to permit both. PERIODs suffer from being outside relational theory as a quasi-attribute. You especially see this in composing queries & result sets. For example with ranges you could say this: WITH x AS ( SELECT * FROM y FOR PORTION OF some_range FROM t1 TO t2 ) SELECT * FROM x FOR PORTION OF some_range FROM t3 TO t4 ; But with PERIODs you can't because a PERIOD is not included in `SELECT *`. Also it's unclear how "attached" it is to a table definition. Can you say `SELECT *, some_period FROM x`? If so can you then use `FOR PORTION OF` on that result set? Can you construct an on-the-fly PERIOD expression? Can you pass a period to a function as a parameter? Can a function return a period? Can you ORDER BY a period? GROUP BY one? Can you cast to/from a period? Can you ask a period for its high/low values? Do we treat a PERIOD as a whole new datatype? Can you define a real column of type PERIOD? I haven't found text from the standard that answers most of these questions. (The standard does say you can construct a `PERIOD(t1, t2)` expression but apparently only inside a "period predicate".) Also you can't define PERIODs on types other than date/timestamp/timestamptz, unlike ranges. Also PERIODs require a sentinel value to mean "unbounded" (e.g. 31-JAN-9999) whereas ranges let you express that with a NULL. (Postgres does have Infinity and -Infinity for timestamp types but I've noticed that client programming languages can't always express ranges with those values.) Personally I intend to use ranges any time I build temporal tables, but supporting PERIODs might have value for people more interested in database portability or someone migrating from elsewhere to Postgres. I had some conversations at PGCon that I felt validated the permit-PERIODS-or-ranges approach, so I'm about ready to expand my patch to handle PERIODs too. For that I would love to draw on Vik's work so far. I think his original patch against core is more likely to be helpful than the extension, but I'll certainly consult both, and Vik if you have any advice let me know! :-) A big difference between a temporal extension vs temporal features in core is implementing DML. An extension pretty much requires you to use INSTEAD OF triggers. Also Vik's README points out that implementing temporal DELETE is hard that way. In core I believe you'd do temporal DML in the executor node. (That's my working theory anyway; I'm still new to that part of the code.) The first thing on my TODO list is to write a blog post comparing how other RDMBSes handle PERIODs and other temporal features. Besides the questions above, how does a trigger work on a table? For example when you DELETE in the middle of a range/period, and it becomes an INSERT plus an UPDATE, I *believe* you still fire the DELETE trigger. And you need to set the NEW/OLD tuples appropriately. You *don't* fire any INSERT & UPDATE triggers. The standard isn't super explicit but that's my take on it, and I want to write down what other vendors are doing. Yours, Paul