Thread: SQL compliant interval implementation
Hi all, I've been looking at the postgres interval implementation lately, and I'm interested in putting together an improved implementation that accords more closely with the SQL specification, in particular with: --- 4.6.2 Intervals There are two classes of intervals. One class, called year-month intervals, has an express or implied datetime precision that includes no fields other than YEAR and MONTH, though not both are required. The other class, called day-time intervals, has an express or implied interval precision that can include any fields other than YEAR or MONTH. --- The reason for intervals being divided into two classes is that interval days and months (as distinct from datetime days and months) have no well-defined relationship. The current postgres implementation uses a conversion rate of 30 days to the month, but this appears to be a band-aid solution to a deeper problem; that postgres is trying to do with a single type something which really should be done with two. Imagine that I tried to implement a unified "length" type that contained two fields, "metres" and "hobbits", where the number of hobbits per metre differs from one hobbit to the next, but nonetheless you're allowed to perform comparison and conversion between hobbits and metres. People would haul me out into the proverbial town square and throw rotten tomatoes at me. And rightly so. I think the SQL standard has the right idea. Keep the year-months and the day-times separate. Don't try to perform arithmetic or comparisons between the two. I note that this has been discussed on the mailing lists a few times before, but I didn't see any conclusion or consensus reached. So, the question is, are there any compelling reasons we shouldn't try to implement "interval" as two types? Does the existing unified approach offer any significant advantages? The only such advantage I can see is that it's easy for the user to specify month values and day values together when performing date arithmetic, for example if I wanted to add 1 month and 12 days to a date, under the current approach I would simply issue: > SELECT dateval + interval '1 month 12 days'; That's nice and convenient, however, there's no reason we can't keep this simple under a separated approach: > SELECT dateval + interval month '1' + interval second '12 days'; With ISO INTERVAL syntax (already listed as a TODO item) this becomes a bit more human-readable: > SELECT dateval + interval '1' month + interval '12' day; By defining some convenient numeric input functions it becomes very readable (not to mention incredibly easy to parse into the internal format, versus textual inputs) > SELECT dateval + months(1) + days(12); It could be done without breaking existing applications; just implement the two new interval types, and leave the old unified interval in place as a deprecated type, then drop it after a few major releases. The day-time interval type would be pretty much the same as the existing interval, with the month field removed. The year-month field would simply be a signed integer. Thanks in advance for your comments. Regards, BJ
I don't see how this makese our system any better than it does not. It just seems to eliminate the 30-day problem by not allowing it. That doesn't seem to be a step forward. --------------------------------------------------------------------------- Brendan Jurd wrote: > Hi all, > > I've been looking at the postgres interval implementation lately, and > I'm interested in putting together an improved implementation that > accords more closely with the SQL specification, in particular with: > > --- > 4.6.2 Intervals > > There are two classes of intervals. One class, called year-month > intervals, has an express or implied datetime precision that includes > no fields other than YEAR and MONTH, though not both are required. The > other class, called day-time intervals, has an express or implied > interval precision that can include any fields other than YEAR or > MONTH. > --- > > The reason for intervals being divided into two classes is that > interval days and months (as distinct from datetime days and months) > have no well-defined relationship. > > The current postgres implementation uses a conversion rate of 30 days > to the month, but this appears to be a band-aid solution to a deeper > problem; that postgres is trying to do with a single type something > which really should be done with two. > > Imagine that I tried to implement a unified "length" type that > contained two fields, "metres" and "hobbits", where the number of > hobbits per metre differs from one hobbit to the next, but nonetheless > you're allowed to perform comparison and conversion between hobbits > and metres. People would haul me out into the proverbial town square > and throw rotten tomatoes at me. And rightly so. > > I think the SQL standard has the right idea. Keep the year-months and > the day-times separate. Don't try to perform arithmetic or > comparisons between the two. > > I note that this has been discussed on the mailing lists a few times > before, but I didn't see any conclusion or consensus reached. > > So, the question is, are there any compelling reasons we shouldn't try > to implement "interval" as two types? Does the existing unified > approach offer any significant advantages? > > The only such advantage I can see is that it's easy for the user to > specify month values and day values together when performing date > arithmetic, for example if I wanted to add 1 month and 12 days to a > date, under the current approach I would simply issue: > > > SELECT dateval + interval '1 month 12 days'; > > That's nice and convenient, however, there's no reason we can't keep > this simple under a separated approach: > > > SELECT dateval + interval month '1' + interval second '12 days'; > > With ISO INTERVAL syntax (already listed as a TODO item) this becomes > a bit more human-readable: > > > SELECT dateval + interval '1' month + interval '12' day; > > By defining some convenient numeric input functions it becomes very > readable (not to mention incredibly easy to parse into the internal > format, versus textual inputs) > > > SELECT dateval + months(1) + days(12); > > It could be done without breaking existing applications; just > implement the two new interval types, and leave the old unified > interval in place as a deprecated type, then drop it after a few major > releases. > > The day-time interval type would be pretty much the same as the > existing interval, with the month field removed. The year-month field > would simply be a signed integer. > > Thanks in advance for your comments. > > Regards, > BJ > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend > -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
"Brendan Jurd" <direvus@gmail.com> writes: > I've been looking at the postgres interval implementation lately, and > I'm interested in putting together an improved implementation that > accords more closely with the SQL specification, in particular with: Appealing to the SQL spec isn't going to take you very far in this argument, because the SQL spec itself is pretty broken in this area. In particular, the only reason that year-month vs day-second makes any sense is that they're pretending daylight savings doesn't exist. If you allow for DST transitions then the day/second multiplier isn't constant any more than the month/day multiplier is. (And then there are leap seconds...) I don't especially want to get into leap-second-aware timekeeping, because that'd practically destroy the ability to do calendar arithmetic at all. But DST awareness seems like a requirement to me. We just recently fixed "interval" to do something fairly plausible in that area, and I don't much want to backtrack. > It could be done without breaking existing applications; just > implement the two new interval types, and leave the old unified > interval in place as a deprecated type, then drop it after a few major > releases. Uh, not if all three types want to be called "interval", and I don't entirely see how you maintain spec compliance without that. In any case, actually *dropping* support for the existing interval behavior is a way harder sell than just adding something more spec-like; you haven't been nearly persuasive enough to sell that one. I don't even see an argument here for deprecating it, because if you ask me it works better than the SQL-spec behavior. regards, tom lane
Brendan, > There are two classes of intervals. One class, called year-month > intervals, has an express or implied datetime precision that includes > no fields other than YEAR and MONTH, though not both are required. The > other class, called day-time intervals, has an express or implied > interval precision that can include any fields other than YEAR or > MONTH. Yeah, we used to do that. It sucked. In fact, most developers of applications which were calendar-heavy ended up using custom data types to work around the SQL-spec INTERVAL limitations. And that benefits nobody. We modified our INTERVAL type to function in ways that calendar application developers (of which there are several on this list) can actually use. See discussion leading up to the 8.0 release. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco
On 5/24/06, Josh Berkus <josh@agliodbs.com> wrote: > Brendan, > > > There are two classes of intervals. One class, called year-month > > intervals, has an express or implied datetime precision that includes > > no fields other than YEAR and MONTH, though not both are required. The > > other class, called day-time intervals, has an express or implied > > interval precision that can include any fields other than YEAR or > > MONTH. > > Yeah, we used to do that. It sucked. In fact, most developers of > applications which were calendar-heavy ended up using custom data types to > work around the SQL-spec INTERVAL limitations. And that benefits nobody. > Could you elaborate on how it sucked? Apart from the issue of daylight savings which Tom has mentioned, what are these limitations that needed to be worked around? I've been searching through the archives for discussions relating to intervals, but haven't come across the one you're describing. Most probably because there have been a LOT of discussions relating to intervals. Regards, BJ
Brendan Jurd wrote: > On 5/24/06, Josh Berkus <josh@agliodbs.com> wrote: > > Brendan, > > > > > There are two classes of intervals. One class, called year-month > > > intervals, has an express or implied datetime precision that includes > > > no fields other than YEAR and MONTH, though not both are required. The > > > other class, called day-time intervals, has an express or implied > > > interval precision that can include any fields other than YEAR or > > > MONTH. > > > > Yeah, we used to do that. It sucked. In fact, most developers of > > applications which were calendar-heavy ended up using custom data types to > > work around the SQL-spec INTERVAL limitations. And that benefits nobody. > > > > Could you elaborate on how it sucked? Apart from the issue of > daylight savings which Tom has mentioned, what are these limitations > that needed to be worked around? > > I've been searching through the archives for discussions relating to > intervals, but haven't come across the one you're describing. Most > probably because there have been a LOT of discussions relating to > intervals. Well, it seems to just eliminate the 30-day problem by disallowing it, and creating two data types. I don't see how that is better than what we have now. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Brendan, > Could you elaborate on how it sucked? Apart from the issue of > daylight savings which Tom has mentioned, what are these limitations > that needed to be worked around? Well, actually, the DST thing was pretty severe -- it made timestamptz unusable. That's why we partitioned interval into month/year | day/week | hour/minute/second/etc. I personally don't see the benefit of evaluating "1 month" = "30 days", but I don't see the harm either. "days" *don't* get rolled up to months, which is proper partitioned behavior: postgres=# select interval '180 days';interval ----------180 days (1 row) postgres=# select interval '1800 days';interval -----------1800 days The only issue comes when you multiply units by a decimal: postgres=# select interval '11 months' * 0.3; ?column? ------------------------3 mons 8 days 24:00:00 ... which leads to some broken calculations: select ( interval '11 months' * 0.3 ) / 0.3; ?column? --------------------------10 mons 26 days 96:00:00 but anyone who deals in "fractional months" should know that they're approximating. Previously, we couldn't get decimal calculations to work at all. > I've been searching through the archives for discussions relating to > intervals, but haven't come across the one you're describing. Most > probably because there have been a LOT of discussions relating to > intervals. If I had a link, I'd send it. But I'd have to do the same searching you're doing. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco
On May 24, 2006, at 7:37 , Brendan Jurd wrote: > I've been searching through the archives for discussions relating to > intervals, but haven't come across the one you're describing. Most > probably because there have been a LOT of discussions relating to > intervals. I don't have links to the threads, but here are some subject titles and approximate times that may help you find where things related to this have been discussed before. November 2004 [GENERAL] Comment on timezone and interval types October 2005 [BUGS] BUG #1993: Adding/subtracting negative time intervals March 2006 Re: [HACKERS] [SQL] Interval subtracting Michael Glaesemann grzm seespotcode net