Thread: SQL compliant interval implementation

SQL compliant interval implementation

From
"Brendan Jurd"
Date:
Hi all,

I've been looking at the postgres interval implementation lately, and
I'm interested in putting together an improved implementation that
accords more closely with the SQL specification, in particular with:

---
4.6.2 Intervals

There are two classes of intervals. One class, called year-month
intervals, has an express or implied datetime precision that includes
no fields other than YEAR and MONTH, though not both are required. The
other class, called day-time intervals, has an express or implied
interval precision that can include any fields other than YEAR or
MONTH.
---

The reason for intervals being divided into two classes is that
interval days and months (as distinct from datetime days and months)
have no well-defined relationship.

The current postgres implementation uses a conversion rate of 30 days
to the month, but this appears to be a band-aid solution to a deeper
problem; that postgres is trying to do with a single type something
which really should be done with two.

Imagine that I tried to implement a unified "length" type that
contained two fields, "metres" and "hobbits", where the number of
hobbits per metre differs from one hobbit to the next, but nonetheless
you're allowed to perform comparison and conversion between hobbits
and metres.  People would haul me out into the proverbial town square
and throw rotten tomatoes at me.  And rightly so.

I think the SQL standard has the right idea.  Keep the year-months and
the day-times separate.  Don't try to perform arithmetic or
comparisons between the two.

I note that this has been discussed on the mailing lists a few times
before, but I didn't see any conclusion or consensus reached.

So, the question is, are there any compelling reasons we shouldn't try
to implement "interval" as two types?  Does the existing unified
approach offer any significant advantages?

The only such advantage I can see is that it's easy for the user to
specify month values and day values together when performing date
arithmetic, for example if I wanted to add 1 month and 12 days to a
date, under the current approach I would simply issue:

> SELECT dateval + interval '1 month 12 days';

That's nice and convenient, however, there's no reason we can't keep
this simple under a separated approach:

> SELECT dateval + interval month '1' + interval second '12 days';

With ISO INTERVAL syntax (already listed as a TODO item) this becomes
a bit more human-readable:

> SELECT dateval + interval '1' month + interval '12' day;

By defining some convenient numeric input functions it becomes very
readable (not to mention incredibly easy to parse into the internal
format, versus textual inputs)

> SELECT dateval + months(1) + days(12);

It could be done without breaking existing applications; just
implement the two new interval types, and leave the old unified
interval in place as a deprecated type, then drop it after a few major
releases.

The day-time interval type would be pretty much the same as the
existing interval, with the month field removed.  The year-month field
would simply be a signed integer.

Thanks in advance for your comments.

Regards,
BJ


Re: SQL compliant interval implementation

From
Bruce Momjian
Date:
I don't see how this makese our system any better than it does not.  It
just seems to eliminate the 30-day problem by not allowing it.  That
doesn't seem to be a step forward.

---------------------------------------------------------------------------

Brendan Jurd wrote:
> Hi all,
> 
> I've been looking at the postgres interval implementation lately, and
> I'm interested in putting together an improved implementation that
> accords more closely with the SQL specification, in particular with:
> 
> ---
> 4.6.2 Intervals
> 
> There are two classes of intervals. One class, called year-month
> intervals, has an express or implied datetime precision that includes
> no fields other than YEAR and MONTH, though not both are required. The
> other class, called day-time intervals, has an express or implied
> interval precision that can include any fields other than YEAR or
> MONTH.
> ---
> 
> The reason for intervals being divided into two classes is that
> interval days and months (as distinct from datetime days and months)
> have no well-defined relationship.
> 
> The current postgres implementation uses a conversion rate of 30 days
> to the month, but this appears to be a band-aid solution to a deeper
> problem; that postgres is trying to do with a single type something
> which really should be done with two.
> 
> Imagine that I tried to implement a unified "length" type that
> contained two fields, "metres" and "hobbits", where the number of
> hobbits per metre differs from one hobbit to the next, but nonetheless
> you're allowed to perform comparison and conversion between hobbits
> and metres.  People would haul me out into the proverbial town square
> and throw rotten tomatoes at me.  And rightly so.
> 
> I think the SQL standard has the right idea.  Keep the year-months and
> the day-times separate.  Don't try to perform arithmetic or
> comparisons between the two.
> 
> I note that this has been discussed on the mailing lists a few times
> before, but I didn't see any conclusion or consensus reached.
> 
> So, the question is, are there any compelling reasons we shouldn't try
> to implement "interval" as two types?  Does the existing unified
> approach offer any significant advantages?
> 
> The only such advantage I can see is that it's easy for the user to
> specify month values and day values together when performing date
> arithmetic, for example if I wanted to add 1 month and 12 days to a
> date, under the current approach I would simply issue:
> 
> > SELECT dateval + interval '1 month 12 days';
> 
> That's nice and convenient, however, there's no reason we can't keep
> this simple under a separated approach:
> 
> > SELECT dateval + interval month '1' + interval second '12 days';
> 
> With ISO INTERVAL syntax (already listed as a TODO item) this becomes
> a bit more human-readable:
> 
> > SELECT dateval + interval '1' month + interval '12' day;
> 
> By defining some convenient numeric input functions it becomes very
> readable (not to mention incredibly easy to parse into the internal
> format, versus textual inputs)
> 
> > SELECT dateval + months(1) + days(12);
> 
> It could be done without breaking existing applications; just
> implement the two new interval types, and leave the old unified
> interval in place as a deprecated type, then drop it after a few major
> releases.
> 
> The day-time interval type would be pretty much the same as the
> existing interval, with the month field removed.  The year-month field
> would simply be a signed integer.
> 
> Thanks in advance for your comments.
> 
> Regards,
> BJ
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
> 

--  Bruce Momjian   http://candle.pha.pa.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: SQL compliant interval implementation

From
Tom Lane
Date:
"Brendan Jurd" <direvus@gmail.com> writes:
> I've been looking at the postgres interval implementation lately, and
> I'm interested in putting together an improved implementation that
> accords more closely with the SQL specification, in particular with:

Appealing to the SQL spec isn't going to take you very far in this
argument, because the SQL spec itself is pretty broken in this area.
In particular, the only reason that year-month vs day-second makes
any sense is that they're pretending daylight savings doesn't exist.
If you allow for DST transitions then the day/second multiplier isn't
constant any more than the month/day multiplier is.  (And then there
are leap seconds...)

I don't especially want to get into leap-second-aware timekeeping,
because that'd practically destroy the ability to do calendar arithmetic
at all.  But DST awareness seems like a requirement to me.  We just
recently fixed "interval" to do something fairly plausible in that
area, and I don't much want to backtrack.

> It could be done without breaking existing applications; just
> implement the two new interval types, and leave the old unified
> interval in place as a deprecated type, then drop it after a few major
> releases.

Uh, not if all three types want to be called "interval", and I don't
entirely see how you maintain spec compliance without that.  In any
case, actually *dropping* support for the existing interval behavior is
a way harder sell than just adding something more spec-like; you haven't
been nearly persuasive enough to sell that one.  I don't even see an
argument here for deprecating it, because if you ask me it works better
than the SQL-spec behavior.
        regards, tom lane


Re: SQL compliant interval implementation

From
Josh Berkus
Date:
Brendan,

> There are two classes of intervals. One class, called year-month
> intervals, has an express or implied datetime precision that includes
> no fields other than YEAR and MONTH, though not both are required. The
> other class, called day-time intervals, has an express or implied
> interval precision that can include any fields other than YEAR or
> MONTH.

Yeah, we used to do that.  It sucked.  In fact, most developers of 
applications which were calendar-heavy ended up using custom data types to 
work around the SQL-spec INTERVAL limitations.  And that benefits nobody.

We modified our INTERVAL type to function in ways that calendar application 
developers (of which there are several on this list) can actually use.  
See discussion leading up to the 8.0 release.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco


Re: SQL compliant interval implementation

From
"Brendan Jurd"
Date:
On 5/24/06, Josh Berkus <josh@agliodbs.com> wrote:
> Brendan,
>
> > There are two classes of intervals. One class, called year-month
> > intervals, has an express or implied datetime precision that includes
> > no fields other than YEAR and MONTH, though not both are required. The
> > other class, called day-time intervals, has an express or implied
> > interval precision that can include any fields other than YEAR or
> > MONTH.
>
> Yeah, we used to do that.  It sucked.  In fact, most developers of
> applications which were calendar-heavy ended up using custom data types to
> work around the SQL-spec INTERVAL limitations.  And that benefits nobody.
>

Could you elaborate on how it sucked?  Apart from the issue of
daylight savings which Tom has mentioned, what are these limitations
that needed to be worked around?

I've been searching through the archives for discussions relating to
intervals, but haven't come across the one you're describing.  Most
probably because there have been a LOT of discussions relating to
intervals.

Regards,
BJ


Re: SQL compliant interval implementation

From
Bruce Momjian
Date:
Brendan Jurd wrote:
> On 5/24/06, Josh Berkus <josh@agliodbs.com> wrote:
> > Brendan,
> >
> > > There are two classes of intervals. One class, called year-month
> > > intervals, has an express or implied datetime precision that includes
> > > no fields other than YEAR and MONTH, though not both are required. The
> > > other class, called day-time intervals, has an express or implied
> > > interval precision that can include any fields other than YEAR or
> > > MONTH.
> >
> > Yeah, we used to do that.  It sucked.  In fact, most developers of
> > applications which were calendar-heavy ended up using custom data types to
> > work around the SQL-spec INTERVAL limitations.  And that benefits nobody.
> >
> 
> Could you elaborate on how it sucked?  Apart from the issue of
> daylight savings which Tom has mentioned, what are these limitations
> that needed to be worked around?
> 
> I've been searching through the archives for discussions relating to
> intervals, but haven't come across the one you're describing.  Most
> probably because there have been a LOT of discussions relating to
> intervals.

Well, it seems to just eliminate the 30-day problem by disallowing it,
and creating two data types.  I don't see how that is better than what
we have now.

--  Bruce Momjian   http://candle.pha.pa.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: SQL compliant interval implementation

From
Josh Berkus
Date:
Brendan,

> Could you elaborate on how it sucked?  Apart from the issue of
> daylight savings which Tom has mentioned, what are these limitations
> that needed to be worked around?

Well, actually, the DST thing was pretty severe -- it made timestamptz 
unusable.  That's why we partitioned interval into month/year | day/week | 
hour/minute/second/etc.

I personally don't see the benefit of evaluating "1 month" = "30 days", but 
I don't see the harm either.   "days" *don't* get rolled up to months, 
which is proper partitioned behavior:

postgres=# select interval '180 days';interval
----------180 days
(1 row)

postgres=# select interval '1800 days';interval
-----------1800 days

The only issue comes when you multiply units by a decimal:

postgres=# select interval '11 months' * 0.3;       ?column?
------------------------3 mons 8 days 24:00:00

... which leads to some broken calculations:

select ( interval '11 months' * 0.3 ) / 0.3;        ?column?
--------------------------10 mons 26 days 96:00:00

but anyone who deals in "fractional months" should know that they're 
approximating.  Previously, we couldn't get decimal calculations to work 
at all.
 > I've been searching through the archives for discussions relating to
> intervals, but haven't come across the one you're describing.  Most
> probably because there have been a LOT of discussions relating to
> intervals.

If I had a link, I'd send it.  But I'd have to do the same searching you're 
doing.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco


Re: SQL compliant interval implementation

From
Michael Glaesemann
Date:
On May 24, 2006, at 7:37 , Brendan Jurd wrote:

> I've been searching through the archives for discussions relating to
> intervals, but haven't come across the one you're describing.  Most
> probably because there have been a LOT of discussions relating to
> intervals.

I don't have links to the threads, but here are some subject titles  
and approximate times that may help you find where things related to  
this have been discussed before.

November 2004
[GENERAL] Comment on timezone and interval types
October 2005
[BUGS] BUG #1993: Adding/subtracting negative time intervals
March 2006
Re: [HACKERS] [SQL] Interval subtracting


Michael Glaesemann
grzm seespotcode net