Re: A date bug: number of day of October - Mailing list pgsql-bugs

From Tom Lane
Subject Re: A date bug: number of day of October
Date
Msg-id 5706.955766269@sss.pgh.pa.us
Whole thread Raw
In response to Re: A date bug: number of day of October  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-bugs
Peter Eisentraut <peter_e@gmx.net> writes:
> In the CET zone (where you probably live) daylight savings time ends in
> October, so it is true that

>     1999-10-01 00:00:00 + (24h * 31 days) = 1999-10-31 23:00:00

> Of course a month is not defined as "24h * 30/31 days" but instead as the
> time between xxxx-yy-zz and xxxx-(yy+1)-zz, so the above could be condered
> wrong. It is especially wrong that the same thing happens if you use DATE
> instead of TIMESTAMP (or DATETIME, now deprecated).

Actually, INTERVAL does know the difference between '1 month' and '31 days',
as illustrated by the fact that it gets these two cases right:

regression=# select '1999-6-1'::timestamp  + '1 month'::interval;
        ?column?
------------------------
 1999-07-01 00:00:00-04
(1 row)

regression=# select '1999-6-1'::timestamp  + '2 month'::interval;
        ?column?
------------------------
 1999-08-01 00:00:00-04
(1 row)

(remember June and July have different numbers of days).  The problem
here is a plain old code bug: after transforming the input value to
y/m/d/h/m/s/timezone, and correctly adding the N-month offset to
this symbolic form, timestamp_pl_span transforms back to a GMT-based
timestamp using *the same timezone offset*.

Thus, for example,

regression=# select '1999-10-1'::timestamp  + '1 month'::interval;
        ?column?
------------------------
 1999-10-31 23:00:00-05
(1 row)

for me (I live in EST5EDT, where Oct 1 is in daylight savings time
GMT-4, but Nov 1 is standard time GMT-5).

Correct behavior, IMHO, is to change to the local timezone appropriate
for the converted date before transforming y/m/d/etc to timestamp.
I have no idea how hard that is to do.  One kluge that comes to mind is
to convert the modified y/m/d/etc date to timestamp, convert that back
to local time to get a timezone, and then convert the same y/m/d/h/m/s
plus new timezone to timestamp.  But perhaps there's a cleaner/faster
way to do it.  I'm not real sure that said algorithm would give
plausible behavior if the result time falls within a DST transition
anyway.  (But what is plausible behavior in that case?)

Another issue: for intervals smaller than a month, INTERVAL currently
represents the value as X number of seconds.  Thus, since our last
DST->EST transition was early morning 1999/10/31,

regression=# select '1999-10-31'::timestamp + '1 day'::interval;
        ?column?
------------------------
 1999-10-31 23:00:00-05
(1 row)

which is fairly unintuitive --- though if I'd asked for +'24 hours'
I would accept it as correct.  This is not a code bug but designed
behavior.  ISTM that really, INTERVAL ought to have a three-part
representation: months (which can serve for larger units as well),
days, and sub-day units (which can all be converted to seconds).
But representing days as seconds breaks at DST boundaries.

            regards, tom lane

pgsql-bugs by date:

Previous
From: Ryan Kirkpatrick
Date:
Subject: Re: Minimal patches for PostgreSQL 7.0b3 on NetBSD/alpha 1.4.1....
Next
From: Guillaume Perréal
Date:
Subject: Re: A date bug: number of day of October