Thread: A date bug: number of day of October
Hi all, I think there is a bug in the PostgreSQL date system: Try computing (replace XXXX with some differents years):'XXXX-10-01'::datetime + '1 month'::interval Depending on the year, you can get 'XXXX-11-01' as expected or 'XXXX-10-31' which is bug. Guillaume Perréal - Stagiaire MIAG Cemagref (URH), Lyon, France Tél: (+33) 4.72.20.87.64
Guillaume Perréal writes: > Try computing (replace XXXX with some differents years): > 'XXXX-10-01'::datetime + '1 month'::interval > Depending on the year, you can get 'XXXX-11-01' as expected or 'XXXX-10-31' > which is bug. In the CET zone (where you probably live) daylight savings time ends in October, so it is true that 1999-10-01 00:00:00 + (24h * 31 days) = 1999-10-31 23:00:00 Of course a month is not defined as "24h * 30/31 days" but instead as the time between xxxx-yy-zz and xxxx-(yy+1)-zz, so the above could be condered wrong. It is especially wrong that the same thing happens if you use DATE instead of TIMESTAMP (or DATETIME, now deprecated). You get the above behaviour between 1996 and 2037. Before 1996 you can observe the same behaviour with September, because back then we switched in September! The system is pretty smart. Of course after 2037 we're doomed anyway. The bottom line is that INTERVAL in its current implementation has deficiencies and it's not SQL compliant either. -- Peter Eisentraut Sernanders väg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden
Peter Eisentraut <peter_e@gmx.net> writes: > In the CET zone (where you probably live) daylight savings time ends in > October, so it is true that > 1999-10-01 00:00:00 + (24h * 31 days) = 1999-10-31 23:00:00 > Of course a month is not defined as "24h * 30/31 days" but instead as the > time between xxxx-yy-zz and xxxx-(yy+1)-zz, so the above could be condered > wrong. It is especially wrong that the same thing happens if you use DATE > instead of TIMESTAMP (or DATETIME, now deprecated). Actually, INTERVAL does know the difference between '1 month' and '31 days', as illustrated by the fact that it gets these two cases right: regression=# select '1999-6-1'::timestamp + '1 month'::interval; ?column? ------------------------ 1999-07-01 00:00:00-04 (1 row) regression=# select '1999-6-1'::timestamp + '2 month'::interval; ?column? ------------------------ 1999-08-01 00:00:00-04 (1 row) (remember June and July have different numbers of days). The problem here is a plain old code bug: after transforming the input value to y/m/d/h/m/s/timezone, and correctly adding the N-month offset to this symbolic form, timestamp_pl_span transforms back to a GMT-based timestamp using *the same timezone offset*. Thus, for example, regression=# select '1999-10-1'::timestamp + '1 month'::interval; ?column? ------------------------ 1999-10-31 23:00:00-05 (1 row) for me (I live in EST5EDT, where Oct 1 is in daylight savings time GMT-4, but Nov 1 is standard time GMT-5). Correct behavior, IMHO, is to change to the local timezone appropriate for the converted date before transforming y/m/d/etc to timestamp. I have no idea how hard that is to do. One kluge that comes to mind is to convert the modified y/m/d/etc date to timestamp, convert that back to local time to get a timezone, and then convert the same y/m/d/h/m/s plus new timezone to timestamp. But perhaps there's a cleaner/faster way to do it. I'm not real sure that said algorithm would give plausible behavior if the result time falls within a DST transition anyway. (But what is plausible behavior in that case?) Another issue: for intervals smaller than a month, INTERVAL currently represents the value as X number of seconds. Thus, since our last DST->EST transition was early morning 1999/10/31, regression=# select '1999-10-31'::timestamp + '1 day'::interval; ?column? ------------------------ 1999-10-31 23:00:00-05 (1 row) which is fairly unintuitive --- though if I'd asked for +'24 hours' I would accept it as correct. This is not a code bug but designed behavior. ISTM that really, INTERVAL ought to have a three-part representation: months (which can serve for larger units as well), days, and sub-day units (which can all be converted to seconds). But representing days as seconds breaks at DST boundaries. regards, tom lane
Tom Lane wrote: > > Peter Eisentraut <peter_e@gmx.net> writes: > > In the CET zone (where you probably live) daylight savings time ends in > > October, so it is true that > > > 1999-10-01 00:00:00 + (24h * 31 days) = 1999-10-31 23:00:00 > > > Of course a month is not defined as "24h * 30/31 days" but instead as the > > time between xxxx-yy-zz and xxxx-(yy+1)-zz, so the above could be condered > > wrong. It is especially wrong that the same thing happens if you use DATE > > instead of TIMESTAMP (or DATETIME, now deprecated). > > Actually, INTERVAL does know the difference between '1 month' and '31 days', > as illustrated by the fact that it gets these two cases right: > > regression=# select '1999-6-1'::timestamp + '1 month'::interval; > ?column? > ------------------------ > 1999-07-01 00:00:00-04 > (1 row) > > regression=# select '1999-6-1'::timestamp + '2 month'::interval; > ?column? > ------------------------ > 1999-08-01 00:00:00-04 > (1 row) > > (remember June and July have different numbers of days). The problem > here is a plain old code bug: after transforming the input value to > y/m/d/h/m/s/timezone, and correctly adding the N-month offset to > this symbolic form, timestamp_pl_span transforms back to a GMT-based > timestamp using *the same timezone offset*. > > Thus, for example, > > regression=# select '1999-10-1'::timestamp + '1 month'::interval; > ?column? > ------------------------ > 1999-10-31 23:00:00-05 > (1 row) > > for me (I live in EST5EDT, where Oct 1 is in daylight savings time > GMT-4, but Nov 1 is standard time GMT-5). > > Correct behavior, IMHO, is to change to the local timezone appropriate > for the converted date before transforming y/m/d/etc to timestamp. > I have no idea how hard that is to do. One kluge that comes to mind is > to convert the modified y/m/d/etc date to timestamp, convert that back > to local time to get a timezone, and then convert the same y/m/d/h/m/s > plus new timezone to timestamp. But perhaps there's a cleaner/faster > way to do it. I'm not real sure that said algorithm would give > plausible behavior if the result time falls within a DST transition > anyway. (But what is plausible behavior in that case?) > > Another issue: for intervals smaller than a month, INTERVAL currently > represents the value as X number of seconds. Thus, since our last > DST->EST transition was early morning 1999/10/31, > > regression=# select '1999-10-31'::timestamp + '1 day'::interval; > ?column? > ------------------------ > 1999-10-31 23:00:00-05 > (1 row) > > which is fairly unintuitive --- though if I'd asked for +'24 hours' > I would accept it as correct. This is not a code bug but designed > behavior. ISTM that really, INTERVAL ought to have a three-part > representation: months (which can serve for larger units as well), > days, and sub-day units (which can all be converted to seconds). > But representing days as seconds breaks at DST boundaries. > > regards, tom lane Thanks you all, As I just use this code to get the number of seconds in a month (and don't care anyway of the timezone), I think I could use this: select date_part('epoch', date_trunc('month', '1999-10-01'::timestamp + '1 month 1 hour'::interval))-date_part('epoch', '1999-10-01'::timestamp); Regards, Guillaume Perréal - Stagiaire MIAG Cemagref (URH), Lyon, France Tél: (+33) 4.72.20.87.64