Interval->day proposal - Mailing list pgsql-hackers

From Michael Glaesemann
Subject Interval->day proposal
Date
Msg-id 3296A62B-C72D-454B-8222-DCA30E598A12@myrealbox.com
Whole thread Raw
Responses Re: Interval->day proposal  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Interval->day proposal  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
There has been discussion in the past of including number of days as  
part of the Interval structure in addition to the current months and  
time components. Here are some mailing list threads where the topic  
has arisen.

[Re: [GENERAL] '1 year' = '360 days' ????](http:// 
archives.postgresql.org/pgsql-general/2004-10/msg01104.php)
[Re: [HACKERS] timestamp with time zone a la sql99](http:// 
archives.postgresql.org/pgsql-hackers/2004-10/msg00843.php)

One advantage of this is that it would allow '1 day' to have a  
different meaning that '24 hours', which would be meaningful when  
crossing daylight saving time changes. For example, PostgreSQL  
returns the following results:

test=# set time zone 'CST7CDT';
SET
test=# select '2005-04-01 12:00-07'::timestamptz + '1 day'::interval;        ?column?
------------------------
2005-04-02 12:00:00-07
(1 row)

test=# select '2005-04-02 12:00-07'::timestamptz + '1 day'::interval;        ?column?
------------------------
2005-04-03 13:00:00-06
(1 row)

A daylight saving time change occurred at 2005-04-03 02:00.  
Internally, the '1 day' interval is converted to 24 hours, which is  
then added to '2005-04-02 12:00-07', returning '2005-04-03 13:00-06'.  
Distinguishing between '1 day' and '24 hours' would allow '2005-04-02  
12:00-07'::timestamptz + '1 day'::interval to return '2005-04-03  
12:00-06', while '2005-04-02 12:00-07'::timestamptz + '1  
day'::interval would return '2005-04-03 13:00-06' as it does now.

I'm interested in implementing at least part of this. In my mind  
there are two steps: the first is to modify the Interval data type  
implementation to include days as well as months and time; the second  
is to modify the functions used to add interval to timestamptz to  
take into account the difference between '1 day' and '24 hours' when  
working across daylight saving time changes. I'm thinking of it in  
two steps primarily because this is my first venture into the backend  
code and I'm breaking it into pieces that I hopefully deal with.  
Arguably, implementing only the first step isn't useful in and of  
itself, but I have to start somewhere, and that seems like the  
logical place to start.

I've been looking at the current interval implementation, and am  
beginning to form a plan for my first cut at the code. Before  
starting out, I'd appreciate any feedback on my understanding of the  
code or my plan.

In interval_in, (backend/utils/adt/timestamp.c) strings representing  
intervals are parsed using DecodeInterval() which is defined in  
backend/utils/adt/datetime.c. DecodeInterval() assigns values to the  
appropriate components of a pg_tm structure, which includes tm_year,  
tm_mon, tm_mday, tm_hour, tm_min, and tm_sec. This pg_tm data is  
passed to tm2interval, where the tm_year and tm_mon components are  
used to determine the interval->month value, and the tm_mday,  
tm_hour, tm_min, and tm_sec values are used to determine the interval- >time value.

When the string is read by DecodeInterval, the "days" component is  
assigned to tm_mday. It seems relatively straightforward to use this  
input to provide the interval->day value. However, I'm wondering what  
range of days this the interval->day component can be expected to  
handle. tm_mday is an int value, which is only guaranteed to be 2  
bytes (though it may be larger), if I understand correctly. This  
means that tm_mday isn't able to hold the full span of days, which is  
nearly 2.15 billion over the range of valid timestamps (from BC 4713  
to AD 5,874,897).

However, this range of days may not be necessary in practice. The  
(minimum) 2 bytes guaranteed by int is enough to hold +/- 44 years of  
days. Is it likely that DST issues are going to be important over  
anything larger? I'm not sure, and I welcome others' thoughts on  
this. If a wider range of days is deemed required, I'm not sure how  
this would be accomplished outside of changing the pg_tm datatype.

If interval->day can be stored in an int16, then the interval struct  
would be widened from 12 to 14 bytes. Are there concerns about  
widening the interval datatype? This is perhaps a naive approach, so  
if anyone has other ideas, I'd love to hear them.

Is my understanding correct? Any corrections or pointers appreciated.

Michael Glaesemann
grzm myrealbox com



pgsql-hackers by date:

Previous
From: Andrew - Supernews
Date:
Subject: Re: locks in CREATE TRIGGER, ADD FK
Next
From: Bruce Momjian
Date:
Subject: Re: Escape handling in COPY, strings, psql