Re: ISO 8601 "Time Intervals" of the "format with time-unit - Mailing list pgsql-patches
From | Peter Eisentraut |
---|---|
Subject | Re: ISO 8601 "Time Intervals" of the "format with time-unit |
Date | |
Msg-id | Pine.LNX.4.44.0312010719450.22012-100000@peter.localdomain Whole thread Raw |
In response to | Re: ISO 8601 "Time Intervals" of the "format with time-unit (Bruce Momjian <pgman@candle.pha.pa.us>) |
Responses |
Re: ISO 8601 "Time Intervals" of the "format with time-unit
Re: ISO 8601 "Time Intervals" of the "format with time-unit |
List | pgsql-patches |
Bruce Momjian writes: > Is this ready for application? It looks good to me. However, there is > an "Open issues" section. It would be more useful to implement the SQL standard for intervals first instead of inventing more nonstandard formats for it. > > --------------------------------------------------------------------------- > > Ron Mayer wrote: > > Short summary: > > > > This patch allows ISO 8601 "time intervals" using the "format > > with time-unit designators" to specify postgresql "intervals". > > > > Below I have (A) What these time intervals are, (B) What I > > modified to support them, (C) Issues with intervals I want > > to bring up, and (D) a patch supporting them. > > > > It's helpful to me. Any feedback is appreciated. If you > > did want to consider including it, let me know what to clean > > up. If not, I thought I'd just put it here if anyone else finds > > it useful too. > > > > Thanks for your time, > > > > Ron Mayer > > > > Longer: > > > > (A) What these intervals are. > > > > ISO 8601, the standard from which PostgreSQL gets some of it's > > time syntax, also has a specification for "time-intervals". > > > > In particular, section 5.5.4.2 has a "Representation of > > time-interval by duration only" which I believe maps > > nicely to ISO intervals. > > > > Compared to the ISO 8601 time interval specification, the > > postgresql interval syntax is quite verbose. For example: > > > > Postgresql interval: ISO8601 Interval > > --------------------------------------------------- > > '1 year 6 months' 'P1Y6M' > > '3 hours 25 minutes 42 seconds' 'PT3H25M42S' > > > > Yeah, it's uglier, but it sure is short which can make > > for quicker typing and shorter scripts, and if for some > > strange reason you had an application using this format > > it's nice not to have to translate. > > > > The syntax is as follows: > > Basic extended format: PnYnMnDTnHnMnS > > PnW > > > > Where everything before the "T" is a date-part and everything > > after is a time-part. W is for weeks. > > In the date-part, Y=Year, M=Month, D=Day > > In the time-part, H=Hour, M=Minute, S=Second > > > > Much more info can be found from the draft standard > > ftp://ftp.qsl.net/pub/g1smd/154N362_.PDF > > The final standard's only available for $$$ so I didn't > > look at it. Some other sites imply that this part didn't > > change from the last draft to the standard. > > > > > > (B) This change was made by adding two functions to "datetime.c" > > next to where DecodeInterval parses the normal interval syntax. > > > > A total of 313 lines were added, including comments and sgml docs. > > Of these only 136 are actual code, the rest, comments, whitespace, etc. > > > > > > One new function "DecodeISO8601Interval" follows the style of > > "DecodeInterval" below it, and trys to strictly follow the ISO > > syntax. If it doesn't match, it'll return -1 and the old syntax > > will be checked as before. > > > > The first test (first character of the first field must be 'P', > > and second character must be 'T' or '\0') should be fast so I don't > > think this will impact performance of existing code. > > > > > > The second function ("adjust_fval") is just a small helper-function > > to remove some of the cut&paste style that DecodeInterval used. > > > > It seems to work. > > ======================================================================= > > betadb=# select 'P1M15DT12H30M7S'::interval; > > interval > > ------------------------ > > 1 mon 15 days 12:30:07 > > (1 row) > > > > betadb=# select '1 month 15 days 12 hours 30 minutes 7 seconds'::interval; > > interval > > ------------------------ > > 1 mon 15 days 12:30:07 > > (1 row) > > ===================================================================== > > > > > > > > (C) Open issues with intervals, and questions I'd like to ask. > > > > 1. DecodeInterval seems to have a hardcoded '.' for specifying > > fractional times. ISO 8601 states that both '.' and ',' are > > ok, but "of these, the comma is the preferred sign". > > > > In DecodeISO8601Interval I loosened the test to allow > > both but left it as it was in DecodeInterval. Should > > both be changed to make them more consistant? > > > > 2. In "DecodeInterval", fractional weeks and fractional months > > can produce seconds; but fractional years can not (rounded > > to months). I didn't understand the reasoning for this, so > > I left it the same, and followed the same convention for > > ISO intervals. Should I change this? > > > > 3. I could save a bunch of copy-paste-lines-of-code from the > > pre-existing DecodeInterval by calling the adjust_fval helper > > function. The tradeoff is a few extra function-calls when > > decoding an interval. However I didn't want to risk changes > > to the existing part unless you guys encourage me to do so. > > > > > > (D) The patch. > > > > > > Index: doc/src/sgml/datatype.sgml > > =================================================================== > > RCS file: /projects/cvsroot/pgsql-server/doc/src/sgml/datatype.sgml,v > > retrieving revision 1.123 > > diff -u -1 -0 -r1.123 datatype.sgml > > --- doc/src/sgml/datatype.sgml 31 Aug 2003 17:32:18 -0000 1.123 > > +++ doc/src/sgml/datatype.sgml 8 Sep 2003 04:04:58 -0000 > > @@ -1735,20 +1735,71 @@ > > Quantities of days, hours, minutes, and seconds can be specified without > > explicit unit markings. For example, <literal>'1 12:59:10'</> is read > > the same as <literal>'1 day 12 hours 59 min 10 sec'</>. > > </para> > > > > <para> > > The optional precision > > <replaceable>p</replaceable> should be between 0 and 6, and > > defaults to the precision of the input literal. > > </para> > > + > > + > > + <para> > > + Alternatively, <type>interval</type> values can be written as > > + ISO 8601 time intervals, using the "Format with time-unit designators". > > + This format always starts with the character <literal>'P'</>, followed > > + by a string of values followed by single character time-unit designators. > > + A <literal>'T'</> separates the date and time parts of the interval. > > + </para> > > + > > + <para> > > + Format: PnYnMnDTnHnMnS > > + </para> > > + <para> > > + In this format, <literal>'n'</> gets replaced by a number, and > > + <literal>Y</> represents years, > > + <literal>M</> (in the date part) months, > > + <literal>D</> months, > > + <literal>H</> hours, > > + <literal>M</> (in the time part) minutes, > > + and <literal>S</> seconds. > > + </para> > > + > > + > > + <table id="interval-example-table"> > > + <title>Interval Example</title> > > + <tgroup cols="2"> > > + <thead> > > + <row> > > + <entry>Traditional</entry> > > + <entry>ISO-8601 time-interval</entry> > > + </row> > > + </thead> > > + <tbody> > > + <row> > > + <entry>1 month</entry> > > + <entry>P1M</entry> > > + </row> > > + <row> > > + <entry>1 hour 30 minutes</entry> > > + <entry>PT1H30M</entry> > > + </row> > > + <row> > > + <entry>2 years 10 months 15 days 10 hours 30 minutes 20 seconds</entry> > > + <entry>P2Y10M15DT10H30M20S</entry> > > + </row> > > + </tbody> > > + </thead> > > + </table> > > + > > + </para> > > </sect3> > > > > <sect3> > > <title>Special Values</title> > > > > <indexterm> > > <primary>time</primary> > > <secondary>constants</secondary> > > </indexterm> > > > > Index: src/backend/utils/adt/datetime.c > > =================================================================== > > RCS file: /projects/cvsroot/pgsql-server/src/backend/utils/adt/datetime.c,v > > retrieving revision 1.116 > > diff -u -1 -0 -r1.116 datetime.c > > --- src/backend/utils/adt/datetime.c 27 Aug 2003 23:29:28 -0000 1.116 > > +++ src/backend/utils/adt/datetime.c 8 Sep 2003 04:04:59 -0000 > > @@ -30,20 +30,21 @@ > > struct tm * tm, fsec_t *fsec, int *is2digits); > > static int DecodeNumberField(int len, char *str, > > int fmask, int *tmask, > > struct tm * tm, fsec_t *fsec, int *is2digits); > > static int DecodeTime(char *str, int fmask, int *tmask, > > struct tm * tm, fsec_t *fsec); > > static int DecodeTimezone(char *str, int *tzp); > > static datetkn *datebsearch(char *key, datetkn *base, unsigned int nel); > > static int DecodeDate(char *str, int fmask, int *tmask, struct tm * tm); > > static void TrimTrailingZeros(char *str); > > +static int DecodeISO8601Interval(char **field, int *ftype, int nf, int *dtype, struct tm * tm, fsec_t *fsec); > > > > > > int day_tab[2][13] = { > > {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31, 0}, > > {31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31, 0}}; > > > > char *months[] = {"Jan", "Feb", "Mar", "Apr", "May", "Jun", > > "Jul", "Aug", "Sep", "Oct", "Nov", "Dec", NULL}; > > > > char *days[] = {"Sunday", "Monday", "Tuesday", "Wednesday", > > @@ -2872,30 +2873,271 @@ > > default: > > *val = tp->value; > > break; > > } > > } > > > > return type; > > } > > > > > > +void adjust_fval(double fval,struct tm * tm, fsec_t *fsec, int scale); > > +{ > > + int sec; > > + fval *= scale; > > + sec = fval; > > + tm->tm_sec += sec; > > +#ifdef HAVE_INT64_TIMESTAMP > > + *fsec += ((fval - sec) * 1000000); > > +#else > > + *fsec += (fval - sec); > > +#endif > > +} > > + > > + > > +/* DecodeISO8601Interval() > > + * > > + * Check if it's a ISO 8601 Section 5.5.4.2 "Representation of > > + * time-interval by duration only." > > + * Basic extended format: PnYnMnDTnHnMnS > > + * PnW > > + * For more info. > > + * http://www.astroclark.freeserve.co.uk/iso8601/index.html > > + * ftp://ftp.qsl.net/pub/g1smd/154N362_.PDF > > + * > > + * Examples: P1D for 1 day > > + * PT1H for 1 hour > > + * P2Y6M7DT1H30M for 2 years, 6 months, 7 days 1 hour 30 min > > + * > > + * The first field is exactly "p" or "pt" it may be of this type. > > + * > > + * Returns -1 if the field is not of this type. > > + * > > + * It pretty strictly checks the spec, with the two exceptions > > + * that a week field ('W') may coexist with other units, and that > > + * this function allows decimals in fields other than the least > > + * significant units. > > + */ > > +int > > +DecodeISO8601Interval(char **field, int *ftype, int nf, int *dtype, struct tm * tm, fsec_t *fsec) > > +{ > > + char *cp; > > + int fmask = 0, > > + tmask; > > + int val; > > + double fval; > > + int arg; > > + int datepart; > > + > > + /* > > + * An ISO 8601 "time-interval by duration only" must start > > + * with a 'P'. If it contains a date-part, 'p' will be the > > + * only character in the field. If it contains no date part > > + * it will contain exactly to characters 'PT' indicating a > > + * time part. > > + * Anything else is illegal and will be treated like a > > + * traditional postgresql interval. > > + */ > > + if (!(field[0][0] == 'p' && > > + ((field[0][1] == 0) || (field[0][1] == 't' && field[0][2] == 0)))) > > + { > > + return -1; > > + } > > + > > + > > + /* > > + * If the first field is exactly 1 character ('P'), it starts > > + * with date elements. Otherwise it's two characters ('PT'); > > + * indicating it starts with a time part. > > + */ > > + datepart = (field[0][1] == 0); > > + > > + /* > > + * Every value must have a unit, so we require an even > > + * number of value/unit pairs. Therefore we require an > > + * odd nubmer of fields, including the prefix 'P'. > > + */ > > + if ((nf & 1) == 0) > > + return -1; > > + > > + /* > > + * Process pairs of fields at a time. > > + */ > > + for (arg = 1 ; arg < nf ; arg+=2) > > + { > > + char * value = field[arg ]; > > + char * units = field[arg+1]; > > + > > + /* > > + * The value part must be a number. > > + */ > > + if (ftype[arg] != DTK_NUMBER) > > + return -1; > > + > > + /* > > + * extract the number, almost exactly like the non-ISO interval. > > + */ > > + val = strtol(value, &cp, 10); > > + > > + /* > > + * One difference from the normal postgresql interval below... > > + * ISO 8601 states that "Of these, the comma is the preferred > > + * sign" so I allow it here for locales that support it. > > + * Note: Perhaps the old-style interval code below should > > + * allow for this too, but I didn't want to risk backward > > + * compatability. > > + */ > > + if (*cp == '.' || *cp == ',') > > + { > > + fval = strtod(cp, &cp); > > + if (*cp != '\0') > > + return -1; > > + > > + if (val < 0) > > + fval = -(fval); > > + } > > + else if (*cp == '\0') > > + fval = 0; > > + else > > + return -1; > > + > > + > > + if (datepart) > > + { > > + /* > > + * All the 8601 unit specifiers are 1 character, but may > > + * be followed by a 'T' character if transitioning between > > + * the date part and the time part. If it's not either > > + * one character or two characters with the second being 't' > > + * it's an error. > > + */ > > + if (!(units[1] == 0 || (units[1] == 't' && units[2] == 0))) > > + return -1; > > + > > + if (units[1] == 't') > > + datepart = 0; > > + > > + switch (units[0]) /* Y M D W */ > > + { > > + case 'd': > > + tm->tm_mday += val; > > + if (fval != 0) > > + adjust_fval(fval,tm,fsec, 86400); > > + tmask = ((fmask & DTK_M(DAY)) ? 0 : DTK_M(DAY)); > > + break; > > + > > + case 'w': > > + tm->tm_mday += val * 7; > > + if (fval != 0) > > + adjust_fval(fval,tm,fsec,7 * 86400); > > + tmask = ((fmask & DTK_M(DAY)) ? 0 : DTK_M(DAY)); > > + break; > > + > > + case 'm': > > + tm->tm_mon += val; > > + if (fval != 0) > > + adjust_fval(fval,tm,fsec,30 * 86400); > > + tmask = DTK_M(MONTH); > > + break; > > + > > + case 'y': > > + /* > > + * Why can fractional months produce seconds, > > + * but fractional years can't? Well the older > > + * interval code below has the same property > > + * so this one follows the other one too. > > + */ > > + tm->tm_year += val; > > + if (fval != 0) > > + tm->tm_mon += (fval * 12); > > + tmask = ((fmask & DTK_M(YEAR)) ? 0 : DTK_M(YEAR)); > > + break; > > + > > + default: > > + return -1; /* invald date unit prefix */ > > + } > > + } > > + else > > + { > > + /* > > + * ISO 8601 time part. > > + * In the time part, only one-character > > + * unit prefixes are allowed. If it's more > > + * than one character, it's not a valid ISO 8601 > > + * time interval by duration. > > + */ > > + if (units[1] != 0) > > + return -1; > > + > > + switch (units[0]) /* H M S */ > > + { > > + case 's': > > + tm->tm_sec += val; > > +#ifdef HAVE_INT64_TIMESTAMP > > + *fsec += (fval * 1000000); > > +#else > > + *fsec += fval; > > +#endif > > + tmask = DTK_M(SECOND); > > + break; > > + > > + case 'm': > > + tm->tm_min += val; > > + if (fval != 0) > > + adjust_fval(fval,tm,fsec,60); > > + tmask = DTK_M(MINUTE); > > + break; > > + > > + case 'h': > > + tm->tm_hour += val; > > + if (fval != 0) > > + adjust_fval(fval,tm,fsec,3600); > > + tmask = DTK_M(HOUR); > > + break; > > + > > + default: > > + return -1; /* invald time unit prefix */ > > + } > > + } > > + fmask |= tmask; > > + } > > + > > + if (*fsec != 0) > > + { > > + int sec; > > + > > +#ifdef HAVE_INT64_TIMESTAMP > > + sec = (*fsec / INT64CONST(1000000)); > > + *fsec -= (sec * INT64CONST(1000000)); > > +#else > > + TMODULO(*fsec, sec, 1e0); > > +#endif > > + tm->tm_sec += sec; > > + } > > + return (fmask != 0) ? 0 : -1; > > +} > > + > > + > > /* DecodeInterval() > > * Interpret previously parsed fields for general time interval. > > * Returns 0 if successful, DTERR code if bogus input detected. > > * > > * Allow "date" field DTK_DATE since this could be just > > * an unsigned floating point number. - thomas 1997-11-16 > > * > > * Allow ISO-style time span, with implicit units on number of days > > * preceding an hh:mm:ss field. - thomas 1998-04-30 > > + * > > + * Allow ISO-8601 style "Representation of time-interval by duration only" > > + * of the format 'PnYnMnDTnHnMnS' and 'PnW' - ron 2003-08-30 > > */ > > + > > int > > DecodeInterval(char **field, int *ftype, int nf, int *dtype, struct tm * tm, fsec_t *fsec) > > { > > int is_before = FALSE; > > char *cp; > > int fmask = 0, > > tmask, > > type; > > int i; > > int dterr; > > @@ -2906,20 +3148,37 @@ > > > > type = IGNORE_DTF; > > tm->tm_year = 0; > > tm->tm_mon = 0; > > tm->tm_mday = 0; > > tm->tm_hour = 0; > > tm->tm_min = 0; > > tm->tm_sec = 0; > > *fsec = 0; > > > > + /* > > + * Check if it's a ISO 8601 Section 5.5.4.2 "Representation of > > + * time-interval by duration only." > > + * Basic extended format: PnYnMnDTnHnMnS > > + * PnW > > + * http://www.astroclark.freeserve.co.uk/iso8601/index.html > > + * ftp://ftp.qsl.net/pub/g1smd/154N362_.PDF > > + * Examples: P1D for 1 day > > + * PT1H for 1 hour > > + * P2Y6M7DT1H30M for 2 years, 6 months, 7 days 1 hour 30 min > > + * > > + * The first field is exactly "p" or "pt" it may be of this type. > > + */ > > + if (DecodeISO8601Interval(field,ftype,nf,dtype,tm,fsec) == 0) { > > + return 0; > > + } > > + > > /* read through list backwards to pick up units before values */ > > for (i = nf - 1; i >= 0; i--) > > { > > switch (ftype[i]) > > { > > case DTK_TIME: > > dterr = DecodeTime(field[i], fmask, &tmask, tm, fsec); > > if (dterr) > > return dterr; > > type = DTK_DAY; > > @@ -2983,20 +3242,21 @@ > > } > > /* DROP THROUGH */ > > > > case DTK_DATE: > > case DTK_NUMBER: > > val = strtol(field[i], &cp, 10); > > > > if (type == IGNORE_DTF) > > type = DTK_SECOND; > > > > + /* should this allow ',' for locales that use it ? */ > > if (*cp == '.') > > { > > fval = strtod(cp, &cp); > > if (*cp != '\0') > > return DTERR_BAD_FORMAT; > > > > if (val < 0) > > fval = -(fval); > > } > > else if (*cp == '\0') > > > > =================================================================== > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 6: Have you searched our list archives? > > > > http://archives.postgresql.org > > > > -- Peter Eisentraut peter_e@gmx.net
pgsql-patches by date: