Re: [HACKERS] postgres and year 2000 - Mailing list pgsql-hackers

From Thomas G. Lockhart
Subject Re: [HACKERS] postgres and year 2000
Date
Msg-id 3697A21A.EAF7627A@alumni.caltech.edu
Whole thread Raw
In response to postgres and year 2000  (Massimo Dal Zotto <dz@cs.unitn.it>)
List pgsql-hackers
> it seems that the year handling in pgsql dates is not very consistent:
> the same number is converted to a different year
> depending on the number of digits and the number itself.
> I think that this kind of things are the most likely sources of Y2K
> troubles. A more consistent approach would be to treat the year
> literally and let any smart hack with dates entirely to the user under
> his responsability.

OK, here is a patch which tightens up the date/time interpretation of
random input :)

For numeric fields, anything greater than two digits in length is
required to be either a "concatenated date/time" or an explicit date.
Also, now *only* two digit years are converted to a year in a nearby
century.

> Only then we could declare pgsql as full Y2K compliant.

I agree that the interpretation of 3 and 5 digit years was not right for
years with leading zeros. This patch should fix that.

"Y2K compliant" means to me that the behavior of well-formed dates must
be predictable and understandable. Dates *intended* to be in the 19th or
20th century should end up being interpreted that way.

The Postgres date/time types allow year ranges from (roughly) Julian Day
zero, which is Nov 23, 4714 BC (I wrote it as -4713 in my notes, so I
may be off by a year), to far into the future. So if you enter a three
digit year, that's what you are going to get (or will, with this patch).
I don't see that as causing Y2K trouble; you could just as easily call
it Y1K trouble :)

Anyway, there is a tradeoff between flexibility in date specification
and catching unintended typos. Dan suggested that all dates be required
to have 4 digit years, which may be overly harsh for some users and
developers. Should we have a way to specify the range of valid dates and
times so databases installed for specific applications can have more
restrictive constraints on input? I already parameterized the year range
check with a #define. Perhaps we should have it be able to be overridden
by ./configure or by a Makefile.custom??

A patch is enclosed which fixes all of the problems I can see in your
examples. It basically does no century adjustments unless the year is
exactly two digits, and interprets anything greater than two digits as
an explicit year (or a concatenated date if 8 or 6 or 5 digits). It adds
"two digit year" adjustments to concatentated dates, which I had
apparently omitted. It also does more rigorous checking on the usage of
"BC", so that if you specify a negative or zero year along with BC it
will complain (before, it just flipped the sign on the year).

Please install and let me know what you think:
  cd src/backend/utils/adt
  patch < dt.c.patch
  cd ../../..
  make install

I will apply this to the development cvs tree sometime soon, unless
folks find problems or need something different.

Massimo, would you be interested in adding some of your test cases to
the datetime or horology regression test? Send me patches and I'll add
them in...

                      - Tom*** dt.c.orig    Mon Jan  4 17:43:25 1999
--- dt.c    Sat Jan  9 18:24:29 1999
***************
*** 2839,2850 ****
              case DTK_NUMBER:
                  flen = strlen(field[i]);

!                 if (flen > 4)
                  {
                      if (DecodeNumberField(flen, field[i], fmask, &tmask, tm, fsec) != 0)
                          return -1;

                  }
                  else
                  {
                      if (DecodeNumber(flen, field[i], fmask, &tmask, tm, fsec) != 0)
--- 2839,2853 ----
              case DTK_NUMBER:
                  flen = strlen(field[i]);

!                 /* long numeric string and either no date or no time read yet?
!                  * then interpret as a concatenated date or time... */
!                 if ((flen > 4) && !((fmask & DTK_DATE_M) && (fmask & DTK_TIME_M)))
                  {
                      if (DecodeNumberField(flen, field[i], fmask, &tmask, tm, fsec) != 0)
                          return -1;

                  }
+                 /* otherwise it is a single date/time field... */
                  else
                  {
                      if (DecodeNumber(flen, field[i], fmask, &tmask, tm, fsec) != 0)
***************
*** 3000,3006 ****

      /* there is no year zero in AD/BC notation; i.e. "1 BC" == year 0 */
      if (bc)
!         tm->tm_year = -(tm->tm_year - 1);

      if ((mer != HR24) && (tm->tm_hour > 12))
          return -1;
--- 3003,3014 ----

      /* there is no year zero in AD/BC notation; i.e. "1 BC" == year 0 */
      if (bc)
!     {
!         if (tm->tm_year > 0)
!             tm->tm_year = -(tm->tm_year - 1);
!         else
!             elog(ERROR,"Inconsistant use of year %04d and 'BC'", tm->tm_year);
!     }

      if ((mer != HR24) && (tm->tm_hour > 12))
          return -1;
***************
*** 3375,3382 ****
      printf("DecodeNumber- %s is %d fmask=%08x tmask=%08x\n", str, val, fmask, *tmask);
  #endif

!     /* enough digits to be unequivocal year? */
!     if (flen == 4)
      {
  #ifdef DATEDEBUG
          printf("DecodeNumber- match %d (%s) as year\n", val, str);
--- 3383,3405 ----
      printf("DecodeNumber- %s is %d fmask=%08x tmask=%08x\n", str, val, fmask, *tmask);
  #endif

!     /* Special case day of year? */
!     if ((flen == 3) && (fmask & DTK_M(YEAR))
!         && ((val >= 1) && (val <= 366)))
!     {
!         *tmask = (DTK_M(DOY) | DTK_M(MONTH) | DTK_M(DAY));
!         tm->tm_yday = val;
!         j2date((date2j(tm->tm_year, 1, 1) + tm->tm_yday - 1),
!                &tm->tm_year, &tm->tm_mon, &tm->tm_mday);
!
!     }
!     /* Enough digits to be unequivocal year?
!      * Used to test for 4 digits or more,
!      * but we now test first for a three-digit doy
!      * so anything bigger than two digits had better be
!      * an explicit year. - thomas 1999-01-09
!      */
!     else if (flen > 2)
      {
  #ifdef DATEDEBUG
          printf("DecodeNumber- match %d (%s) as year\n", val, str);
***************
*** 3399,3416 ****

          tm->tm_year = val;

-         /* special case day of year? */
-     }
-     else if ((flen == 3) && (fmask & DTK_M(YEAR))
-              && ((val >= 1) && (val <= 366)))
-     {
-         *tmask = (DTK_M(DOY) | DTK_M(MONTH) | DTK_M(DAY));
-         tm->tm_yday = val;
-         j2date((date2j(tm->tm_year, 1, 1) + tm->tm_yday - 1),
-                &tm->tm_year, &tm->tm_mon, &tm->tm_mday);
-
-         /* already have year? then could be month */
      }
      else if ((fmask & DTK_M(YEAR)) && (!(fmask & DTK_M(MONTH)))
               && ((val >= 1) && (val <= 12)))
      {
--- 3422,3429 ----

          tm->tm_year = val;

      }
+     /* already have year? then could be month */
      else if ((fmask & DTK_M(YEAR)) && (!(fmask & DTK_M(MONTH)))
               && ((val >= 1) && (val <= 12)))
      {
***************
*** 3460,3469 ****
  #endif
          *tmask = DTK_M(YEAR);
          tm->tm_year = val;
!         if (tm->tm_year < 70)
!             tm->tm_year += 2000;
!         else if (tm->tm_year < 100)
!             tm->tm_year += 1900;

      }
      else
--- 3473,3487 ----
  #endif
          *tmask = DTK_M(YEAR);
          tm->tm_year = val;
!
!         /* adjust ONLY if exactly two digits... */
!         if (flen == 2)
!         {
!             if (tm->tm_year < 70)
!                 tm->tm_year += 2000;
!             else if (tm->tm_year < 100)
!                 tm->tm_year += 1900;
!         }

      }
      else
***************
*** 3527,3534 ****
--- 3545,3573 ----
              tm->tm_mon = atoi(str + 2);
              *(str + 2) = '\0';
              tm->tm_year = atoi(str + 0);
+
+             if (tm->tm_year < 70)
+                 tm->tm_year += 2000;
+             else if (tm->tm_year < 100)
+                 tm->tm_year += 1900;
          }

+     }
+     else if ((len == 5) && !(fmask & DTK_DATE_M))
+     {
+ #ifdef DATEDEBUG
+         printf("DecodeNumberField- %s is 5 characters fmask=%08x tmask=%08x\n", str, fmask, *tmask);
+ #endif
+         *tmask = DTK_DATE_M;
+         tm->tm_mday = atoi(str + 2);
+         *(str + 2) = '\0';
+         tm->tm_mon = 1;
+         tm->tm_year = atoi(str + 0);
+
+         if (tm->tm_year < 70)
+             tm->tm_year += 2000;
+         else if (tm->tm_year < 100)
+             tm->tm_year += 1900;
      }
      else if (strchr(str, '.') != NULL)
      {

pgsql-hackers by date:

Previous
From: "Bryan Field-Elliot"
Date:
Subject: RE: [HACKERS] RE: [GENERAL] Benchmarking PGSQL against Microsoft SQL 7
Next
From: "Thomas G. Lockhart"
Date:
Subject: Re: [HACKERS] RE: [GENERAL] Benchmarking PGSQL against Microsoft SQL 7