Re: BUG #16419: wrong parsing BC year in to_date() function - Mailing list pgsql-hackers

From Tom Lane
Subject Re: BUG #16419: wrong parsing BC year in to_date() function
Date
Msg-id 712970.1601403538@sss.pgh.pa.us
Whole thread Raw
In response to Re: BUG #16419: wrong parsing BC year in to_date() function  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> I think this is nuts.  The current behavior is obviously broken;
> we should just treat it as a bug and fix it, including back-patching.
> I do not think there is a compatibility problem of any significance.
> Who out there is going to have an application that is relying on the
> ability to insert BC dates in this way?

Concretely, I propose the attached.  This adjusts Dar Alathar-Yemen's
patch (it didn't do the right thing IMO for the combination of bc
and year < 0) and adds test cases and docs.

Oracle would have us throw an error for year zero, but our historical
behavior has been to read it as 1 BC.  That's not so obviously wrong
that I'd want to change it in the back branches.  Maybe it could be
done as a follow-up change in HEAD.

            regards, tom lane

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 62dd738230..ec8451d1b9 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -7678,6 +7678,15 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
       </para>
      </listitem>

+     <listitem>
+      <para>
+       In <function>to_timestamp</function> and <function>to_date</function>,
+       negative years are treated as signifying BC.  If you write both a
+       negative year and an explicit <literal>BC</literal> field, you get AD
+       again.  An input of year zero is treated as 1 BC.
+      </para>
+     </listitem>
+
      <listitem>
       <para>
        In <function>to_timestamp</function> and <function>to_date</function>,
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index b91ff7bb80..3bb01cdb65 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -4569,8 +4569,11 @@ do_to_timestamp(text *date_txt, text *fmt, Oid collid, bool std,
         {
             /* If a 4-digit year is provided, we use that and ignore CC. */
             tm->tm_year = tmfc.year;
-            if (tmfc.bc && tm->tm_year > 0)
-                tm->tm_year = -(tm->tm_year - 1);
+            if (tmfc.bc)
+                tm->tm_year = -tm->tm_year;
+            /* correct for our representation of BC years */
+            if (tm->tm_year < 0)
+                tm->tm_year++;
         }
         fmask |= DTK_M(YEAR);
     }
diff --git a/src/test/regress/expected/horology.out b/src/test/regress/expected/horology.out
index c8c33a0fc0..7f82dcfbfe 100644
--- a/src/test/regress/expected/horology.out
+++ b/src/test/regress/expected/horology.out
@@ -2916,6 +2916,45 @@ SELECT to_date('2458872', 'J');
  01-23-2020
 (1 row)

+--
+-- Check handling of BC dates
+--
+SELECT to_date('44-02-01 BC','YYYY-MM-DD BC');
+    to_date
+---------------
+ 02-01-0044 BC
+(1 row)
+
+SELECT to_date('-44-02-01','YYYY-MM-DD');
+    to_date
+---------------
+ 02-01-0044 BC
+(1 row)
+
+SELECT to_date('-44-02-01 BC','YYYY-MM-DD BC');
+  to_date
+------------
+ 02-01-0044
+(1 row)
+
+SELECT to_timestamp('44-02-01 11:12:13 BC','YYYY-MM-DD HH24:MI:SS BC');
+          to_timestamp
+---------------------------------
+ Fri Feb 01 11:12:13 0044 PST BC
+(1 row)
+
+SELECT to_timestamp('-44-02-01 11:12:13','YYYY-MM-DD HH24:MI:SS');
+          to_timestamp
+---------------------------------
+ Fri Feb 01 11:12:13 0044 PST BC
+(1 row)
+
+SELECT to_timestamp('-44-02-01 11:12:13 BC','YYYY-MM-DD HH24:MI:SS BC');
+         to_timestamp
+------------------------------
+ Mon Feb 01 11:12:13 0044 PST
+(1 row)
+
 --
 -- Check handling of multiple spaces in format and/or input
 --
@@ -3183,6 +3222,12 @@ SELECT to_date('2016 366', 'YYYY DDD');  -- ok

 SELECT to_date('2016 367', 'YYYY DDD');
 ERROR:  date/time field value out of range: "2016 367"
+SELECT to_date('0000-02-01','YYYY-MM-DD');  -- allowed, though it shouldn't be
+    to_date
+---------------
+ 02-01-0001 BC
+(1 row)
+
 --
 -- Check behavior with SQL-style fixed-GMT-offset time zone (cf bug #8572)
 --
diff --git a/src/test/regress/sql/horology.sql b/src/test/regress/sql/horology.sql
index c464e6766c..fed21a53c8 100644
--- a/src/test/regress/sql/horology.sql
+++ b/src/test/regress/sql/horology.sql
@@ -426,6 +426,17 @@ SELECT to_date('1 4 1902', 'Q MM YYYY');  -- Q is ignored
 SELECT to_date('3 4 21 01', 'W MM CC YY');
 SELECT to_date('2458872', 'J');

+--
+-- Check handling of BC dates
+--
+
+SELECT to_date('44-02-01 BC','YYYY-MM-DD BC');
+SELECT to_date('-44-02-01','YYYY-MM-DD');
+SELECT to_date('-44-02-01 BC','YYYY-MM-DD BC');
+SELECT to_timestamp('44-02-01 11:12:13 BC','YYYY-MM-DD HH24:MI:SS BC');
+SELECT to_timestamp('-44-02-01 11:12:13','YYYY-MM-DD HH24:MI:SS');
+SELECT to_timestamp('-44-02-01 11:12:13 BC','YYYY-MM-DD HH24:MI:SS BC');
+
 --
 -- Check handling of multiple spaces in format and/or input
 --
@@ -511,6 +522,7 @@ SELECT to_date('2015 366', 'YYYY DDD');
 SELECT to_date('2016 365', 'YYYY DDD');  -- ok
 SELECT to_date('2016 366', 'YYYY DDD');  -- ok
 SELECT to_date('2016 367', 'YYYY DDD');
+SELECT to_date('0000-02-01','YYYY-MM-DD');  -- allowed, though it shouldn't be

 --
 -- Check behavior with SQL-style fixed-GMT-offset time zone (cf bug #8572)

pgsql-hackers by date:

Previous
From: "Andrey M. Borodin"
Date:
Subject: Re: Yet another fast GiST build
Next
From: Andrew Dunstan
Date:
Subject: Re: BLOB / CLOB support in PostgreSQL