Thread: Re: BUG #16419: wrong parsing BC year in to_date() function

Re: BUG #16419: wrong parsing BC year in to_date() function

From
"David G. Johnston"
Date:
Redirecting to -hackers for visibility.  I feel there needs to be something done here, even if just documentation (a bullet in the usage notes section - and a code comment update for the macro) pointing this out and not changing any behavior.

David J.

On Wed, May 6, 2020 at 8:12 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
‪On Wed, May 6, 2020 at 6:31 PM ‫دار الآثار للنشر والتوزيع-صنعاء Dar Alathar-Yemen‬‎ <dar_alathar@hotmail.com> wrote:‬

Any one suppose that these functions return the same:
make_date(-1,1,1)
to_date('-1-01-01','yyyy-mm-dd')

But make_date will give 0001-01-01 BC

And to_date will give 0002-01-01 BC



Interesting...and a fair point.

What seems to be happening here is that to_date is trying to be helpful by doing:

select to_date('0000','YYYY'); // 0001-01-01 BC

It does this seemingly by subtracting one from the year, making it positive, then (I infer) appending "BC" to the result.  Thus for the year "-1" it yields "0002-01-01 BC"

make_date just chooses to reject the year 0 and treat the negative as an alternative to specifying BC

There seems to be zero tests for to_date involving negative years, and the documentation doesn't talk of them.

I'll let the -hackers speak up as to how they want to go about handling to_date (research how it behaves in the other database it tries to emulate and either document or possibly change the behavior in v14) but do suggest that a simple explicit description of how to_date works in the presence of negative years be back-patched.  A bullet in the usage notes section probably suffices:

"If a YYYY format string captures a negative year, or 0000, it will treat it as a BC year after decreasing the value by one.  So 0000 maps to 1 BC and -1 maps to 2 BC and so on."

So, no, make_date and to_date do not agree on this point; and they do not have to.  There is no way to specify "BC" in make_date function so using negative there makes sense.  You can specify BC in the input string for to_date and indeed that is the only supported (documented) way to do so.

 
[and the next email]
 
Specifically:


/*
 * There is no 0 AD.  Years go from 1 BC to 1 AD, so we make it
 * positive and map year == -1 to year zero, and shift all negative
 * years up one.  For interval years, we just return the year.
 */
#define ADJUST_YEAR(year, is_interval) ((is_interval) ? (year) : ((year) <= 0 ? -((year) - 1) : (year)))

The code comment took me a bit to process - seems like the following would be better (if its right - I don't know why interval is a pure no-op while non-interval normalizes to a positive integer).

Years go from 1 BC to 1 AD, so we adjust the year zero, and all negative years, by shifting them away one year,  We then return the positive value of the result because the caller tracks the BC/AD aspect of the year separately and only deals with positive year values coming out of this macro.  Intervals denote the distance away from 0 a year is so we can simply take the supplied value and return it.  Interval processing code expects a negative result for intervals going into BC.

David J.


 

Re: BUG #16419: wrong parsing BC year in to_date() function

From
Laurenz Albe
Date:
On Tue, 2020-05-12 at 18:09 -0700, David G. Johnston wrote:
> Redirecting to -hackers for visibility.  I feel there needs to be something done here, even if just documentation (a
bulletin the usage notes section - and a code comment update for the macro)
 
> pointing this out and not changing any behavior.
> 
> David J.
> 
> On Wed, May 6, 2020 at 8:12 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
> > ‪On Wed, May 6, 2020 at 6:31 PM ‫دار الآثار للنشر والتوزيع-صنعاء Dar Alathar-Yemen‬‎ <dar_alathar@hotmail.com>
wrote:‬
> > > Any one suppose that these functions return the same:
> > > make_date(-1,1,1)
> > > to_date('-1-01-01','yyyy-mm-dd')
> > > 
> > > But make_date will give 0001-01-01 BC
> > > 
> > > And to_date will give 0002-01-01 BC
> > > 
> > > 
> > > 
> > > 
> > 
> > Interesting...and a fair point.
> > 
> > What seems to be happening here is that to_date is trying to be helpful by doing:
> > 
> > select to_date('0000','YYYY'); // 0001-01-01 BC
> > 
> > It does this seemingly by subtracting one from the year, making it positive, then (I infer) appending "BC" to the
result. Thus for the year "-1" it yields "0002-01-01 BC"
 
> > 
> > make_date just chooses to reject the year 0 and treat the negative as an alternative to specifying BC
> > 
> > There seems to be zero tests for to_date involving negative years, and the documentation doesn't talk of them.
> > 
> > I'll let the -hackers speak up as to how they want to go about handling to_date (research how it behaves in the
otherdatabase it tries to emulate and either document or possibly change the
 
> > behavior in v14) but do suggest that a simple explicit description of how to_date works in the presence of negative
yearsbe back-patched.  A bullet in the usage notes section probably suffices:
 
> > 
> > "If a YYYY format string captures a negative year, or 0000, it will treat it as a BC year after decreasing the
valueby one.  So 0000 maps to 1 BC and -1 maps to 2 BC and so on."
 
> > 
> > So, no, make_date and to_date do not agree on this point; and they do not have to.  There is no way to specify "BC"
inmake_date function so using negative there makes sense.  You can specify BC
 
> > in the input string for to_date and indeed that is the only supported (documented) way to do so.
> > 
> > 
> 
>  
> [and the next email]
>  
> > Specifically:
> > 
> >
https://github.com/postgres/postgres/blob/fb544735f11480a697fcab791c058adc166be1fa/src/backend/utils/adt/formatting.c#L236
> > 
> > /*
> >  * There is no 0 AD.  Years go from 1 BC to 1 AD, so we make it
> >  * positive and map year == -1 to year zero, and shift all negative
> >  * years up one.  For interval years, we just return the year.
> >  */
> > #define ADJUST_YEAR(year, is_interval) ((is_interval) ? (year) : ((year) <= 0 ? -((year) - 1) : (year)))
> > 
> > The code comment took me a bit to process - seems like the following would be better (if its right - I don't know
whyinterval is a pure no-op while non-interval normalizes to a positive integer).
 
> > 
> > Years go from 1 BC to 1 AD, so we adjust the year zero, and all negative years, by shifting them away one year,  We
thenreturn the positive value of the result because the caller tracks the BC/AD
 
> > aspect of the year separately and only deals with positive year values coming out of this macro.  Intervals denote
thedistance away from 0 a year is so we can simply take the supplied value and
 
> > return it.  Interval processing code expects a negative result for intervals going into BC.
> > 
> > David J.

Since "to_date" is an Oracle compatibility function, here is what Oracle 18.4 has to say to that:

SQL> SELECT to_date('0000', 'YYYY') FROM dual;
SELECT to_date('0000', 'YYYY') FROM dual
               *
ERROR at line 1:
ORA-01841: (full) year must be between -4713 and +9999, and not be 0


SQL> SELECT to_date('-0001', 'YYYY') FROM dual;
SELECT to_date('-0001', 'YYYY') FROM dual
               *
ERROR at line 1:
ORA-01841: (full) year must be between -4713 and +9999, and not be 0


SQL> SELECT to_date('-0001', 'SYYYY') FROM dual;

TO_DATE('-0001','SYYYY
----------------------
0001-05-01 00:00:00 BC

Yours,
Laurenz Albe




Re: BUG #16419: wrong parsing BC year in to_date() function

From
"David G. Johnston"
Date:
On Tue, May 12, 2020 at 8:56 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Tue, 2020-05-12 at 18:09 -0700, David G. Johnston wrote:
> Redirecting to -hackers for visibility.  I feel there needs to be something done here, even if just documentation (a bullet in the usage notes section - and a code comment update for the macro)
> pointing this out and not changing any behavior.

Since "to_date" is an Oracle compatibility function, here is what Oracle 18.4 has to say to that:

SQL> SELECT to_date('0000', 'YYYY') FROM dual;
SELECT to_date('0000', 'YYYY') FROM dual
               *
ERROR at line 1:
ORA-01841: (full) year must be between -4713 and +9999, and not be 0


Attached is a concrete patch (back-patchable hopefully) documenting the current reality.

As noted in the patch commit message (commentary really):

make_timestamp not agreeing with make_date on how to handle negative years should probably just be fixed - but that is for someone else to handle.

Whether to actually change the behavior of to_date is up for debate though I would presume it would not be back-patched.

David J.
Attachment

Re: BUG #16419: wrong parsing BC year in to_date() function

From
Bruce Momjian
Date:
On Wed, Jul 15, 2020 at 09:26:53AM -0700, David G. Johnston wrote:
> On Tue, May 12, 2020 at 8:56 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> 
>     On Tue, 2020-05-12 at 18:09 -0700, David G. Johnston wrote:
>     > Redirecting to -hackers for visibility.  I feel there needs to be
>     something done here, even if just documentation (a bullet in the usage
>     notes section - and a code comment update for the macro)
>     > pointing this out and not changing any behavior.
> 
>     Since "to_date" is an Oracle compatibility function, here is what Oracle
>     18.4 has to say to that:
> 
>     SQL> SELECT to_date('0000', 'YYYY') FROM dual;
>     SELECT to_date('0000', 'YYYY') FROM dual
>                    *
>     ERROR at line 1:
>     ORA-01841: (full) year must be between -4713 and +9999, and not be 0
> 
> 
> 
> Attached is a concrete patch (back-patchable hopefully) documenting the current
> reality.
> 
> As noted in the patch commit message (commentary really):
> 
> make_timestamp not agreeing with make_date on how to handle negative years
> should probably just be fixed - but that is for someone else to handle.
> 
> Whether to actually change the behavior of to_date is up for debate though I
> would presume it would not be back-patched.

OK, so, looking at this thread, we have to_date() treating -1 as -2 BC,
make_date() treating -1 as 1 BC, and we have Oracle, which to_date() is
supposed to match, making -1 as 1 BC.

Because we already have the to_date/make_date inconsistency, and the -1
to -2 BC mapping is confusing, and doesn't match Oracle, I think the
clean solution is to change PG 14 to treat -1 as 1 BC, and document the
incompatibility in the release notes.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: BUG #16419: wrong parsing BC year in to_date() function

From
"David G. Johnston"
Date:
On Thu, Sep 3, 2020 at 6:21 PM Bruce Momjian <bruce@momjian.us> wrote:
On Wed, Jul 15, 2020 at 09:26:53AM -0700, David G. Johnston wrote:

> Whether to actually change the behavior of to_date is up for debate though I
> would presume it would not be back-patched.

OK, so, looking at this thread, we have to_date() treating -1 as -2 BC,
make_date() treating -1 as 1 BC, and we have Oracle, which to_date() is
supposed to match, making -1 as 1 BC.

Because we already have the to_date/make_date inconsistency, and the -1
to -2 BC mapping is confusing, and doesn't match Oracle, I think the
clean solution is to change PG 14 to treat -1 as 1 BC, and document the
incompatibility in the release notes.

I agree that someone else should write another patch to fix the behavior for v14.  Still suggest committing the proposed patch to master and all supported versions to document the existing behavior correctly.  The fix patch can work from that.

David J.

Re: BUG #16419: wrong parsing BC year in to_date() function

From
Bruce Momjian
Date:
On Fri, Sep  4, 2020 at 12:45:36PM -0700, David G. Johnston wrote:
> On Thu, Sep 3, 2020 at 6:21 PM Bruce Momjian <bruce@momjian.us> wrote:
> 
>     On Wed, Jul 15, 2020 at 09:26:53AM -0700, David G. Johnston wrote:
> 
>     > Whether to actually change the behavior of to_date is up for debate
>     though I
>     > would presume it would not be back-patched.
> 
>     OK, so, looking at this thread, we have to_date() treating -1 as -2 BC,
>     make_date() treating -1 as 1 BC, and we have Oracle, which to_date() is
>     supposed to match, making -1 as 1 BC.
> 
>     Because we already have the to_date/make_date inconsistency, and the -1
>     to -2 BC mapping is confusing, and doesn't match Oracle, I think the
>     clean solution is to change PG 14 to treat -1 as 1 BC, and document the
>     incompatibility in the release notes.
> 
> 
> I agree that someone else should write another patch to fix the behavior for
> v14.  Still suggest committing the proposed patch to master and all supported
> versions to document the existing behavior correctly.  The fix patch can work
> from that.

I think we need to apply the patches to all branches at the same time. 
I am not sure we want to document a behavior we know will change in PG
14.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: BUG #16419: wrong parsing BC year in to_date() function

From
Peter Eisentraut
Date:
On 2020-09-04 21:45, David G. Johnston wrote:
> On Thu, Sep 3, 2020 at 6:21 PM Bruce Momjian <bruce@momjian.us 
> <mailto:bruce@momjian.us>> wrote:
> 
>     On Wed, Jul 15, 2020 at 09:26:53AM -0700, David G. Johnston wrote:
> 
>      > Whether to actually change the behavior of to_date is up for
>     debate though I
>      > would presume it would not be back-patched.
> 
>     OK, so, looking at this thread, we have to_date() treating -1 as -2 BC,
>     make_date() treating -1 as 1 BC, and we have Oracle, which to_date() is
>     supposed to match, making -1 as 1 BC.
> 
>     Because we already have the to_date/make_date inconsistency, and the -1
>     to -2 BC mapping is confusing, and doesn't match Oracle, I think the
>     clean solution is to change PG 14 to treat -1 as 1 BC, and document the
>     incompatibility in the release notes.
> 
> 
> I agree that someone else should write another patch to fix the behavior 
> for v14.  Still suggest committing the proposed patch to master and all 
> supported versions to document the existing behavior correctly.  The fix 
> patch can work from that.

Adding support for negative years in make_timestamp seems pretty 
straightforward; see attached patch.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: BUG #16419: wrong parsing BC year in to_date() function

From
Asif Rehman
Date:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            not tested

Patch looks good to me.

The new status of this patch is: Ready for Committer

Re: BUG #16419: wrong parsing BC year in to_date() function

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Fri, Sep  4, 2020 at 12:45:36PM -0700, David G. Johnston wrote:
>>> Because we already have the to_date/make_date inconsistency, and the -1
>>> to -2 BC mapping is confusing, and doesn't match Oracle, I think the
>>> clean solution is to change PG 14 to treat -1 as 1 BC, and document the
>>> incompatibility in the release notes.

>> I agree that someone else should write another patch to fix the behavior for
>> v14.  Still suggest committing the proposed patch to master and all supported
>> versions to document the existing behavior correctly.  The fix patch can work
>> from that.

> I think we need to apply the patches to all branches at the same time.
> I am not sure we want to document a behavior we know will change in PG
> 14.

I think this is nuts.  The current behavior is obviously broken;
we should just treat it as a bug and fix it, including back-patching.
I do not think there is a compatibility problem of any significance.
Who out there is going to have an application that is relying on the
ability to insert BC dates in this way?

            regards, tom lane



Re: BUG #16419: wrong parsing BC year in to_date() function

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> Adding support for negative years in make_timestamp seems pretty 
> straightforward; see attached patch.

In hopes of moving things along, I pushed that, along with documentation
additions.  I couldn't quite convince myself that it was a bug fix
though, so no back-patch.  (I don't think we really need any doc
changes about it in the back branches, either.)

            regards, tom lane



Re: BUG #16419: wrong parsing BC year in to_date() function

From
Tom Lane
Date:
I wrote:
> I think this is nuts.  The current behavior is obviously broken;
> we should just treat it as a bug and fix it, including back-patching.
> I do not think there is a compatibility problem of any significance.
> Who out there is going to have an application that is relying on the
> ability to insert BC dates in this way?

Concretely, I propose the attached.  This adjusts Dar Alathar-Yemen's
patch (it didn't do the right thing IMO for the combination of bc
and year < 0) and adds test cases and docs.

Oracle would have us throw an error for year zero, but our historical
behavior has been to read it as 1 BC.  That's not so obviously wrong
that I'd want to change it in the back branches.  Maybe it could be
done as a follow-up change in HEAD.

            regards, tom lane

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 62dd738230..ec8451d1b9 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -7678,6 +7678,15 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
       </para>
      </listitem>

+     <listitem>
+      <para>
+       In <function>to_timestamp</function> and <function>to_date</function>,
+       negative years are treated as signifying BC.  If you write both a
+       negative year and an explicit <literal>BC</literal> field, you get AD
+       again.  An input of year zero is treated as 1 BC.
+      </para>
+     </listitem>
+
      <listitem>
       <para>
        In <function>to_timestamp</function> and <function>to_date</function>,
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index b91ff7bb80..3bb01cdb65 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -4569,8 +4569,11 @@ do_to_timestamp(text *date_txt, text *fmt, Oid collid, bool std,
         {
             /* If a 4-digit year is provided, we use that and ignore CC. */
             tm->tm_year = tmfc.year;
-            if (tmfc.bc && tm->tm_year > 0)
-                tm->tm_year = -(tm->tm_year - 1);
+            if (tmfc.bc)
+                tm->tm_year = -tm->tm_year;
+            /* correct for our representation of BC years */
+            if (tm->tm_year < 0)
+                tm->tm_year++;
         }
         fmask |= DTK_M(YEAR);
     }
diff --git a/src/test/regress/expected/horology.out b/src/test/regress/expected/horology.out
index c8c33a0fc0..7f82dcfbfe 100644
--- a/src/test/regress/expected/horology.out
+++ b/src/test/regress/expected/horology.out
@@ -2916,6 +2916,45 @@ SELECT to_date('2458872', 'J');
  01-23-2020
 (1 row)

+--
+-- Check handling of BC dates
+--
+SELECT to_date('44-02-01 BC','YYYY-MM-DD BC');
+    to_date
+---------------
+ 02-01-0044 BC
+(1 row)
+
+SELECT to_date('-44-02-01','YYYY-MM-DD');
+    to_date
+---------------
+ 02-01-0044 BC
+(1 row)
+
+SELECT to_date('-44-02-01 BC','YYYY-MM-DD BC');
+  to_date
+------------
+ 02-01-0044
+(1 row)
+
+SELECT to_timestamp('44-02-01 11:12:13 BC','YYYY-MM-DD HH24:MI:SS BC');
+          to_timestamp
+---------------------------------
+ Fri Feb 01 11:12:13 0044 PST BC
+(1 row)
+
+SELECT to_timestamp('-44-02-01 11:12:13','YYYY-MM-DD HH24:MI:SS');
+          to_timestamp
+---------------------------------
+ Fri Feb 01 11:12:13 0044 PST BC
+(1 row)
+
+SELECT to_timestamp('-44-02-01 11:12:13 BC','YYYY-MM-DD HH24:MI:SS BC');
+         to_timestamp
+------------------------------
+ Mon Feb 01 11:12:13 0044 PST
+(1 row)
+
 --
 -- Check handling of multiple spaces in format and/or input
 --
@@ -3183,6 +3222,12 @@ SELECT to_date('2016 366', 'YYYY DDD');  -- ok

 SELECT to_date('2016 367', 'YYYY DDD');
 ERROR:  date/time field value out of range: "2016 367"
+SELECT to_date('0000-02-01','YYYY-MM-DD');  -- allowed, though it shouldn't be
+    to_date
+---------------
+ 02-01-0001 BC
+(1 row)
+
 --
 -- Check behavior with SQL-style fixed-GMT-offset time zone (cf bug #8572)
 --
diff --git a/src/test/regress/sql/horology.sql b/src/test/regress/sql/horology.sql
index c464e6766c..fed21a53c8 100644
--- a/src/test/regress/sql/horology.sql
+++ b/src/test/regress/sql/horology.sql
@@ -426,6 +426,17 @@ SELECT to_date('1 4 1902', 'Q MM YYYY');  -- Q is ignored
 SELECT to_date('3 4 21 01', 'W MM CC YY');
 SELECT to_date('2458872', 'J');

+--
+-- Check handling of BC dates
+--
+
+SELECT to_date('44-02-01 BC','YYYY-MM-DD BC');
+SELECT to_date('-44-02-01','YYYY-MM-DD');
+SELECT to_date('-44-02-01 BC','YYYY-MM-DD BC');
+SELECT to_timestamp('44-02-01 11:12:13 BC','YYYY-MM-DD HH24:MI:SS BC');
+SELECT to_timestamp('-44-02-01 11:12:13','YYYY-MM-DD HH24:MI:SS');
+SELECT to_timestamp('-44-02-01 11:12:13 BC','YYYY-MM-DD HH24:MI:SS BC');
+
 --
 -- Check handling of multiple spaces in format and/or input
 --
@@ -511,6 +522,7 @@ SELECT to_date('2015 366', 'YYYY DDD');
 SELECT to_date('2016 365', 'YYYY DDD');  -- ok
 SELECT to_date('2016 366', 'YYYY DDD');  -- ok
 SELECT to_date('2016 367', 'YYYY DDD');
+SELECT to_date('0000-02-01','YYYY-MM-DD');  -- allowed, though it shouldn't be

 --
 -- Check behavior with SQL-style fixed-GMT-offset time zone (cf bug #8572)

Re: BUG #16419: wrong parsing BC year in to_date() function

From
Bruce Momjian
Date:
On Tue, Sep 29, 2020 at 01:26:29PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Fri, Sep  4, 2020 at 12:45:36PM -0700, David G. Johnston wrote:
> >>> Because we already have the to_date/make_date inconsistency, and the -1
> >>> to -2 BC mapping is confusing, and doesn't match Oracle, I think the
> >>> clean solution is to change PG 14 to treat -1 as 1 BC, and document the
> >>> incompatibility in the release notes.
> 
> >> I agree that someone else should write another patch to fix the behavior for
> >> v14.  Still suggest committing the proposed patch to master and all supported
> >> versions to document the existing behavior correctly.  The fix patch can work
> >> from that.
> 
> > I think we need to apply the patches to all branches at the same time. 
> > I am not sure we want to document a behavior we know will change in PG
> > 14.
> 
> I think this is nuts.  The current behavior is obviously broken;
> we should just treat it as a bug and fix it, including back-patching.
> I do not think there is a compatibility problem of any significance.
> Who out there is going to have an application that is relying on the
> ability to insert BC dates in this way?

You are agreeing with what I am suggesting then?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: BUG #16419: wrong parsing BC year in to_date() function

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Tue, Sep 29, 2020 at 01:26:29PM -0400, Tom Lane wrote:
>> Bruce Momjian <bruce@momjian.us> writes:
>>> On Fri, Sep  4, 2020 at 12:45:36PM -0700, David G. Johnston wrote:
>>>> Because we already have the to_date/make_date inconsistency, and the -1
>>>> to -2 BC mapping is confusing, and doesn't match Oracle, I think the
>>>> clean solution is to change PG 14 to treat -1 as 1 BC, and document the
>>>> incompatibility in the release notes.

>>> I agree that someone else should write another patch to fix the behavior for
>>> v14.  Still suggest committing the proposed patch to master and all supported
>>> versions to document the existing behavior correctly.  The fix patch can work
>>> from that.

>> I think this is nuts.  The current behavior is obviously broken;
>> we should just treat it as a bug and fix it, including back-patching.
>> I do not think there is a compatibility problem of any significance.
>> Who out there is going to have an application that is relying on the
>> ability to insert BC dates in this way?

> You are agreeing with what I am suggesting then?

Hm, I read your reference to "the release notes" as suggesting that
we should change it only in a major release, ie HEAD only (and it
looks like David read it the same).  If you meant minor release notes,
then we're on the same page.

            regards, tom lane



Re: BUG #16419: wrong parsing BC year in to_date() function

From
Bruce Momjian
Date:
On Wed, Sep 30, 2020 at 02:11:54PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Tue, Sep 29, 2020 at 01:26:29PM -0400, Tom Lane wrote:
> >> Bruce Momjian <bruce@momjian.us> writes:
> >>> On Fri, Sep  4, 2020 at 12:45:36PM -0700, David G. Johnston wrote:
> >>>> Because we already have the to_date/make_date inconsistency, and the -1
> >>>> to -2 BC mapping is confusing, and doesn't match Oracle, I think the
> >>>> clean solution is to change PG 14 to treat -1 as 1 BC, and document the
> >>>> incompatibility in the release notes.
> 
> >>> I agree that someone else should write another patch to fix the behavior for
> >>> v14.  Still suggest committing the proposed patch to master and all supported
> >>> versions to document the existing behavior correctly.  The fix patch can work
> >>> from that.
> 
> >> I think this is nuts.  The current behavior is obviously broken;
> >> we should just treat it as a bug and fix it, including back-patching.
> >> I do not think there is a compatibility problem of any significance.
> >> Who out there is going to have an application that is relying on the
> >> ability to insert BC dates in this way?
> 
> > You are agreeing with what I am suggesting then?
> 
> Hm, I read your reference to "the release notes" as suggesting that
> we should change it only in a major release, ie HEAD only (and it
> looks like David read it the same).  If you meant minor release notes,
> then we're on the same page.

Yes, I was thinking just the major release notes.  What are you
suggesting, and what did you ultimately decide to do?  What I didn't
want to do was to document the old behavior in the old docs and change
it in PG 14.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: BUG #16419: wrong parsing BC year in to_date() function

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Wed, Sep 30, 2020 at 02:11:54PM -0400, Tom Lane wrote:
>> Hm, I read your reference to "the release notes" as suggesting that
>> we should change it only in a major release, ie HEAD only (and it
>> looks like David read it the same).  If you meant minor release notes,
>> then we're on the same page.

> Yes, I was thinking just the major release notes.  What are you
> suggesting, and what did you ultimately decide to do?  What I didn't
> want to do was to document the old behavior in the old docs and change
> it in PG 14.

Actually, I was just finishing up back-patching the patch I posted
yesterday.  I think we should just fix it, not document that it's
broken.

            regards, tom lane



Re: BUG #16419: wrong parsing BC year in to_date() function

From
Bruce Momjian
Date:
On Wed, Sep 30, 2020 at 02:50:31PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Wed, Sep 30, 2020 at 02:11:54PM -0400, Tom Lane wrote:
> >> Hm, I read your reference to "the release notes" as suggesting that
> >> we should change it only in a major release, ie HEAD only (and it
> >> looks like David read it the same).  If you meant minor release notes,
> >> then we're on the same page.
> 
> > Yes, I was thinking just the major release notes.  What are you
> > suggesting, and what did you ultimately decide to do?  What I didn't
> > want to do was to document the old behavior in the old docs and change
> > it in PG 14.
> 
> Actually, I was just finishing up back-patching the patch I posted
> yesterday.  I think we should just fix it, not document that it's
> broken.

Agreed, that's what I wanted.  You stated in a later email you couldn't
convince yourself of the backpatch, which is why I asked.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: BUG #16419: wrong parsing BC year in to_date() function

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Wed, Sep 30, 2020 at 02:50:31PM -0400, Tom Lane wrote:
>> Actually, I was just finishing up back-patching the patch I posted
>> yesterday.  I think we should just fix it, not document that it's
>> broken.

> Agreed, that's what I wanted.  You stated in a later email you couldn't
> convince yourself of the backpatch, which is why I asked.

Oh, I see where our wires are crossed.  I meant that I couldn't
convince myself to back-patch the make_timestamp() change.
(I'm still willing to listen to an argument to do so, if anyone
wants to make one --- but that part feels more like a feature
addition than a bug fix.)

            regards, tom lane



Re: BUG #16419: wrong parsing BC year in to_date() function

From
Bruce Momjian
Date:
On Wed, Sep 30, 2020 at 03:11:06PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Wed, Sep 30, 2020 at 02:50:31PM -0400, Tom Lane wrote:
> >> Actually, I was just finishing up back-patching the patch I posted
> >> yesterday.  I think we should just fix it, not document that it's
> >> broken.
> 
> > Agreed, that's what I wanted.  You stated in a later email you couldn't
> > convince yourself of the backpatch, which is why I asked.
> 
> Oh, I see where our wires are crossed.  I meant that I couldn't
> convince myself to back-patch the make_timestamp() change.
> (I'm still willing to listen to an argument to do so, if anyone
> wants to make one --- but that part feels more like a feature
> addition than a bug fix.)

OK, at least this is addressed fully in PG 14 and beyond.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: BUG #16419: wrong parsing BC year in to_date() function

From
Robert Haas
Date:
On Tue, Sep 29, 2020 at 1:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think this is nuts.  The current behavior is obviously broken;
> we should just treat it as a bug and fix it, including back-patching.
> I do not think there is a compatibility problem of any significance.
> Who out there is going to have an application that is relying on the
> ability to insert BC dates in this way?

I think that's entirely the wrong way to look at it. If nobody is
using the feature, then it will not break anything to change the
behavior, but on the other hand there is no reason to fix the bug
either. But if people are using the feature, making it behave
differently in the next minor release is going to break their
applications. I disagree *strongly* with making such changes in stable
branches and feel that the change to those branches should be
reverted.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: BUG #16419: wrong parsing BC year in to_date() function

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Sep 29, 2020 at 1:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think this is nuts.  The current behavior is obviously broken;
>> we should just treat it as a bug and fix it, including back-patching.
>> I do not think there is a compatibility problem of any significance.
>> Who out there is going to have an application that is relying on the
>> ability to insert BC dates in this way?

> I think that's entirely the wrong way to look at it. If nobody is
> using the feature, then it will not break anything to change the
> behavior, but on the other hand there is no reason to fix the bug
> either. But if people are using the feature, making it behave
> differently in the next minor release is going to break their
> applications. I disagree *strongly* with making such changes in stable
> branches and feel that the change to those branches should be
> reverted.

By that logic, we should never fix any bug in a back branch.

            regards, tom lane



Re: BUG #16419: wrong parsing BC year in to_date() function

From
Robert Haas
Date:
On Wed, Sep 30, 2020 at 5:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> By that logic, we should never fix any bug in a back branch.

No, by that logic, we should not change any behavior in a back-branch
upon which a customer is plausibly relying. No one relies on a certain
query causing a server crash, for example, or a cache lookup failure,
so fixing those things can only help people. But there is no reason at
all why someone shouldn't be relying on this very old and
long-established behavior not to change in a minor release.

One reason they might do that is because there was a discussion about
what I believe to this exact same case 4 years ago in which you and I
both endorsed the position you are now claiming is so unreasonable
that nobody will mind if we change it in a minor release.

https://www.postgresql.org/message-id/flat/CAKOSWNmwCH0wx6MApc1A8ww%2B%2BEQmG07AZ3t6w_XjRrV1xeZpTA%40mail.gmail.com

So you now think this should be back-patched when previously you
didn't even think it was be good enough for master.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: BUG #16419: wrong parsing BC year in to_date() function

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Sep 30, 2020 at 5:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> By that logic, we should never fix any bug in a back branch.

> No, by that logic, we should not change any behavior in a back-branch
> upon which a customer is plausibly relying.

I guess where we differ here is on the idea that somebody is plausibly
relying on to_date() to parse a BC date inaccurately.

> One reason they might do that is because there was a discussion about
> what I believe to this exact same case 4 years ago in which you and I
> both endorsed the position you are now claiming is so unreasonable
> that nobody will mind if we change it in a minor release.
> https://www.postgresql.org/message-id/flat/CAKOSWNmwCH0wx6MApc1A8ww%2B%2BEQmG07AZ3t6w_XjRrV1xeZpTA%40mail.gmail.com

What I complained about in that thread was mainly that that
patch was simultaneously trying to get stricter (throw error for
year zero) and laxer (parse negative years as BC).

Also, we did not in that thread have the information that Oracle
treats negative years as BC.  Now that we do, the situation is
different, and I'm willing to change my mind about it.  Admittedly,
Oracle seems to require an "S" in the format to parse a leading
dash as meaning a negative year.  But given that our code is willing
to read the case as a negative year without that, it seems pretty
silly to decide that it should read it as an off-by-one negative
year.

            regards, tom lane



Re: BUG #16419: wrong parsing BC year in to_date() function

From
Bruce Momjian
Date:
On Wed, Sep 30, 2020 at 06:10:38PM -0400, Robert Haas wrote:
> On Wed, Sep 30, 2020 at 5:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > By that logic, we should never fix any bug in a back branch.
> 
> No, by that logic, we should not change any behavior in a back-branch
> upon which a customer is plausibly relying. No one relies on a certain
> query causing a server crash, for example, or a cache lookup failure,
> so fixing those things can only help people. But there is no reason at
> all why someone shouldn't be relying on this very old and
> long-established behavior not to change in a minor release.

That is an interesting distinction.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: BUG #16419: wrong parsing BC year in to_date() function

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Wed, Sep 30, 2020 at 06:10:38PM -0400, Robert Haas wrote:
>> On Wed, Sep 30, 2020 at 5:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> By that logic, we should never fix any bug in a back branch.

>> No, by that logic, we should not change any behavior in a back-branch
>> upon which a customer is plausibly relying. No one relies on a certain
>> query causing a server crash, for example, or a cache lookup failure,
>> so fixing those things can only help people. But there is no reason at
>> all why someone shouldn't be relying on this very old and
>> long-established behavior not to change in a minor release.

> That is an interesting distinction.

I don't want to sound like I'm totally without sympathy for Robert's
argument.  But I do say it's a judgment call, and my judgment remains
that this patch is appropriate to back-patch.

We do not have, and never have had, a project policy against
back-patching non-crash-related behavioral changes.  If we did,
we would not for example put timezone database updates into the
back branches.  It's not terribly hard to imagine such updates
breaking applications that expected the meaning of, say,
'2022-04-01 12:34 Europe/Paris' to hold still.  But we do it
anyway.

As another not-too-old example, I'll cite Robert's own commits
0278d3f79/a08bfe742.  The argument for a back-patch there was
pretty much only that we were writing an alleged tar file that
didn't conform to the letter of the POSIX spec.  It's possible
to imagine that somebody had written bespoke archive-reading
code that failed after we changed the output; but that didn't
seem probable enough to justify continuing to violate the standard.

In this case the "standard" in question is the Oracle-derived
expectation that to_date will read negative years as BC, and
I'd argue that the possibility that someone already has code
that relies on getting an off-by-one result is outweighed by the
likelihood that the misbehavior will hurt somebody in the future.

This calculus would obviously change if we knew of such code
or thought it was really probable for it to exist.  That's
what makes it a judgment call.

            regards, tom lane



Re: BUG #16419: wrong parsing BC year in to_date() function

From
Bruce Momjian
Date:
On Wed, Sep 30, 2020 at 07:26:55PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Wed, Sep 30, 2020 at 06:10:38PM -0400, Robert Haas wrote:
> >> On Wed, Sep 30, 2020 at 5:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> By that logic, we should never fix any bug in a back branch.
> 
> >> No, by that logic, we should not change any behavior in a back-branch
> >> upon which a customer is plausibly relying. No one relies on a certain
> >> query causing a server crash, for example, or a cache lookup failure,
> >> so fixing those things can only help people. But there is no reason at
> >> all why someone shouldn't be relying on this very old and
> >> long-established behavior not to change in a minor release.
> 
> > That is an interesting distinction.
> 
> I don't want to sound like I'm totally without sympathy for Robert's
> argument.  But I do say it's a judgment call, and my judgment remains
> that this patch is appropriate to back-patch.

Agreed.  I was just thinking it was an interesting classification that
no one relies on crashes, or query failures.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: BUG #16419: wrong parsing BC year in to_date() function

From
Robert Haas
Date:
On Wed, Sep 30, 2020 at 7:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> We do not have, and never have had, a project policy against
> back-patching non-crash-related behavioral changes.  If we did,
> we would not for example put timezone database updates into the
> back branches.  It's not terribly hard to imagine such updates
> breaking applications that expected the meaning of, say,
> '2022-04-01 12:34 Europe/Paris' to hold still.  But we do it
> anyway.
>
> As another not-too-old example, I'll cite Robert's own commits
> 0278d3f79/a08bfe742.  The argument for a back-patch there was
> pretty much only that we were writing an alleged tar file that
> didn't conform to the letter of the POSIX spec.  It's possible
> to imagine that somebody had written bespoke archive-reading
> code that failed after we changed the output; but that didn't
> seem probable enough to justify continuing to violate the standard.

Right. Ultimately, this comes down to a judgement call about what you
think people are likely to rely on, and what you think they are
unlikely to rely on. If I recall correctly, I thought that case was a
close call, and back-patched because you argued for it. Either way, it
does seem very unlikely that someone would write archive-reading code
that relies on the presence of an extra 511 zero bytes, because (1) it
would be a lot easier to just use 'tar', (2) such code would fail if
used with a tar archive generated by anything other than PostgreSQL,
and (3) such code would fail on a tar archive generated by PostgreSQL
but without using -R. It is just barely plausible that someone has a
version of 'tar' that fails on the bogus archive and will work with
that fix, though I would guess that's also pretty unlikely.

But the present case does not seem to me to be comparable. If someone
is using to_date() to construct date values, I can't see why they
wouldn't test it, find out how it works with BC values, and then make
the application that generates the input to that function do the right
thing for the actual behavior of the function. There are discussions
of the behavior of to_date with YYYY = 0 going back at least 10 years
on this mailing list, and more recent discussions of the behavior of
negative numbers. Point being: I knew about the behavior that was here
reported as a bug and have known about it for years, and if I were
still an application developer I can easily imagine having coded to
it. I don't know why someone else should not have done the same. The
fact that we've suddenly discovered that this is not what Oracle does
doesn't mean that no users have discovered that it is what PostgreSQL
does.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: BUG #16419: wrong parsing BC year in to_date() function

From
"David G. Johnston"
Date:
On Wed, Sep 30, 2020 at 5:24 PM Robert Haas <robertmhaas@gmail.com> wrote:
The
fact that we've suddenly discovered that this is not what Oracle does
doesn't mean that no users have discovered that it is what PostgreSQL
does.

Presently I cannot seem to make up my mind so I'm going to go with my original opinion which was to only change the behavior in v14.  In part because it seems appropriate given our generally laissez-faire attitude toward this particular feature.

David J.

Re: BUG #16419: wrong parsing BC year in to_date() function

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Right. Ultimately, this comes down to a judgement call about what you
> think people are likely to rely on, and what you think they are
> unlikely to rely on.

Good, so at least we agree on that principle.

> But the present case does not seem to me to be comparable. If someone
> is using to_date() to construct date values, I can't see why they
> wouldn't test it, find out how it works with BC values, and then make
> the application that generates the input to that function do the right
> thing for the actual behavior of the function. There are discussions
> of the behavior of to_date with YYYY = 0 going back at least 10 years
> on this mailing list, and more recent discussions of the behavior of
> negative numbers.

Sure, we have at least two bug reports proving that people have
investigated this.  What I'm saying is unlikely is that there are any
production applications in which it matters.  I doubt that, say, the
Italian government has a citizenry database in which they've recorded
Julius Caesar's birthday; and even if they do, they're probably not
squirting the data through to_date; and even if they are, they're more
likely using the positive-year-with-BC representation, because that's
the only one that PG will emit.  Even if they've got code that somehow
relies on to_date working this way, it's almost certainly getting zero
use in practice.

I probably wouldn't have taken an interest in this at all, were it
not for the proposal that we document the misbehavior.  Doing that
rather than fixing it just seems silly.

            regards, tom lane