Re: BUG #16419: wrong parsing BC year in to_date() function - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: BUG #16419: wrong parsing BC year in to_date() function
Date
Msg-id CAKFQuwajC7+s-nHRujP+w5GsUTcp8usazdPas59Kz3kCTq+e3w@mail.gmail.com
Whole thread Raw
Responses Re: BUG #16419: wrong parsing BC year in to_date() function
List pgsql-hackers
Redirecting to -hackers for visibility.  I feel there needs to be something done here, even if just documentation (a bullet in the usage notes section - and a code comment update for the macro) pointing this out and not changing any behavior.

David J.

On Wed, May 6, 2020 at 8:12 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
‪On Wed, May 6, 2020 at 6:31 PM ‫دار الآثار للنشر والتوزيع-صنعاء Dar Alathar-Yemen‬‎ <dar_alathar@hotmail.com> wrote:‬

Any one suppose that these functions return the same:
make_date(-1,1,1)
to_date('-1-01-01','yyyy-mm-dd')

But make_date will give 0001-01-01 BC

And to_date will give 0002-01-01 BC



Interesting...and a fair point.

What seems to be happening here is that to_date is trying to be helpful by doing:

select to_date('0000','YYYY'); // 0001-01-01 BC

It does this seemingly by subtracting one from the year, making it positive, then (I infer) appending "BC" to the result.  Thus for the year "-1" it yields "0002-01-01 BC"

make_date just chooses to reject the year 0 and treat the negative as an alternative to specifying BC

There seems to be zero tests for to_date involving negative years, and the documentation doesn't talk of them.

I'll let the -hackers speak up as to how they want to go about handling to_date (research how it behaves in the other database it tries to emulate and either document or possibly change the behavior in v14) but do suggest that a simple explicit description of how to_date works in the presence of negative years be back-patched.  A bullet in the usage notes section probably suffices:

"If a YYYY format string captures a negative year, or 0000, it will treat it as a BC year after decreasing the value by one.  So 0000 maps to 1 BC and -1 maps to 2 BC and so on."

So, no, make_date and to_date do not agree on this point; and they do not have to.  There is no way to specify "BC" in make_date function so using negative there makes sense.  You can specify BC in the input string for to_date and indeed that is the only supported (documented) way to do so.

 
[and the next email]
 
Specifically:


/*
 * There is no 0 AD.  Years go from 1 BC to 1 AD, so we make it
 * positive and map year == -1 to year zero, and shift all negative
 * years up one.  For interval years, we just return the year.
 */
#define ADJUST_YEAR(year, is_interval) ((is_interval) ? (year) : ((year) <= 0 ? -((year) - 1) : (year)))

The code comment took me a bit to process - seems like the following would be better (if its right - I don't know why interval is a pure no-op while non-interval normalizes to a positive integer).

Years go from 1 BC to 1 AD, so we adjust the year zero, and all negative years, by shifting them away one year,  We then return the positive value of the result because the caller tracks the BC/AD aspect of the year separately and only deals with positive year values coming out of this macro.  Intervals denote the distance away from 0 a year is so we can simply take the supplied value and return it.  Interval processing code expects a negative result for intervals going into BC.

David J.


 

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: WIP: BRIN multi-range indexes
Next
From: Isaac Morland
Date:
Subject: Re: Our naming of wait events is a disaster.