Re: Rectifying wrong Date outputs - Mailing list pgsql-hackers

From Piyush Newe
Subject Re: Rectifying wrong Date outputs
Date
Msg-id AANLkTi=72rJjsfuRCuUvg+3Y9t6bObDfdw38NrpCEwPZ@mail.gmail.com
Whole thread Raw
In response to Re: Rectifying wrong Date outputs  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Rectifying wrong Date outputs
List pgsql-hackers


On Thu, Mar 17, 2011 at 7:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Mar 17, 2011 at 9:46 AM, Alvaro Herrera
> <alvherre@commandprompt.com> wrote:
>> Keep in mind that the datetime stuff was abandoned by the maintainer
>> some years ago with quite some rough edges.  Some of it has been fixed,
>> but a lot of bugs remain.  Looks like this is one of those places and it
>> seems appropriate to spend some time fixing it.  Since it would involve
>> a behavior change, it should only go to 9.2, of course.

> I wouldn't object to fixing the problem with # of digits > # of Ys in
> 9.1, if the fix is simple and clear-cut.  I think we are still
> accepting patches to make minor tweaks, like the tab-completion patch
> I committed yesterday.  It also doesn't bother me tremendously if we
> push it off, but I don't think that anyone's going to be too sad if
> TO_DATE('01-jan-2010',  'DD-MON-YYY') starts returning something more
> sensible than 3010-01-01.

Agreed, it's certainly not too late for bug fixes in 9.1.  I agree
that this isn't something we would want to tweak in released branches,
but 9.1 isn't there yet.


I feel the patch for the same would be easier and was attached in the initial mail of this mail thread. For your ready reference, I am attaching the same patch here again. 
 
Having said that, it's not entirely clear to me what sane behavior is
here.  Personally I would expect that an n-Ys format spec would consume
at most n digits from the input.  Otherwise how are you going to use
to_date to pick apart strings that don't have any separators?  So
I think the problem is actually upstream of the behavior complained of
here.  However, what we should first do is see what Oracle does in such
cases, because the main driving factor for these functions is Oracle
compatibility not what might seem sane in a vacuum.


Following is the extended chart which is comparing the behavior of Oracle, PG & EDBAS. 

Data Format Oracle PostgreSQL EDBAS

TO_DATE('01-jan-1',  'DD-MON-Y') 01-JAN-2011 01-JAN-2001 01-JAN-2001
TO_DATE('01-jan-1',  'DD-MON-YY') 01-JAN-2001 01-JAN-2001 01-JAN-2001
TO_DATE('01-jan-1',  'DD-MON-YYY') 01-JAN-2001 01-JAN-2001 01-JAN-2001
TO_DATE('01-jan-1',  'DD-MON-YYYY') 01-JAN-0001 01-JAN-0001 01-JAN-0001

In this case, all the cases are in sync except the 1st one. I didn't understand why Oracle is interpreting year '1' as '2011'. 

Data Format Oracle PostgreSQL EDBAS

TO_DATE('01-jan-10',  'DD-MON-Y') Error 01-JAN-2010 Error
TO_DATE('01-jan-10',  'DD-MON-YY') 01-JAN-2010 01-JAN-2010 01-JAN-2010
TO_DATE('01-jan-10',  'DD-MON-YYY') 01-JAN-2010 01-JAN-2010 01-JAN-2010
TO_DATE('01-jan-10',  'DD-MON-YYYY') 01-JAN-0010 01-JAN-0010 01-JAN-0010

In this case, it seems in last 3 cases PostgreSQL is behaving correctly. Oracle is throwing error in 1st case since the Format ('Y') is lesser than the actual value ('10'). But PostgreSQL is ignoring this case and throwing whatever is input. The output is might not be the same was user is expecting.

Data Format Oracle PostgreSQL EDBAS

TO_DATE('01-jan-067',  'DD-MON-Y') Error 01-JAN-2067 Error
TO_DATE('01-jan-111',  'DD-MON-YY') 01-JAN-0111 01-JAN-2011 Error
TO_DATE('01-jan-678',  'DD-MON-YYY') 01-JAN-2678 01-JAN-1678 01-JAN-2678
TO_DATE('01-jan-001',  'DD-MON-YYYY') 01-JAN-0001 01-JAN-0001 01-JAN-0001

In this case, just last case was correct in PG. Rest other cases are not in sync with Oracle, rather the output is vague.

Data Format Oracle PostgreSQL EDBAS

TO_DATE('01-jan-2010',  'DD-MON-Y') Error 01-JAN-4010 Error
TO_DATE('01-jan-2010',  'DD-MON-YY') 01-JAN-2010 01-JAN-3910 Error
TO_DATE('01-jan-2010',  'DD-MON-YYY') Error 01-JAN-3010 Error
TO_DATE('01-jan-2010',  'DD-MON-YYYY') 01-JAN-2010 01-JAN-2010 01-JAN-2010

In this case, PG is giving wrong output in first 3 cases. Those need to get rectified. Oracle is throwing error in 1st and 3rd case and the reason is, the format is lesser than the actual value. It seems this rule is not applicable for 2nd case in Oracle.  

In all above mentioned cases, the observation is, If the # Ys are lesser than the # of digits,, then it should throw an error. Only in case of 'YY', its not correct, unless the year is later than 9999. In this way, we can fix the wrong outputs in PG.



 
                       regards, tom lane



--
--
Piyush S Newe
Principal Engineer
EnterpriseDB
office: +91 20 3058 9500
www.enterprisedb.com

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: WIP patch: collation assignment algorithm rewrite
Next
From: Heikki Linnakangas
Date:
Subject: Re: Allowing multiple concurrent base backups